Digg Blog

Rethinking Notifications with Data Science

This post was written by Betaworks lead scientist Suman Deb Roy and was originally posted on Medium.

How Digg Bot finds stories for your favorite topics

A year and a half ago, the Notifications Summit was held at Betaworks to deliberate on many key ideas: the push and the pull, notifications as a primary interface, as a meta-app, utility of the lock screen, deep linking, filters, etc. There was growing consensus that notifications could become an operating system for the information age, a beacon in the attention economy.

The attention economy has transformed many industries, but none more severely than news media — where a clear oversupply of information has overwhelmed consumers. The larger an information landscape becomes, the more pressing is the demand for actionable and relevant content. This hyper-relevancy is the principal challenge notification systems face.

Somewhat counter-intuitively though, it is only by monitoring and analyzing this entire information landscape that great notifications can be created, because only then can relevance be calculated as a synergy between the world and the user — an elusive attribute of actionable notifications.

Digg Bot notification for the topic “bitcoin”. If you subscribed to this topic, notifications about it might appear on the lock screen (left) and in Facebook Messenger (right) when you open it.

Luckily, Digg has data on the entire information landscape. Each day, Digg aggregates almost 7.5 million unique URLs through its various products: Digg Reader, which tracks 8 million RSS feeds; Digg Deeper, which listens to 2–3 million Twitter users; and Digg Channels, which comprises focused topic pages. This means Digg observes a comprehensive chunk of the media produced on the Web every single day, giving it unique potential in notifications technology.

In this post, I’ll explain how we are thinking about notifications at Digg using our messaging services, including topic subscriptions in the news bot, algorithms and heuristics that generate notifications and some results/data we are seeing from this feature.

DiggBot’s Notification Feature

We soft-launched Digg alerts on our Facebook Messenger bot on August 2nd, 2016. Since then, Digg Bot has sent over 34,037 notifications for hundreds of unique topics or keywords to users. Subscribing to a topic in Digg Bot is relatively easy. Just search for any word/phrase and the last card in the carousel will let you subscribe to it.

Alternatively, you can add/edit/remove topics from your subscriptions at any time by typing “manage subscriptions”. When you add/follow a topic, you might receive push notifications comprising important stories in the topic.

While you can follow traditional beats like politics or technology, the real value of a notification system is in more granular topics, which could range from obsessions like climate change to entities like beyonce or tesla. As an example, I subscribe to artificial intelligence news and these are some notifications Digg Bot sent me.

Notifications for “Artificial Intelligence”. If an important story in your topic breaks after 9 pm or before 8 am, we might send it as a silent push.

You can also subscribe to even finer sub-topics within concepts like artificial intelligence, e.g., deep learning. Feel free to track specific entities related to sub-topics as well, such as the AI company DeepMind. Digg Bot’s algorithm adjusts itself based on the volume and velocity of stories, which depend on the topic’s generality, and sends relevant pushes featuring a representative link related to the topic.

The coolest thing about a notification system is the ability to set up granular alerts about sub-topics. Instead of subscribing to all NBA news from ESPN, you could just get notifications about the Golden State Warriors. Instead of being bombarded with financial news from one publisher, you could configure Digg to notify you about certain companies only.

Digg’s Notification Algorithm

To generate relevant notifications, we must first calculate how pertinent a story is to the user at that moment. This depends on three factors — (1) how important the story is globally, (2) importance of the story in the user’s own world, and (3) time and attention-impeding capacity of an alert. While the first factor can be handled by editors efficiently, in reality, people don’t always care about everything newsrooms want them to care about at that very moment — because urgency is a deeply personal thing. Thus, factors 2 and 3 are hard to balance without intelligent technology.

Time is an inescapable attribute of intelligent notifications. Unfortunately, many popular machine learning solutions begin to wobble when we introduce this exact criterion into the equation — time. Features that appear paramount in static analysis of systems can get eroded when the same system is observed dynamically.

A single ML framework can be hard to personalize in this regard, because the algorithm needs sophistication to model temporal variations of human attentiveness to news and information. Thus, there are three key algorithmic ensembles we employ to address this:

1. The Trending Ensemble: A group of algorithms that determine the trending nature of a story, characterized by how much attention it is receiving in the social and news media. It is optimized for multi-modal signal monitoring and early detection, and it considers accumulative opportunity cost plus seasonality.

The result is that every ingested article gets a DiggRank, indicating its trending nature in the world. You can check the current trending articles in Digg Bot.

2. The Clustering Ensemble: Multiple learning algorithms that determine if two separate news articles are part of the same story/event. This addresses a regular irritation with news alerts: duplicate pushes from different outlets about the same story. The clustering ensemble is optimized for detecting consolidated media coverage, diversity and syndicated associations. The result is that all links covering the same story are grouped together in a cluster.

When news about YouTube’s live-TV service broke, about 10–11 media outlets covered it. This GIF shows how all those related stories from different publishers were clustered and displayed in Digg’s technology channel.

The clustering ensemble also manages three important situations:

Story Development: As more media outlets write about a story and it develops, the semantics of article titles and descriptions change (if there is new information), causing the cluster to split. The algorithm determines if the fresh articles in the news cycle are different enough to represent a story update and big enough to be pushed eventually.
Unverified Trends: This addresses a significant hassle in the age of breaking social news: the popular yet unverified story. Recall that last year, a single fake news story triggered safety alerts on Facebook. Some of the best information systems might be vulnerable to media hacking. Thus, consolidated media coverage (via clustering) is a heuristic for verifying stories and filtering out hoaxes.
Editorial Expertise: The algorithm has to select one article from the cluster of similar links to be featured in the push notification. If there is a link in the cluster that Digg editors have featured on the front page, it could be prioritized as the representative article of the notification.
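
To make the clustering idea concrete, here is a minimal Python sketch. It is not Digg's actual ensemble. It assumes headlines alone are compared using word overlap (Jaccard similarity) with a hypothetical threshold of 0.5. The field names (title, outlet, featured) and the three-outlet corroboration rule are illustrative assumptions as well.

```python
import re

def title_tokens(title):
    """Lowercased word set for a headline (a crude stand-in for real semantics)."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def jaccard(a, b):
    """Share of words two headlines have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_articles(articles, threshold=0.5):
    """Greedy pass: join the first cluster whose seed headline is similar enough."""
    clusters = []
    for article in articles:
        tokens = title_tokens(article["title"])
        for cluster in clusters:
            if jaccard(tokens, title_tokens(cluster[0]["title"])) >= threshold:
                cluster.append(article)
                break
        else:
            clusters.append([article])  # no match: this article starts a new story
    return clusters

def pick_representative(cluster):
    """Prefer an editor-featured link; otherwise fall back to the first one seen."""
    for article in cluster:
        if article.get("featured"):
            return article
    return cluster[0]

def is_corroborated(cluster, min_outlets=3):
    """Assumed hoax heuristic: require coverage from several distinct outlets."""
    return len({a["outlet"] for a in cluster}) >= min_outlets

# Made-up sample data, for illustration only.
articles = [
    {"title": "YouTube launches live TV service", "outlet": "A", "featured": False},
    {"title": "YouTube launches its live TV service", "outlet": "B", "featured": True},
    {"title": "YouTube TV: live TV service launches", "outlet": "C", "featured": False},
    {"title": "Real Madrid sign new striker", "outlet": "D", "featured": False},
]
clusters = cluster_articles(articles)
```

With this sample, the three YouTube headlines land in one cluster whose representative is the editor-featured link, while the single-outlet Real Madrid story forms its own cluster and would not yet count as corroborated.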

3. The Info-Sphere Ensemble: Just because a story is alert-worthy does not mean it needs to be pushed now. Untimely pushes create ambiguity and a wrong sense of urgency. The final ensemble is a policy network whose job is to determine if we actually push the story to the user right now or defer it to a later time, given a story’s importance.

The info-sphere ensemble attempts to simulate the information sphere of the user. A user can be subscribed to multiple topics of different granularity. Since the volume and velocity of incoming news for every topic is different, notifications must be modulated. Has the user recently received an alert about this topic? How many total notifications has she received in the last x hours? How surprising is it for stories in this topic to gain this much traction? On average, an individual subscribes to 4–5 topics. These questions are critical in assuring relevant yet non-invasive notifications.
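
The questions above can be sketched as a small rule-based policy in Python. This is only an illustration under stated assumptions, not the policy network itself: the cooldown, the window, the cap of four pushes, the z-score used for "surprise", and every name in the code are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev

@dataclass
class UserState:
    """Hypothetical record of past pushes as (timestamp_in_seconds, topic) pairs."""
    sent: list = field(default_factory=list)

def surprise(history, current):
    """Z-score of the current story traction against the topic's own history."""
    spread = pstdev(history)
    return (current - mean(history)) / spread if spread else 0.0

def decide(user, topic, now, surprise_score,
           topic_cooldown=3 * 3600, window=6 * 3600,
           max_in_window=4, breaking_surprise=3.0):
    """Return 'push', 'defer' or 'digest' for one alert-worthy story."""
    recent = [(t, tp) for t, tp in user.sent if now - t < window]
    # How many total notifications has she received in the last x hours?
    if len(recent) >= max_in_window:
        return "defer"
    # Has the user recently received an alert about this topic?
    if any(tp == topic and now - t < topic_cooldown for t, tp in recent):
        return "defer"
    # How surprising is this much traction for the topic?
    if surprise_score >= breaking_surprise:
        return "push"
    return "digest"

# Made-up walk-through, times in seconds.
user = UserState()
spike = surprise([10, 12, 8, 10], 30)               # far above the usual volume
first = decide(user, "bitcoin", now=0, surprise_score=spike)
user.sent.append((0, "bitcoin"))
repeat = decide(user, "bitcoin", now=3600, surprise_score=spike)
quiet = decide(user, "tesla", now=3600, surprise_score=1.0)
```

In this walk-through the first bitcoin story is pushed, a second bitcoin story an hour later is deferred, and an unremarkable tesla story is held for a digest.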

Using these ensembles, Digg Bot has been flagging ~200 stories each day as alert-worthy, although we are noticing the aggregate number rise as more people keep subscribing to newer topics.

The overall number of notification alerts that Digg Bot flags each day. The algorithm went through a tuning spell from Aug 04–10, 2016 right after launch, which is why there was a huge spike and then trough. Tuning involves calculating the right thresholds and parameters once a system goes live, based on volume and velocity of incoming topical stories.

These 3 ensembles collectively give rise to some interesting flavors of notifications, depending on the topic categories you subscribe to.

Flavors of Digg Notifications:

(1) Mix of Breaking, Note-worthy, and Catch-up stories

We cannot emphasize enough the time-horizon of predictions or pushes that make alerts useful. Our priority isn’t necessarily to make notifications breaking, unless absolutely necessary. Instant is not always the best. Thus, the algorithm also calculates whether some topic stories are important but not big enough, so you can catch up with them in your “time-out” hours. We call this the Digest.

The Digest comprises top-ranked stories from a subset of your topic subscriptions. The topics chosen for push depend on the popularity of the stories within the topic and the frequency of alerts in that topic. For example, if you subscribed to Westworld (the TV show), these are some notifications (separate and digest) you would have received.

One of the many algorithmic tunings is to determine when something is breaking vs. socially popular vs. can be sent out in a digest. We understand that normal capability for media consumption (even for topics we are passionate about) varies but is possibly limited to pockets of time.

(2) The Obsession Stream

One of my favorite things to track is sports teams. But unlike traditional services that notify us about scores or high-level topic news like NFL, I want to receive all relevant news at a much more granular level, like SEC football or golden state warriors. This liberates me from following multiple services or receiving irrelevant noise about the entire beat.

For example, I follow Real Madrid; these are some notifications I received.

As you can see, alerts about sports teams or players can cover different attributes: new contracts, transfers, injuries, player awards and even amusingly popular memes.

(3) Instantaneous & Incidental

While I am OK with receiving certain topic stories later in the Digest, other news pieces must be known in the moment. Certain topics, especially those related to sports teams, players, celebrities or companies, have a live element to them. Reminding or informing users about critical events during a game, or perhaps an earnings call, stands out as a much-beloved feature.

Here are some notifications for Real Madrid with a live component in mind:

Sometimes, we forget when a game is about to start. I also like to be passively reminded of the score for a game I missed, instead of explicitly going online to check for it.

(4) Non-Invasive yet Noticeable

Occasionally, your tracked topic stories won’t be big enough for mainstream newsrooms to cover, but could be huge within your own world. An algorithm must decide which of your topics have big enough stories to tell — and when.

We realize you don’t always have free time to consume media, but the best technologies require the smallest amount of attention. For example, assume I follow the topics bots, artificial intelligence, real madrid, data science, westworld, etc. How can stories on all of them be compiled for later consumption?

Digest is a notification that makes use of the carousel format and is sent during break hours: before or after work hours, for a commute, or during a potential lunch break. The goal is to sync with the diurnal fluctuations of our news consumption capacity.

What’s Next:

Digg Notifications is a synergy of three ensemble algorithms — the first ensemble proactively monitors millions of media signals, the second determines which signals are semantically similar, and the final ensemble personalizes the push based on socio-temporal patterns.

More concepts: We have been noticing a steady rise in the number of unique users subscribed to at least one topic. This also means the number of unique keywords Digg Bot sends notifications for is increasing.

Rising number of Digg Bot topic subscribers month-by-month (left). Growth in unique keywords that people are subscribed to (right). The highest subscribed keyword in every month is annotated in the chart.

Currently, 66% of subscription keywords are unigrams, 26% are bigrams and ~6% are trigrams. We noticed that multi-grams are sometimes names of sports teams, or blended concepts like apple vs google.
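
As a rough illustration of how such a breakdown can be computed, the Python sketch below counts whitespace-separated words per keyword. The sample keywords are made up and are not Digg's subscription data; the function name is hypothetical.

```python
from collections import Counter

def ngram_shares(keywords):
    """Map n (words per keyword) to the share of keywords with that length."""
    counts = Counter(len(k.split()) for k in keywords if k.strip())
    total = sum(counts.values())
    return {n: counts[n] / total for n in sorted(counts)}

# Illustrative sample only.
sample = ["bitcoin", "tesla", "deep learning", "golden state warriors"]
shares = ngram_shares(sample)
```

Here two of the four sample keywords are unigrams, one is a bigram and one is a trigram.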

Tracking sectors: By using subscription topics intelligently, you can also track sectors of industry — such as tech companies, clean energy, celebrity news, sports leagues, political issues, manufacturing in Asia etc.

Notifications generated for 4 tech companies each day. Interestingly, we found users also subscribe to content types/sectors via publisher names, e.g., subscribing to “tmz” to capture all breaking celebrity news.

API: Behind every bot functionality is an API. Digg’s notification technology is also available as an alerts API. You can subscribe to any company, person, or meta/hybrid topics and get alerts when something noticeable happens. The rate of alerts, ranging from always-breaking to always-digest, is easily customizable in the API based on your requirements. Additionally, you can turn off/customize notifications for individual topics at will in the Digg API.

In this age of limitless data, the goal of notification systems should not be to addict. Instead, they should help us live our lives better with the information we want. Notifications are a fundamental way to process infinite information, and will serve as the lowest layer of conversational intelligence.

You can subscribe to topics on Digg Bot here. For questions/comments about the Notifications data or Digg API services, please reach out to api@digg.com

Making the Switch from Node.js to Golang

This post was written by Digg Software Engineer Alexandra Grant and was originally posted on Medium.

I’ve dabbled in JavaScript since college, made a few web pages here and there and while JS was always an enjoyable break from C or Java, I regarded it as a fairly limited language, imbued with the special purpose of serving up animations and pretty little things to make users go “ooh” and “aah”. It was the first language I taught anyone who wanted to learn how to code because it was simple enough to pick up and would quickly deliver tangible results to the developer. Smash it together with some HTML and CSS and you have a web page. Beginner programmers love that stuff.

Then something happened two years ago. At that time, I was in a researchy position working mostly on server-side code and app prototypes for Android. It wasn’t long before Node.js popped up on my radar. Backend JavaScript? Who would take that seriously? At best, it seemed like a new attempt to make server-side development easier at the cost of performance, scalability, etc. Maybe it’s just my ingrained developer skepticism, but there’s always been that alarm that goes off in my brain when I read about something being fast and easy and production-level.

Then came the research, the testimonials, the tutorials, the side-projects and 6 months later I realized I had been doing nothing but Node since I first read about it. It was just too easy, especially since I was in the business of prototyping new ideas every couple months. But Node wasn’t just for prototypes and pet projects. Even big boy companies like Netflix had parts of their stack running Node. Suddenly, the world was full of nails and I had found my hammer.

Fast forward another couple months and I’m at my current job as a backend developer for Digg. When I joined, back in April of 2015, the stack at Digg was primarily Python with the exception of two services written in, wait for it, Node. I was even more thrilled to be assigned the task of reworking one of the services which had been causing issues in our pipeline.

Our troublesome Node service had a fairly straightforward purpose. Digg uses Amazon S3 for storage which is peachy, except S3 has no support for batch GET operations. Rather than putting all the onus on our Python web server to request up to 100+ keys at a time from S3, the decision was made to take advantage of Node’s easy async code patterns and great concurrency handling. And so Octo, the S3 content fetching service, was born.

Node Octo performed well except for when it didn’t. Once a day it needed to handle a traffic spike where the requests per minute jumped from 50 to 200+. Also keep in mind that for each request, Octo typically fetches somewhere between 10–100 keys from S3. That’s potentially 20,000 S3 GETs a minute. The logs showed that our service slowed down substantially during these spikes, but the trouble was it didn’t always recover. As such, we were stuck bouncing our EC2 instances every couple weeks after Octo would seize up and fall flat on its face.

The requests to the service also pass along a strict timeout value. After the clock hits X number of milliseconds since receiving the request, Octo is supposed to return to the client whatever it has successfully fetched from S3 and move on. However, even with a max timeout of 1200ms, in Octo’s worst moments we had request handling times spiking up to 10 seconds.
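
The contract itself (answer at the deadline with whatever has arrived) can be sketched independently of Node. The Python below is a minimal illustration, not Octo's code: fake_s3_get, its per-key delays and the 0.5 second deadline are assumptions that simulate S3 with sleeps.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Simulated per-key latency in seconds; "slow" stands in for a sluggish S3 response.
DELAYS = {"a": 0.01, "b": 0.02, "slow": 2.0}

def fake_s3_get(key):
    """Hypothetical stand-in for an S3 GET."""
    time.sleep(DELAYS[key])
    return f"value-for-{key}"

def fetch_with_deadline(pool, fetch, keys, timeout_s):
    """Fetch all keys concurrently; at the deadline return only what succeeded."""
    futures = {pool.submit(fetch, key): key for key in keys}
    done, _still_running = wait(futures, timeout=timeout_s)
    return {futures[f]: f.result() for f in done if f.exception() is None}

pool = ThreadPoolExecutor(max_workers=4)
started = time.monotonic()
partial = fetch_with_deadline(pool, fake_s3_get, DELAYS, timeout_s=0.5)
elapsed = time.monotonic() - started
pool.shutdown(wait=False)  # do not hold the response for the straggler
```

The caller gets the two fast keys back at roughly the half-second mark instead of waiting two seconds for the slow one.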

The code was heavily asynchronous and we were caching S3 key values aggressively. Octo was also running across 2 medium EC2 instances which we bumped up to 4.

I reworked the code three times, digging deeper than ever into Node optimizations, gotchas, and tricks for squeezing every last bit of performance out of it. I reviewed benchmarks for popular Node webserver frameworks, like Express or Hapi, vs. Node’s built-in HTTP module. I removed any third party modules that, while nice to have, slowed down code execution. The result was three, one-off iterations all suffering from the same issue. No matter how hard I tried, I couldn’t get Octo to time out properly and I couldn’t reduce the slowdown during request spikes.

A theory eventually emerged and it had to do with the way Node’s event loop works. If you don’t know about the event loop, here’s some insight from Node Source:

Node’s “event loop” is central to being able to handle high throughput scenarios. It is a magical place filled with unicorns and rainbows, and is the reason Node can essentially be “single threaded” while still allowing an arbitrary number of operations to be handled in the background.

Not-So Magic Event Loop Blocking (X-Axis: Time in milliseconds)

You can see when all the unicorns and rainbows went to hell and back again as we bounced the service.

With event loop blocking as the biggest culprit on my list, it was just a matter of figuring why it was getting so backed up in the first place.

Most developers have heard about Node’s non-blocking I/O model; it’s great because it means all requests are handled asynchronously without blocking execution, or incurring any overhead (like with threads and processes) and as the developer you can be blissfully unaware what’s happening in the backend. However, it’s always important to keep in mind that Node is single-threaded which means none of your code runs in parallel. I/O may not block the server but your code certainly does. If I call sleep for 5 seconds, my server will be unresponsive during that time.
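
The same effect is easy to reproduce in any single-threaded event loop. The sketch below uses Python's asyncio as an analogy rather than Node itself: a timer is scheduled for 50 ms, then synchronous code hogs the loop, and we measure when the timer callback actually runs. The function name and durations are assumptions for illustration.

```python
import asyncio
import time

async def measure_timer_lag(block_s, timer_s=0.05):
    """Schedule a timer, block the loop synchronously, report when the timer fired."""
    loop = asyncio.get_running_loop()
    fired = loop.create_future()
    start = time.monotonic()
    # The callback can only run once the loop regains control.
    loop.call_later(timer_s, lambda: fired.set_result(time.monotonic() - start))
    time.sleep(block_s)  # synchronous work: nothing else on the loop runs meanwhile
    return await fired

lag_when_idle = asyncio.run(measure_timer_lag(block_s=0.0))
lag_when_blocked = asyncio.run(measure_timer_lag(block_s=0.3))
```

With no blocking work the callback runs close to its 50 ms schedule; with 300 ms of synchronous work in front of it, it cannot run until that work has finished, however short the timer was.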

Visualizing the Event Loop: StrongLoop

And the non-blocking code? As requests are processed and events are triggered, messages are queued along with their respective callback functions. To explain further, here’s an excerpt from a particularly insightful blog post from Carbon Five:

In a loop, the queue is polled for the next message (each poll referred to as a “tick”) and when a message is encountered, the callback for that message is executed. The calling of this callback function serves as the initial frame in the call stack, and due to JavaScript being single-threaded, further message polling and processing is halted pending the return of all calls on the stack. Subsequent (synchronous) function calls add new call frames to the stack…

Our Node service may have handled incoming requests like a champ if all it needed to do was return immediately available data. But instead it was waiting on a ton of nested callbacks all dependent on responses from S3 (which can be god awful slow at times). Consequently, when any request timeouts happened, the event and its associated callback were put on an already overloaded message queue. While the timeout event might occur at 1 second, the callback wasn’t getting processed until all other messages currently on the queue, and their corresponding callback code, were finished executing (potentially seconds later). I can only imagine the state of our stack during the request spikes. In fact, I didn’t need to imagine it. A little bit of CPU profiling gave us a pretty vivid picture. Sorry for all the scrolling.

The flames of failure

As a quick intro to flame graphs, the y axis represents the number of frames on the stack, where each function is the parent of the function above it. The x axis has to do with the sample population more so than the passage of time. It’s the width of the boxes that shows the total time on-CPU; greater width may indicate slower functions or it may simply mean that the function is called more often. You can see in Octo’s flame graph the huge spikes in our stack depth. More detailed info on profiling and flame graphs can be found here.

In light of these realizations, it was time to entertain the idea that maybe Node.js wasn’t the perfect candidate for the job. My CTO and I sat down and had a chat about our options. We certainly didn’t want to continue bouncing Octo every other week and we were both very interested in a promising case study that had cropped up on the internet.

If the title wasn’t tantalizing enough, the topic was on creating a service for making PUT requests to S3 (wow, other people have these problems too?). It wasn’t the first time we had talked about using Golang somewhere in our stack and now we had a perfect test subject.

Two weeks later, after my initial crash course introduction to Golang, we had a brand new Octo service up and running. I modeled it closely after the inspiring solution outlined in Malwarebytes’ Golang article; the service has a worker pool and a delegator which passes off incoming jobs to idle workers. Each worker runs on its own goroutine, and returns to the pool once the job is done. Simple and effective. The immediate results were pretty spectacular.
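
For readers who have not seen the pattern, here is the shape of a worker pool with a delegator. It is sketched in Python with threads and queues purely for illustration; the real Octo is written in Go with goroutines, and every name below is hypothetical.

```python
import queue
import threading

class Worker(threading.Thread):
    """One worker: announce idleness, take a job, run it, return to the pool."""

    def __init__(self, idle_pool, results):
        super().__init__(daemon=True)
        self.jobs = queue.Queue(maxsize=1)  # this worker's private inbox
        self.idle_pool = idle_pool
        self.results = results

    def run(self):
        while True:
            self.idle_pool.put(self)        # back in the pool, ready for work
            job = self.jobs.get()           # block until the delegator hands one over
            if job is None:                 # sentinel: shut down
                return
            key, fetch = job
            try:
                self.results.put((key, fetch(key)))
            except Exception:
                self.results.put((key, None))  # simplification: failures become None

def delegate(keys, fetch, num_workers=4):
    """Delegator: hand each key to whichever worker is idle, then collect results."""
    keys = list(keys)
    idle_pool, results = queue.Queue(), queue.Queue()
    workers = [Worker(idle_pool, results) for _ in range(num_workers)]
    for worker in workers:
        worker.start()
    for key in keys:
        idle_pool.get().jobs.put((key, fetch))  # waits until some worker is idle
    fetched = dict(results.get() for _ in keys)
    for _ in workers:
        idle_pool.get().jobs.put(None)          # each worker re-registers once, then stops
    return fetched

fetched = delegate(["a", "b", "c", "d", "e"], lambda key: key.upper(), num_workers=2)
```

The design keeps concurrency bounded: no matter how many keys arrive, only num_workers fetches are in flight at once, and the delegator simply waits when every worker is busy.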

A nice simmer

Our average response time from the service was almost cut in half, our timeouts (in the scenario that S3 was slow to respond) were happening on time, and our traffic spikes had minimal effects on the service.

Blue = Node.js Octo | Green = Golang Octo

With our Golang upgrade, we are easily able to handle 200 requests per minute and 1.5 million S3 item fetches per day. And those 4 load-balanced instances we were running Octo on initially? We’re now doing it with 2.

Since our transition to Golang we haven’t looked back. While the majority of our stack is (and probably will always be) in Python, we’ve begun the process of modularizing our code base and spinning up microservices to handle specific roles in our system. Alongside Octo, we now have 3 other Golang services in production which power our realtime message system and serve up important metadata for our content. We’re also very proud of the newest addition to our Golang codebase, DiggBot.

This is not to say that Golang is a silver bullet for all our problems. We’re careful to consider the needs of each of our services. As a company, we make the effort to stay on top of new and emerging technologies and to always ask ourselves, can we be doing this better? It’s a constantly evolving process and one that takes careful research and planning.

I’m proud to say that this story has a happy ending as our Octo service has been up and running for a couple months with great success (a few bug fixes aside). For now, Digg is going the way of the Gopher.

https://github.com/gengo/goship

The Cyborg Approach to Content Curation

For the past few years, the Digg team has brought you the stories the Internet is talking about. Starting today, you’re going to see us offering deeper coverage of select topics. We’re starting with the three that our users care about most: Technology, Entertainment, and Election 2016.

The Digg team makes a lot of the fact that we rely on people (or “human editors” as Apple might call them) to do the work of gathering excellent content. To do this, our engineers and data scientists built a suite of tools that allow us to filter through millions of articles, videos, and social data to curate the perfect front page. Smart editors armed with smart technology: it’s the cyborg approach to content curation.

When we engineered our new Technology, Entertainment, and Election topics, we developed a new system for gathering stories. We wanted a way to curate more content, faster. What we’re rolling out today is still “cyborg,” but it’s sort of reversed. Instead of our software enabling our editors, our editors are enabling our software by training algorithms to look at a massive flow of stories and signals to deliver a well-rounded snapshot of a topic. We’ve found that carefully introducing some automation enables our small, savvy edit team to cover more ground. We think this thoughtful combination of people and technology will deliver that Digg-quality mix of stories for a whole range of specific interests.

Check it out: Technology, Entertainment, and Election 2016. Look for the little ϟ icon. These stories are curated with this new approach.

Introducing Digg Dialog!

Over the last few months, we have been not-so-secretly preparing to introduce community and conversation back to Digg, and we are finally ready to pull back the proverbial curtain …

Introducing Dialog!

What Is Dialog?

Dialog is a thoughtful, live conversation between the Digg community and the people who make the best stuff on the Internet.

Digg’s Superstar Editors discover and feature a lot of incredible articles, written by awesome journalists, about amazing people. And we’ve always thought: “Wouldn’t it be great to talk about these articles with the journalists who wrote them, or the people they’re about?” And it turns out, Digg users also really, really want to talk about them, while journalists are clamoring for a well-lit place to share more about their work.

Here’s how Digg Dialog works: When we discover and feature an exceptional article (or video), we will invite the journalist or an expert to come talk with Digg’s community. If they accept, we will schedule a Dialog and post the time and guest on our homepage. A few hours before our guest arrives, the Dialog page will go online, and you will be able to start posting questions.

Once our guest joins, the conversation will officially go live, and we expect it to be fascinating, entertaining, spirited, and civil. After the guest leaves, the Dialog page will stay open so you can continue your discussions and debates with other community members. When the conversation dies down (or when the moderators feel it’s time), the Dialog will be closed, but the page will stay accessible (forever!) so you can relive and share your discussions.

Not Your Father’s Q&A

We have focused on features that will allow the most engaging and interesting conversations. Here are a few things that we think you’ll notice and love:

• It’s LIVE! - We built Dialog to feel like a live chat instead of static commenting, and you’ll notice it immediately. As the conversation is going on, new questions and responses will pop into your thread in real-time, no page refresh required. We think it’s going to create a truly free-flowing and dynamic experience. Hope you like it!

• Multiple Views - Dialog comes in three different views: “Live” (real-time and chronological), “Answered” (questions answered by our guest), and “Most Dugg” (questions and responses that the community diggs the most). After testing a lot of different views, we found these three to be the best for following a live conversation, and the easiest to read afterwards. All three options will remain available even after the Dialog closes.

• Creators and Experts - For our guest lineup, we’ll be focusing on people that have created great content and have something outstanding to share with you, so you’ll be meeting some amazing journalists and authors. We will also sprinkle in a few persons-of-interest, especially those who have been profiled in an article or video, or are experts in a topic that has been featured on Digg. Most Q&A products out there cater to people who are promoting a new movie, TV show, album, pet, or underwear line. Now, there’s nothing wrong with that, but we wanted to offer something different.

• Best Community - We believe in the quality and depth of Digg’s community, especially after so many of you told us how much you care about great conversations. To be honest, we know that it’s hard to build and manage a good commenting platform, and we’ve been watching as some of our favorite publishers have shut down theirs over the last few months. But with your help and feedback, we were able to create Digg’s Community Guidelines that match our shared expectations of quality and intent. And to start, we’re going to be pre-moderating Dialog to ensure that the conversations are interesting, thought-provoking, and civil.

• Cross-platform - Who are we to restrict where you have your conversations? We’ve built Dialog to work on web, mobile web, and in our new iOS App, which will be live in the App Store shortly. And for our Android App users (of which I am one), we’re working on it, and hope to have some good news soon.

When Can I Join The Conversation?

Our first Digg Dialog will be Friday, October 9 at 12pm EST, and our very first guest will be the inimitable Paul Ford. Make sure to read his latest article about Wikipedia from the New Republic, and join us tomorrow on the frontpage to talk about it.

We have really loved building Dialog, and have worked with our world-class launch partners (big and small, see below) to bring you something we’re really proud of. Come have a conversation with us!

- Gary Liu, Digg COO

Meet Veronica De Souza, A Human Person Who Works At Digg

Digg may seem like a cold, faceless company, but underneath the ever-shifting grid of content lie the humans who make it work. This week, we chat with Veronica de Souza, who refuses to commit to a title.

Meet Steve Rousseau, Digg’s Features Editor

Digg may seem like a cold, faceless company, but underneath the ever-shifting grid of content lie the humans who make it work. This week, we chat with features editor Steve Rousseau. Check out the interview here.

A Digg Approach To Online Conversations

So we’re getting close to launching Digg’s Next Big Thing Or Two — a series of products and features we hope you’ll find to be smart and interesting takes on Internet conversation, oriented around the stories and videos that make it onto Digg. (ICYMI, Justin Van Slembrouck, Digg’s Design Director, recently previewed what’s coming next.)

As Digg gets ready to welcome our users’ thoughts and observations and arguments and points and queries and comments and corrections and provocations and so forth, we need policies that strike the right balance between free self-expression and basic civility. Lots of companies have trodden this ground before, so what follows shouldn’t seem new or revolutionary. Rather, Digg’s Community Guidelines are intended to embody a fundamentally moderate approach: to encourage very open debates, very free discussions, and very searching dialogue, while setting some straightforward limits that establish Digg as a place that values civility and mutual respect. Our aim here is to adopt a handful of clear rules that we will enforce in a predictable and reliable way. These guidelines will undoubtedly change as Digg’s conversational products expand, grow and evolve, and we’ll want, and humbly request, your advice and counsel along the way.

Without further ado, here’s version 1.0 of Digg’s Community Guidelines.

– Andrew, Digg CEO

Digg Community Guidelines

Digg is a place for lively conversation, discussion, inquiry, and debate. On Digg, you can – and we hope you will – offer an opinion, express a point of view, challenge a claim, ask a tough question, and provoke a response. We’re building features like Digg Dialog so that you can, for example, ask an investigative journalist to explain the deep background behind her reporting, debate the conclusions she’s drawn, and share your own knowledge about the subject.

Broadly, we believe in self-expression and the free and open exchange of opinions, thoughts, and ideas. We also believe in civility; a rudimentary level of respect is essential to sustain dialogue over time. To that end, Digg enforces a set of community guidelines.

Our aim is to have a small set of clear rules that we consistently and impartially enforce. We’d love your feedback and advice on how to write and implement them better.

What’s Not Allowed On Digg

Here are the categories of stuff we don’t allow users to post on Digg – in whatever form, whether words, images, avatars, or links to webpages that contain it:

Hateful abuse: Slurs, epithets, and hateful speech that denigrates people based on race, ethnicity, religion, disability, gender, gender expression, age, veteran status, or sexual orientation.

OK: “I respectfully disagree!”
Not OK: “I respectfully disagree, you dirty *#$%!”

Threats and calls to violence: Threats and encouragement to commit acts of violence against others.

OK: “This article on puppy abuse makes me want to throttle someone.”
Not OK: “This article on puppy abuse makes me want to kill someone, so let’s all d0xx, stalk, and assassinate the author.”

Abusive names: Usernames, screen names, bios, or avatars that are abusive, fraudulent, racist, demeaning, hateful, needlessly inflammatory, overtly sexual, or impersonating in a way that’s not obviously a parody. Basically, Digg’s username space should be rated G or PG.

OK: fakeDonaldTrump
Not OK: realDonaldTrump (unless you really are The Donald himself)
OK: AndrewMcLaughlin
Not OK: AndrewMcLaughlinSuxDonkeyButt (even if you believe in good faith that he does)

Gratuitous sexual explicitness: Words or images that are sexually explicit in a way that’s off-topic or out-of-context.

OK: [In a discussion on sex education] “It’s time we started teaching teens about blowjobs.”
Not OK: [In a discussion on North Korea] “It’s time we started teaching teens about blowjobs.”

Trademark and copyright infringement: Speech or links that infringe someone else’s legally protected trademark or copyright, recognizing that fair use is a fundamental element of free speech.

OK: Quoting a few sentences from an article to critique it.
Not OK: Copying and pasting 100% of someone else’s piece of writing without permission.

Spam: Off-topic, fraudulent, or deceptive commercial pitches.

OK: [In a discussion of shower beer] “Heineken?!? PABST. BLUE. RIBBON!”
Not OK: [In a discussion of shower beer] “Best price$ on \/1@GrA!”
OK: “I recommend this article on the unpredictability of bovine supinity.”
Not OK: “I recommend this article on the unpredictability of bovine supinity.”

Harassment: Repeatedly attacking, bullying, or trolling someone in a targeted, ad hominem, or otherwise over-the-top way.

OK: “You are mistaken.”
Not OK: Following someone around and attacking his/her every comment, or attacking him/her every time s/he makes a comment. “You are an idiotic and worthless waste of biological mass.” “You are an idiotic and worthless waste of biological mass.” “You are an idiotic and worthless waste of biological mass.” “You are an idiotic and worthless waste of biological mass.” Etc.

Privacy violations: The posting of someone’s personal or private information.

OK: Referencing someone by their screen name or Twitter handle.
Not OK: Referencing someone by their credit card number.

Illegal speech: Illegal content of whatever sort, like fraud or phishing.

False flagging: The abusive mis-flagging of comments or users by falsely claiming violations of these guidelines.

What Happens When We Find Something Not Allowed

We will take down any comments or posts, and disable any usernames or screen names, that violate these guidelines. When we take action like that, we will attempt to notify the person who posted it, via email or our web interface, to give her/him a chance to fix the issue or to argue back in case we’re getting it wrong. Where someone repeatedly violates the guidelines, we may restrict, suspend, or terminate the account.

We will do our best to be abundantly clear and consistent in enforcing these rules, to be open and honest about it, and to honor the guidelines in their implementation. Of course, as our lawyerly besties have advised us to state explicitly, we reserve the right to enforce, or not enforce, our community guidelines in our sole discretion. These rules create no duty or contractual obligation for us to take any particular action.

What Digg Is Building Next

For the past few years, we’ve been describing Digg as “What the Internet is talking about right now.” Today, all that “talking” happens in countless places all over the web, but it’s consolidating on an ever-smaller number of networks. Facebook, for example, can be a good place for conversation, but it’s also a good place for vacation pics and birthday wishes. Twitter is the best for breaking news and realtime chatter. Reddit can be a great platform for underserved voices; other times… well. We think there’s space for conversation on Digg – conversation that’s focused on stories (or “links” if you’re old-school). A place for discovering great content and a great place for conversation. That’s the next phase of Digg.

We’re approaching conversations on Digg the same way we did the original Digg relaunch three years ago (our CTO, Mike, recounted the whole epic tale in his post). Our goal during that six-week sprint was to launch a new Digg that was full of great links. We set a high bar for what could make it on the front page. For conversations, we’ve got the same high bar. It’s gotta be good.

We’re planning our first iteration of conversations on Digg this fall. Here’s an outline of the features we have so far:

- Conversations will be based around a story
  - Not just about anything or anyone
- The author(s) of the story are encouraged to participate
  - (Or “content creators” if you’re new-school)
- It’ll be open and high-quality
  - Our aim: conversations that are just as interesting as the stories themselves
- Clear community guidelines will be in place
  - We’ll define these in our next blog post
- We’ll be moderating
  - It’s definitely not our long-term goal to moderate every comment, but we’re going to err on the side of civility to start
- You can digg comments
  - We want help to surface the best of the community

We’ll be revealing more in the coming weeks about what this will look like, what we’re going to call it, and when exactly it launches.

-Justin, Design Director at Digg

If you’d like to get more updates like this on what we’re building, sign up for email updates.

(Gif via Diagonal View/YouTube)

The Life Of A Link On Digg

For those of you just tuning in to our ongoing blog series, check out our two previous posts about how the new incarnation of Digg came to be, and user feedback on what Digg should become.

To some Internet denizens, the method by which the front page of Digg comes to be is still a complete mystery. Gone are the days of power users — the digg action has been rendered impotent at the feet of all-powerful “human editors.” Or so the story goes. But of course, that’s not the whole story.

I’m here to lift the curtain a bit on exactly how a story lands on the front page of Digg. Who am I, you ask? Allow me to introduce myself: I’m Anna Dubenko, the Editorial Director of Digg and the person leading a team of six full-time editors who scour the best of the Internet just for your benefit. Here are their beautiful faces, illustrated to protect the identity of the innocent.

Below, for your edification/enjoyment, a step-by-step breakdown of the life of a link on Digg:

How The Heck Do Editors Source Their Hot Content?

There are a number of ways a story can get the attention of a Digg editor.

1) RSS
After three years of scouring the web for the best stuff, we’ve acquired a pretty robust RSS list. As of this writing there are 3,601 unread items sitting in my Digg Reader, just waiting to be read, evaluated and selected for the front page treatment. I have trained myself to not get anxious about this number.

2) JustTheTip@digg.com
Did you just publish your magnum opus? Is your YouTube video the next Sizzler sizzle reel? This very special email address is your direct line to a real, live Digg editor. We’re giving it to you because we trust you. Don’t abuse it.

3) Data.digg.com™ or, as a predecessor liked to call it, “Proprietary Social Scoring Algorithms”
This is one of the tools our crack squad of data scientists and engineers hacked together that’s become indispensable. It transforms the cacophony of Twitter into a manageable list of stories scored by social signals. Until now we’ve kept this tool relatively hidden, but we figured, what the fuck? In the spirit of sharing the riches of the Internet, we’re opening up our data mine to the coal workers of the web. Digg through it with caution. (See what I did there?)

4) Friends, family and insignificant others.
This category is reserved as a catch-all for the stories our cool aunts email to us from a local newspaper, or the great video our friend from college uploaded to our Facebook timeline. It may be a truism in this digital age of media, but it bears repeating that social networks (dark and otherwise) fuel an incredible amount of discovery and traffic — even for readers as savvy as Digg editors.

5) Our very smart brains.
We got into this business because we love the news. We love reading great journalism and discovering the coolest, most interesting stuff the Internet has to offer. And there’s nothing that makes us happier than discovering and promoting a small blog or under-subscribed YouTuber.
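To make item 3 a little more concrete, here is a minimal sketch of what "a manageable list of stories scored by social signals" could look like. This is not Digg's actual algorithm: the `Story` fields, the weights, and the `social_score` and `rank_stories` helpers are all hypothetical, chosen only to illustrate the idea.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Story:
    url: str
    tweets: int      # hypothetical signal: tweets linking to the story
    retweets: int    # hypothetical signal: retweets of those tweets
    favorites: int   # hypothetical signal: favorites on those tweets


# Illustrative weights only; a real system would tune these against data.
WEIGHTS = {"tweets": 1.0, "retweets": 0.5, "favorites": 0.25}


def social_score(story: Story) -> float:
    """Collapse several social signals into a single number."""
    return (
        WEIGHTS["tweets"] * story.tweets
        + WEIGHTS["retweets"] * story.retweets
        + WEIGHTS["favorites"] * story.favorites
    )


def rank_stories(stories: list[Story], limit: int = 10) -> list[Story]:
    """Return the highest-scoring stories first, capped at `limit`."""
    return sorted(stories, key=social_score, reverse=True)[:limit]


# Made-up example data.
stories = [
    Story("https://example.com/c", tweets=5, retweets=2, favorites=1),
    Story("https://example.com/a", tweets=120, retweets=40, favorites=80),
    Story("https://example.com/b", tweets=30, retweets=200, favorites=10),
]

for story in rank_stories(stories):
    print(f"{social_score(story):7.2f}  {story.url}")
```

A weighted sum is about the simplest scoring rule there is. The point is only that many noisy signals get reduced to one sortable number, which an editor can then scan from the top down.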

What Happens To A Link Once It’s Been Selected By An Editor?

Our driving purpose as editors at Digg is to promote high-quality content — to send precious clicks away from our own site to deserving publishers and video producers. As such, we add a little window dressing to a story to make sure that we’re giving you, the reader, the full pitch.

We might change a headline (no offense to the original writer):

Our story:

Their story:

We also might change the image — as we did in this example — or the description of a story to make sure it works for our audience. You know, different strokes for different folks.

What Are Those Little Captions Above The Headline And Why Are They So Gut-Bustingly Hilarious?

They’re called kickers. You can follow them on Twitter here. We occasionally use this space as a way for editors to communicate important information to our readers (when a story is developing or breaking, for example). We mostly use this space to exercise our pun muscle. Here are two of my all-time favorite kickers.

Do Digg Editors Just Curate Other People’s Content? Don’t They Write Anything Themselves?

Glad you asked! We’ve been doing a lot more than just curating the best of the web for the front page of Digg. In fact, there’s a tag for that!

Dan Fallon has been summarizing long-form articles for you in his weekly Long-Reader’s Digest column. Steve Rousseau, apart from performing editorial magic as our Features Editor, brings you the best/weirdest news of the week in his recurring series: What We Learned. There’s also Bryan Menegus who, besides running the video page, also creates original video for Digg and puts together Videos of Substance for you to watch and feel good about. If politics are your thing, check out Ben Goggin’s Enthusiasm Gap, a weekly roundup of the circus that is political life (complete with a weekly conspiracy theory). Veronica de Souza, our Social Media Editor, performs the impossible and entices our Tumblr audience to leave their dashboard and come to Digg. To find out how she does this, read this post. If you’re a cool teen, a) thanks for reading this whole blog, and, b) check out what most certainly can be called works of art created by Joe Tonelli for Digg’s Snapchat account.

TL;DR: We’re human and we’re proud of it. Digg editors are here to bring you the best of the Internet and highlight the stuff that most robots would miss (low batteries are such a bummer). Pretty soon, we’ll put some of this editorial power back into the user’s hands, but that’s a blog post for another day…

If you’ve got any feedback for me or my team, I’d love to hear it! Leave a comment below or write to me at anna@digg.com.

-Anna

How We Ended Up With Digg

Many of you might be too young to remember this photo:

It was taken when Digg was the top dog on the Internet. It had a healthy community that shared interesting content. There was robust discussion around a wide range of topics. A lot has happened since then. And the death of Digg has certainly been discussed in comment sections across the web.

Digg still exists?

If we could sum up how a lot of people think of Digg now in one tweet it would look like this:

Yes, Digg still exists, but under new management. Here’s how that happened.

Let’s travel back in time to the summer of 2012. My team and I were working on a project called News.me. We started News.me as a prototype at the New York Times R&D lab a few years earlier but took the project to Betaworks to work on it full time. News.me had grand ambitions, but ultimately became a very simple product - a daily email that would give you the most-shared articles on your Twitter stream.

We were two years into working on News.me but had fewer than 100k users — which is not enough for a company. The team was in the planning phase for what would have probably been our last shot at a new set of features.

Then, in early June of 2012, John Borthwick, the CEO of Betaworks, called me into a conference room.

“What do you think about taking over Digg?” he asked.

“Digg still exists?” I asked. “Wow, you are nuts!” I remember thinking.

As John was talking about the possibility of taking over Digg, I was picturing the past few years of Digg’s existence. After the infamous “v4” launch in 2010, users revolted and left for Reddit, Twitter, Facebook and other places across the Internet.

In that meeting, John talked about how strong the Digg brand was. He said we were going to make an offer to buy the brand (once worth $200 million), a Twitter account with a million followers and whatever audience was still left using Digg. The more I thought about it, the more excited I got. I mean, how many opportunities do you have to try to revive a once thriving then dying tech brand? Even if we burned it to the ground, it would still be an incredible challenge and opportunity.

I quickly went from “WTF” to “Yes, let’s do it!”

There was only *one* caveat: We only had six weeks to rebuild it! At the time, Digg had 700 physical servers/computers used to power digg.com in a data center somewhere in the Bay Area (remember, Digg came about in a pre-Amazon AWS world). In exactly six weeks, their data center contract was going to expire and we’d have to take over the contract at about $200,000 a month.

We had a few options: Leave the site up for a month or two and pay the $200-400K to keep it going while we built the new site OR build whatever we could in six weeks and flip the switch over to the new site at the last minute. Paying an extra $200-400K didn’t make sense (and we like a challenge!) so we chose to rebuild the thing in six weeks.

Building the new Digg

As excited as I was, I kept asking myself what I had just signed up for. Our to-do list was massive. We needed to build:

A whole new site
An iPhone app and mobile web site
A working Content Management System
A way for Digg users to download all their old diggs, comments, submissions

Because of the six-week deadline, we had to scrap all the community features. We felt we should shelve those features until we had more time to figure it out. (Hint, hint: The time has come!)

The deal wasn’t yet public so we had to keep it a secret, which is where our codename came in. We’re all big Arrested Development fans, so we called Digg “the banana stand.”

“There’s always money in the banana stand.”

Our lawyers had a great sense of humor. As we were busy building, they were handling the sale of Digg and created a new company for us called “bananastand inc.”

In order to get this done in six weeks, we had to build up our team. We hired an editorial team and a few crack engineers and went from six to 13 people in two weeks. We had our work cut out for us. Here’s more or less how we spent those next 42 days:

(This chart is definitely not approved by our design director)

On July 31st, at the 11th hour, there were still a number of bugs in the new Digg, but the old Digg was going to be powered off in a matter of hours. There was no going back, so we put the new Digg out and flipped the switch to start sending traffic to it.

Quite the transformation, right?

How’d we do?

With the new Digg live, we were anxious to get feedback on what we had built. Besides being featured in the Apple App Store and on Time’s list of “50 Best Websites for 2012”, we were most proud of what people were saying about the new Digg:

Doesn’t suck? Surprisingly good? We’ll take it!

We’re very proud of what we were able to accomplish in just six weeks. However, we did lose some things in that rush:

Devalued the digg action

Spammers had essentially taken over the old site and were using the digg action to move their shitty posts to the front page. We put an editorial team in place to select stories for the page and cut down on spam, rendering the “digg” useless.

Killed the community

A lot of old Digg users left the site after the infamous v4 update. We wanted to focus on good content first and give ourselves time to figure out how to mix the community back in.

Disabled commenting

The conversations that were taking place on Digg weren’t interesting or helpful. Much like the other community features, we needed time to figure this out.

Where we are now

Since we took over Digg in 2012, we’ve made several updates to the front page, built Digg Reader and brought some of the News.me features over to Digg as Digg Deeper. Our editorial team has been busy finding the best and most interesting content on the Internet, and sometimes writing it themselves. In the past few months, we’ve been working on bringing community features back to Digg, which are coming soon. We’d love to hear your thoughts on this. Leave us a note in the comments below or email us: feedback@digg.com.

-Mike

We Asked, You Answered

Hey, I’m Veronica. I’m part of the team that’s bringing community and conversation back to the new Digg. You might be wondering what we’ve been up to. You might not. Either way, I’m going to tell you! When we rebuilt and relaunched Digg three years ago (ah, #tbt!) we shelved pretty much all of the community features. Our sense was that Digg’s community had almost entirely disappeared. We decided to strike out in a different direction, until the time seemed right to bring conversations back in. A few months ago, our designers and developers started to build.

In the past couple of weeks, lots of people have been talking about what makes a good (and bad) community. For our part, we threw some pointed questions out to Digg’s users, especially since we’re in the middle of building community features. Last week, about 1,500 of you were nice enough to fill out a short survey. The feedback was incredibly helpful and interesting.

Here are a few things we learned from the survey:

- 70% of you are interested in commenting on Digg.
- The second-most-requested feature was commenting and discussion across all Digg users, not just existing connections imported from Twitter and Facebook.
- You want to follow topics.
- You love reading about tech and science.

To me, the question that produced the most interestingly overwhelming response was this one:

In a lot of the Reddit coverage, and in the broader discussion about community and commenting, there’s been an increased focus on the connection between “free speech” and the integrity of a community. What we learned from our survey is that the integrity of community starts with clear community guidelines. What we heard from you is that reasonable rules aren’t a problem, but clarity and consistency of application matter a ton. These are the kinds of things we’re thinking about a lot as we build these new features.

A bunch of you left us some notes at the end of the survey. I’d like to address a few of them:

“If you do go through with creating a comment system, I really urge you to focus on a system that puts an emphasis on a few high quality comments and discussions rather than a raucous free-for-all.”
This is something that we’ve heard a lot! We want to encourage interesting, thoughtful and smart conversation. This is why we’re building something that is not too invasive or in-your-face.

“I would be careful when changing anything on the homepage–the simplicity and layout really work!”
Rest easy, my friend. The front page will remain the same for users who’d rather not participate in a Digg community.

“I love digg, no matter what my friends say.”
Who are your friends? Tell me. Tell me who said mean things to you.

“The propensity to present stories (the repeated “Attack on Titan” trailers come to mind) that seem to be of interest to the Digg editors but it’s not actually “what the internet is talking about.” It feels like an agenda is being forced upon us.”
I’d like to take this opportunity to state that Digg is strictly pro-anime. But seriously: Yes, we love Attack On Titan, but our numbers show that our users do too! This is why we keep posting new trailers. Our editorial team uses a ton of data signals (in addition to their very smart brains) when selecting stories for the front page.

This conversation is far from over. If you missed out on this survey, we still want to hear from you. Join our beta tester list to get updates on our progress. Did we miss something in the survey or in this post? Should we be watching better anime? Do you just want to talk to someone? Email us: feedback@digg.com or leave a comment below.

-Veronica

We’re Building Out The New Digg. Wanna Help?

So the Digg team is working feverishly to design and build a wave of new features and capabilities. We’d like your help! If you’re interested in weighing in, giving us your input and guidance, testing beta products, or just keeping tabs as things get ready for launch, please join Digg’s Beta Testers list. For starters, we’ll send you a basic survey asking questions about how you use Digg and what you’d like to add. (We’re also asking, optionally, where you live, as we’d like to organize a couple of meetups with interested folks.)

It’s time to add conversations, dialogue, and social features back into Digg, and we want to do it in the right way – with your input.

Thanks!

–Andrew

What Are We Doing On Tumblr?

For the past few months my lovely coworkers have been asking me to write a blog post about how Digg uses Tumblr because our blog has been ~*killing it*~ lately. I mean, look at it. I ignored these emails for as long as I could, but here we are.

In 2012, Digg was essentially a one-page site. There were no landing pages beyond the front page, so the point of social media — sorry in advance for all the buzzwords — was to get Digg homepage content in front of unique audiences on different platforms. Actually, that’s still the point of social media at Digg. Sure, traffic is great, but for now it’s an added bonus since we still link out to other websites the vast, vast majority of the time.

So what does this have to do with Tumblr? Everything, actually.

When I started our Tumblr blog two years ago, there was no strategy. I just thought: I am a person who uses Tumblr. If Digg were a person, how would Digg use Tumblr? That’s it. There were no meetings, no analytics, no corporate meddling. Just a blog. And while no, it’s not responsible for a huge percentage of our traffic, we do have a nice following and have had a ton of awesome interactions with the beautifully insane Tumblr community. It’s been a little over two years and I’m happy to report that not one person at Digg has told me that, despite it not being a big traffic driver, Tumblr is not a worthwhile use of my time.

In February, Tumblr accounted for just a fraction of our overall traffic (still more than Twitter!), but that traffic is not the only way to measure success, especially with Tumblr. Instead, like most Tumblr users, I measure success in likes and shares. In fact, one of the luxuries of working at an awesome place like Digg is that using your time to grow something that won’t ever lead to an increase in traffic is totally okay.

That brings me to the reason for even writing this blog post. I’m supposed to talk about what’s “working” for Digg on Tumblr. I can’t even count the number of times I’ve been asked by people how to “hack” or “game” Tumblr. There are a few publishers like Mic, NPR and BuzzFeed (off the top of my head) that have awesome blogs and this is solely because the people who run them are Tumblr users themselves. They understand the language of the ever-changing Tumblr community. So, no, I don’t have any Tumblr “hacks” but I can share a few things that have worked for us with no guarantee that they will work for you:

1) Make sure the person or people who run the blog actually use Tumblr. And I mean really use it (not just say they use it because they signed up for it once). My colleague Joe and I split this responsibility.

2) Screenshots of headlines coupled with GIFs to tease a video (or just screenshots of headlines with no GIF) are the main way we’ve been able to lure users away from their precious dashboards and onto Digg. But again, we consider this an added bonus, not a goal.

3) Have a personality and a sense of humor, goddammit.

4) Tumblr’s explore page just got a nice makeover. Tagging your posts correctly will help people find them. People have also used tags as an extension of the posts themselves. Like this.

5) Take the time to follow real people and reblog cool shit. It’s not all about you!

That’s it! Those are our tips, hacks or whatever else you want to call them. Know your audience and respect the awesome people that make up the Tumblr community before you try to wrangle them up and make them click on your damn links!

<3 Veronica

Introducing Digg TV

A little more than a year old, Digg Video is pretty great. From out of the endless weeds of the Internet we have cultivated a wonderful garden of videos you actually want to see, from timely and important news pieces to bizarre endeavors plucked from uncanny corners of YouTube and beyond.

But what if you want to sit back and enjoy Digg’s expertly curated videos but you DON’T want to get popcorn butter all over your keyboard?

That’s what Digg TV is for.

Videos on Digg now come with a TV Mode button:

That button pulls the video up into the Digg TV player:

Sit back and enjoy your full-screen, autoplaying video experience, organized by channel or your own collection of saved Digg videos.

When you click on Explore, you’ll see the full range of Digg Video’s human-curated, topical channels:

Digital video is getting better — the viewing experience should be too. Most videos are presented amidst clunky interfaces or inline in articles, and, even if you go fullscreen, when your video is over you need to hunt for the next. YouTube playlists are fine, but run up against the exact problem that Digg is built to solve: there is a lot of crap out there. Can someone please curate the good stuff? Thanks for asking, Internet. We can. We did. We have some suggestions.

When you’re feeling serious, check out Documentary, Short Film, or Science. For laughs, click Funny or Cute. When you’re hungry, Food. When curious, try Curious. Feeling handy? How-to. Retro? Histories. In a creative mood? We’ve got videos on Architecture, Art, Books, Culture, Design, and Photography. Need for speed? Cars or Aviation. Looking for action? Sports. (Or Lust). Feeling love for your fellow humans? Cities. Sick of your fellow humans? Nature or Animals. Too tired to type “TMZ” into your browser? We’ve got excellent videos about Movies, TV, and Fame, along with great clips from Late Night. When you’re awash in self-loathing, we recommend Gross. (Or Politics). And when you’re feeling unable to escape how everyone still treats you like an adolescent, perhaps a dose of Animation or Comics.

Watching video online is rapidly chipping away at the mammoth time sink of humanity that is Big Media-provided broadcast television. According to Nielsen (report), over the last year alone millennials watched 20% less traditional TV and 33% more digital video than just the year before. (Whoa, right?) We want to help accelerate that trend by making it much easier to find, and more fun to watch, the videos Digg finds and curates.

Digg TV is still in beta. Tell us what your greedy little eyeballs desire. Some features that may come depending on how much you want them:

· Use your phone as a remote control
· Support more video sources (currently only YouTube and Vimeo are supported)
· Save videos from anywhere on the Internet and watch them on Digg TV
· Integrate with Apple TV, Chromecast, and other streaming dongles and boxes

Digg already has the entire Internet running through our blood. There are many excellent videos out there. Witness them in Digg’s new TV mode.

Digg By The Numbers, 2015 Edition

An Exercise in Dig(g)ital Corporate Nudity

At the start of a new year, it’s tech company tradition/neurosis to do a swan dive into the previous year’s data in search of sunken treasures — patterns or insights that escaped notice in the daily rush of site metrics and KPI reports. I’ve been doing that over the past week, and thought it might be interesting to lift Digg’s hood and show some of our internal numbers from 2014. TL;DR: Digg had a terrific year, accelerating as the months progressed.

By way of background, we’re pursuing an unconventional strategy at Digg. Armed with a data infrastructure and some smart algorithms, Digg has editors — actual human beings — curate the most interesting stories and videos on the Internet and deliver them, as an act of judgment and with some degree of wit, to our users via our homepage, our iOS and Android apps, the Daily Digg email, and a range of social channels. We think of our users as those who speak Internet — people who spend a fair amount of time online, who are curious, and who love news and great writing, as well as eye-catching videos. In short, those who want to learn and explore, and who enjoy both the highs and lows of Internet culture. For these people, the Internet is both wonderful and often utterly overwhelming — an endless scroll of stories, videos, blog posts (e.g., the one you’re reading), tweets, status updates, infographics, shared photos, gifs, alerts, and so on. Digg’s mission is to make sense of it all, to distill that vast daily river down to its most interesting and noteworthy gems. Tackling that problem from multiple angles, we also build awesome automated tools like Digg Reader and Digg Deeper that help our users navigate, manage, and make sense of their Internet.

It’s an amazing era for readers and watchers of creative work on the Internet. There are so many fantastic sites and apps generating great writing and compelling videos. But amid the resulting clamor for audience and attention, we’ve seen plenty of less edifying behavior — clickbaiting, churnalism, cut-and-paste repackaging of others’ work. Those sites tend to crash and burn.

We’re building Digg for the long term. We believe in quality. We’re making a bet that the future lies in driving attention by being smart and useful, not conjuring or regurgitating linkbait. Our goal is to send users out to what’s interesting, without cynicism, trickery, or favoritism. So we’ve been asking quantitative questions about the user experience at scale: How many people use Digg, our homepage, the apps, email, Digg Reader, Digg Deeper, social, etc.? How long do they spend on Digg, which parts, and how often? What do they read, what do they click, what do they ignore? When are they active, and where? Why are they at Digg, and what can we do to make it a more useful and enjoyable experience?

On to the 2014 numbers.

The Basic Number: Monthly Active Users

Let’s start with our most important metric. Here’s a graph that shows the growth in monthly active users (MAUs), quarter by quarter since Digg was relaunched by betaworks:

These are MAUs across all our channels (web, mobile, mobile web, email, social), as best we’re able to accurately account for them. We do a quarterly average in order to iron out seasonal changes and determine the overall trend. The Q1 2015 number is based on January performance.
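For the curious, the quarterly averaging described above can be sketched in a few lines of Python. This is only an illustration, not our actual reporting code: the function name `quarterly_average` and the input shape (a dict keyed by year and month) are assumptions, and the numbers in the usage lines are made-up placeholders rather than real Digg figures. A quarter with only one month of data so far, like Q1 2015, simply averages whatever months are available.

```python
from collections import defaultdict


def quarterly_average(monthly_maus):
    """Average monthly active users per calendar quarter.

    monthly_maus maps (year, month) -> MAU count for that month.
    Returns a dict mapping (year, quarter) -> mean of the months present,
    so a partial quarter is averaged over the months reported so far.
    """
    buckets = defaultdict(list)
    for (year, month), maus in monthly_maus.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        buckets[(year, quarter)].append(maus)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Placeholder values for illustration only (not real traffic numbers).
example = {(2014, 10): 90, (2014, 11): 120, (2014, 12): 150, (2015, 1): 100}
averages = quarterly_average(example)
print(averages)
```

Averaging within a quarter is what smooths out a single spiky month; the trade-off is that a partial quarter is only as reliable as the months it contains.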

While I’ve been happy to see strong, steady MAU growth over the year, I’m especially happy that the growth has been organic. While all our channels have been scaling, what’s notable is that direct traffic continues to dominate over search, social, and referral. Even at current scale, around 70% of our web traffic comes direct. This is a testament both to the Digg brand and to our ability to build products that users want to share with their friends. Digg in 2015 represents what the Internet is talking about, and a lot of people type our domain into their browsers and search engines each day. This was all accomplished without marketing: other than a few thousand dollars spent on one-off Facebook experiments, growth has been organic.

Social traffic

We get a lot of inbound traffic from Facebook and Twitter, and we drive a lot of activity there too. In just the last week, for example, about 600,000 Facebook users liked, commented, shared, or clicked on our posts. Since January 1, we’ve seen more than 3.5 million clicks from Facebook. Much of this has been on Digg videos. Generally, we’ve been getting better at Facebook, as you can see. You’ll also note that (a) during our first year, our Facebook activity was so small, it’s barely visible on the chart below, and (b) December 2013 was a huge outlier month, right after we launched Digg Video, to what seemed to be ardent (but fleeting) love from Facebook’s newsfeed algorithm.

Our total Facebook reach in the past week was slightly more than 7 million users — all organic. Over on Twitter, we’ve got a healthy 1.5 million followers.

Tumblr is also a significant social channel for us. We launched Digg’s blog on Tumblr in December 2012. By December 2014, we had 711,700 active users (“curators” in Tumblrese, meaning users who have liked, shared, or commented on a Digg post in that month); so far in January, we’re at 788,031. Tumblr’s often regarded as a pretty contained ecosystem, so we’re particularly proud of our success there.

On Vine, we have 24,400 subscribers; across Vine, we’ve triggered 30,849,284 loops.

Digg Video tends to do disproportionately well on social networks, confirming the truism that arresting video is the most share-worthy thing on the Internet. The all-time biggest video on Digg — “What You Get When You Pour Molten Aluminum Into An Ant Hill” — garnered more than 32 million views, most of them triggered by Facebook shares. Just this week, Digg Video has gotten more than 1 million views on an insane archery video that, I have to stress, you must watch. (Seriously, stop reading this for a minute and watch that video. Lars Andersen is a beast.)

As an aside, these types of super-performing videos are responsible for some of the month-to-month peaks and valleys of traffic on Digg. In December 2013, the aluminum anthill video pushed Digg’s MAUs up over 15 million; in October 2014, a couple of strong videos brought MAUs back over 10 million; and it looks like this month, January 2015, will also end well above 10 million.

On a related note, we’ve been doing a bunch of experiments with original Digg Videos to suss out what our audience finds compelling. For example, “Every Onscreen Death in Game of Thrones, In Under 3 Minutes.” So far, we’ve racked up 3.7 million views across 15 videos on YouTube. We’ve also been testing new formats, like infographics and a series of tappable essays on things like “How to Beat Jet Lag (And Why You Get It)” and “What Cavemen *Actually* Ate on the Paleo Diet”.

Traffic and Engagement Numbers

Digg is a complicated site to measure. On the one hand, we want to send our users efficiently elsewhere, as rapidly as they find interesting things to read or watch. On the other, we want them to stick around, and return often. The Digg homepage is something users check frequently, but its goal is to send them away; Digg Reader and Digg Deeper are products users keep open all day. There’s a similar dynamic with our mobile and tablet apps. So we look to a bunch of different metrics, gathered by multiple analytics tools.

We track an internal metric of “reads,” which combines clicks (web, mobile, email, but not social) plus the number of stories read in Digg Reader. In 2014, we enabled between 2.5 and 3 million reads a day, give or take.

We track user engagement. Among Digg’s web users, the average session duration in 2014, measured by Google Analytics, was 4 minutes and 16 seconds (i.e., not bad!).

We pay close attention to how users get to Digg. Our ~70% direct traffic rate has remained pretty consistent even as Digg has more than tripled its MAUs.

And a fundamental metric for Digg is user loyalty. Looking at Chartbeat, 2014 visitors to the Digg homepage, video pages, and Digg Reader were 51.5% “loyal users,” 30.9% “returning users,” and 17.5% “new users.” (A “loyal user” is a user who has visited Digg 8+ of the last 16 days; a “returning user” is a user who has visited Digg fewer than 8 times in the last 16 days; and a “new user” is visiting Digg for the first time in the past 30 days.)
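To make those three buckets concrete, here is a minimal Python sketch of the definitions as stated above. It is not Chartbeat’s implementation, which we don’t have visibility into: the function name `classify_visitor`, the idea of representing a user as a set of visit dates, and the choice to count the current day as part of the 16-day window are all assumptions made for illustration.

```python
from datetime import date, timedelta


def classify_visitor(visit_days, today):
    """Bucket a visitor as 'new', 'returning', or 'loyal'.

    visit_days: set of dates on which the user visited (today included).
    Assumed rules, following the definitions in the text:
      - new: no visit in the 30 days before today
      - loyal: visited on 8 or more of the last 16 days (today counted)
      - returning: anything else
    """
    thirty_days_ago = today - timedelta(days=30)
    had_prior_visit = any(thirty_days_ago <= d < today for d in visit_days)
    if not had_prior_visit:
        return "new"

    window_start = today - timedelta(days=15)  # 16 days including today
    active_days = sum(1 for d in set(visit_days) if window_start <= d <= today)
    return "loyal" if active_days >= 8 else "returning"


today = date(2014, 12, 31)
daily_reader = {today - timedelta(days=i) for i in range(8)}
print(classify_visitor(daily_reader, today))
print(classify_visitor({today, today - timedelta(days=3)}, today))
print(classify_visitor({today}, today))
```

Under these assumptions, someone who came by on each of the last eight days lands in the loyal bucket, someone with one other recent visit is returning, and someone whose only visit in the window is today counts as new.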

Digg Deeper

One of our most interesting product launches in 2014 was Digg Deeper, which turns a social stream like Twitter into a high-value list of the very most-shared links among your friends, updated in real time. Effectively, Digg Deeper is your Twitter friends recommending the links they collectively think you should pay attention to. Feature-wise, it’s pretty great. Personally, I love being able to see my friends’ tweeted comments on each story. Rather than scroll endlessly to read disjointed tweets about a given story, Digg Deeper pulls it all together into a cohesive and digestible snapshot. Since we launched in August, sign-ups have been solid and steady, though not huge — more than 150k in all. Digg Deeper users do, however, tend to be obsessive.

Revenue & How to Grow It

A major Digg workstream for 2014 was to get consistent revenues from advertising. By December, we were doing exactly that. We have been very deliberate about monetization — we want to do it in a way that fits our user experience, delivering something genuinely valuable to users. Our primary ad unit is a sponsored post called “Startups We Digg,” “Apps We Digg,” “Pants We Digg,” “Groceries We Digg,” etc., depending on the thing being sold. It’s what some would call a native ad, meaning that it fits into the look and feel and editorial tone of the site (though clearly marked as an ad, of course!). We run only one sponsored post a day, and it’s always a product or service that the Digg team is genuinely into. (Check them out.) They are primarily sold as performance-based ads, aimed at companies selling products, services, or subscriptions that fit the Digg audience of Internet lovers, media junkies, and early adopters.

In 2015, the challenge is to couple user and product growth with the right monetization experience, one that moves with the grain of our product experience — one that speaks Internet, and that scales. Specifically, we aim to grow revenues overall (more than 3x); to introduce new ad options, particularly for brand and entertainment advertisers; to test and decide on mobile and email ad options; and to prove that we can scale without dropping in quality or messing up Digg’s clean and uncluttered user experience.

The Return of The Digg Effect

The Digg Effect describes the surge in traffic that hits a publisher when one of its stories makes it to the Digg homepage. (It’s a flavor of the Slashdot Effect.) More broadly, that dynamic is what makes Digg, with its cross-cutting, low-cost, high-leverage curation model, a valuable contributor to the online publishing ecosystem. People visit Digg to see what’s interesting and noteworthy from across the Internet; in turn, publishers that produce great work benefit from a wave of readers who might not otherwise have seen it.

Though we’ve certainly not (yet) returned to the towering heights of the old Digg circa 2007, publishers are once again noticing a potent Digg Effect. Though most publishers don’t talk publicly about their traffic stats, we saw a steady flow of posts and Tweets noticing its return. Here are a few:

The “Digg effect” is back. — MarketingLand

You might be surprised to learn that content site Digg was our third largest source of traffic [this] year, surpassing Reddit and Google Search. …. Under Betaworks’ stewardship, editors and algorithms (instead of users) now choose which articles are on the front page. Digg has a clean design, very interesting articles, and appears to be blossoming in its second act. — Priceonomics

Wow, really interesting to see: Being on the front-page of @digg still drives great traffic. — Leo Widrich

Can I just say that I’m blown away that @digg is the third largest referrer to my Thoughts on Google+ post? — Chris Messina

You can always tell when Digg or someone similar picks up on a @mosaicscience piece — traffic shoots up 500%. — Mun-Keat Looi

Traffic from @digg on this piece about Vegas’ changing blackjack rules suggests you’re all secret gambling addicts. — Nicholas Jackson

Users + Engagement (& How To Grow Them)

Looking ahead, 2015 is going to be a year marked by scaling and major product launches. We’ve built a data platform and a custom content management system called Canvas that undergirds a fairly efficient and high-leverage business. We have a low cost-per-read / cost-per-view / cost-per-click — a thin layer of editors sitting atop an awesome set of social-data-rich crawling, sorting, ranking, scoring, and flagging tools. To give you a sense of scale, Digg’s editorial team consists of just six people who, as a team, work nearly 24 hours a day, 7 days a week. In 2014, that small team curated 22,013 homepage stories, 3,344 videos, and 227 originally-written pieces.

So even with the expense of editors on top of our data infrastructure and CMS, we have figured out how to effectively curate a vast and overwhelming Internet for a sizable and coherent audience. With the benefit of our existing technology stack, we think our model of [editors + algorithms] is one that can efficiently be scaled to other languages and other parts of the world. And so that’s one thing we’re going to pull off this year.

We live in a world where there is a vast oversupply of things to read, creators clamoring for your attention. One thing that is scarce, and highly valuable, is awesome curation. Digg provides awesome curation, plus clean design, a witty voice, and a bunch of useful tools and products.

To become more useful to our users — more enjoyable as well as more essential — we’re starting to incorporate social features, bringing some of the best dynamics of the old Digg back into the new Digg. With Digg Deeper, we’ve now started to bake personal connections and conversations back into our products. We’ve heard from many readers that they want to be able to see what their friends are digging, and to spark or join conversations with friends about those links. The evolution of Digg in that direction is going to accelerate dramatically in 2015. The trick, as always, is to add feature depth to Digg without overcomplicating the simple design that’s appealed to our readers.

In sum: Lots to come; watch this space.

P.S.: Can’t fail to mention this: We’re hiring! Android lead, mobile dev, front-end, back-end, platform engineers, dev/ops, editorial, revenue/sales, and more. Check out the Digg Jobs page.