5 Models for Engineering Personalized Digital Experiences (Part 3 – Facebook)

This article is the third of a three-part series on algorithms and predictive models for creating personalized digital experiences. Part 1 includes the introduction and case studies from Amazon and Netflix, while Part 2 provides case studies from Spotify and Pinterest.

In the final installment of this series, and saving the best for last, we’ll tackle Facebook’s infamous News Feed algorithm.

Facebook's News Feed

With 79% of American online adults using Facebook, the social network’s News Feed algorithm is probably the one people are most used to interacting with.

The importance of Facebook’s algorithm can be summed up in two quotes.

Joaquin Candela, Director of Engineering for Applied Machine Learning at Facebook, has stated, “Facebook today cannot exist without AI. Every time you use Facebook or Instagram or Messenger, you may not realize it, but your experiences are being powered by AI.”

And CEO Mark Zuckerberg said, all the way back in 2014, “Our goal is to build the perfect personalized newspaper for every person in the world. We’re trying to personalize it and show you the stuff that’s going to be most interesting to you.”

A quick snapshot of my News Feed.

To create that personalized experience, the Facebook algorithm draws on over a hundred data sources to assign a relevancy score to each piece of content a user could see in their News Feed, attempting to predict what each user would be most interested in.

Now, let’s dig into some of the larger data points.

Engagement and Time Spent

Since 2009, the Like button has been a way to gauge interest in content on Facebook. Another signal of interest is the set of reaction buttons Facebook unveiled about a year ago. Along with commenting and clicking, these actions register engagement and are among the data inputs that influence personalization on Facebook. However, they can also be deceptive.

For instance, someone may read a sad post that’s important to them, but they are unlikely to like or comment on it.

To solve this problem, Facebook registers the time someone spends on an article after clicking it. It also registers whether someone likes a post before or after reading it, since Facebook found that a like given before reading has a much weaker tie to actual sentiment than one given afterward.

As a result, in June 2015, Facebook began prioritizing stories users spend more time viewing. In an article for Slate, Max Eulenstein, a product manager, explained how time spent is measured: “It’s not as simple as, ‘5 seconds is good, 2 seconds is bad’. It has more to do with the amount of time you spend on a story relative to the other stories in your News Feed.” The notion of relativity is important because variables like internet or wireless connection speeds can influence how long articles take to load, and thus time spent.
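To make the relativity idea concrete, here is a minimal sketch, not Facebook’s actual implementation, that scores each story’s dwell time against the median dwell time in the same session, so a slow connection that inflates every load time doesn’t skew the signal:

```python
from statistics import median

def relative_dwell_scores(dwell_times):
    """Score each story's dwell time relative to the session median.

    `dwell_times` maps a story ID to the seconds spent on it after a click.
    Scores above 1.0 mean the user lingered longer than usual; below 1.0,
    less. Normalizing within the session dampens effects like slow page
    loads that inflate absolute times for every story equally.
    """
    baseline = median(dwell_times.values()) or 1.0
    return {story: seconds / baseline for story, seconds in dwell_times.items()}

# Example: the long read stands out even though every absolute time is modest.
print(relative_dwell_scores({"post_a": 4.0, "post_b": 35.0, "post_c": 6.0}))
```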

Engagement with video is determined in a similar fashion. Not all users are going to like or comment on a video, but the News Feed algorithm factors in actions like turning up the volume, enabling HD, or watching a video to completion as signals of positive sentiment.
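As a rough illustration of how implicit video interactions might be folded into a positive-sentiment signal, here is a hedged sketch; the event names and weights are assumptions, not Facebook’s:

```python
def video_engagement_signal(events, watch_fraction):
    """Turn implicit video interactions into a rough positive-sentiment score.

    `events` is a set of interaction names observed during playback and
    `watch_fraction` is the share of the video actually watched (0.0-1.0).
    The weights here are illustrative only.
    """
    weights = {"unmuted": 0.2, "enabled_hd": 0.2, "fullscreen": 0.1}
    score = sum(weights.get(event, 0.0) for event in events)
    score += 0.5 * watch_fraction          # completion is the strongest cue
    return min(score, 1.0)

print(video_engagement_signal({"unmuted", "enabled_hd"}, watch_fraction=0.9))
```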

In all of these instances, the positive signals are fed into Facebook’s algorithm to determine relevancy scores for each piece of content that could be delivered to each user.

Relevancy Score 

The Facebook algorithm assigns a score to every piece of content a user could see based on how likely the user is to interact positively (or negatively) with it. The score is determined through a number of factors, including previous engagement and whether the post comes from a person rather than a followed Page. This score, known as the relevancy score, is the crux of the Facebook algorithm and its personalization. It is also specific to each user and to each piece of potential content for that user.

Once every possible piece of content available for a user is given a relevancy score, Facebook employs a sorting sub-algorithm to order all of the posts into the sequence they’ll appear in within the user’s News Feed. Therefore, the post you see at the top of your feed is the one the Facebook algorithm believes you are most likely to engage with out of all the others you could possibly see.
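Here is a simplified sketch of what per-user scoring and sorting could look like in principle; the feature names, weights, and linear scoring function are illustrative assumptions rather than Facebook’s actual model:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    post_id: str
    features: dict  # e.g. {"from_friend": 1.0, "past_engagement": 0.7}

def relevancy_score(candidate, user_weights):
    """Weighted sum of per-user feature weights; purely illustrative."""
    return sum(user_weights.get(name, 0.0) * value
               for name, value in candidate.features.items())

def rank_feed(candidates, user_weights):
    """Score every candidate post for this user and sort best-first."""
    return sorted(candidates,
                  key=lambda c: relevancy_score(c, user_weights),
                  reverse=True)

feed = rank_feed(
    [Candidate("photo_from_friend", {"from_friend": 1.0, "past_engagement": 0.8}),
     Candidate("page_link", {"from_friend": 0.0, "past_engagement": 0.3})],
    user_weights={"from_friend": 0.6, "past_engagement": 0.4},
)
print([c.post_id for c in feed])  # the friend's photo ranks first
```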

Relevancy scores are continually being tweaked as well. Take, for instance, negative engagement such as hiding a post, which was originally weighted the same way for every user. In an interesting twist, Facebook’s data scientists discovered that a small proportion of users (5 percent) were doing 85 percent of the hiding. As noted in the aforementioned Slate article, “When Facebook dug deeper, it found that a small subset of those 5 percent were hiding almost every story they saw—even ones they had liked and commented on. For these ‘super-hiders,’ it turned out, hiding a story didn’t mean they disliked it; it was simply their way of marking the post ‘read,’ like archiving a message in Gmail.”

Consequently, the algorithm has been fine-tuned to identify these users and account for their behavior.
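One plausible way to account for super-hiders, sketched below with made-up thresholds, is to detect users whose hide rate is extreme and discount the hide signal for them:

```python
def hide_signal_weight(stories_seen, stories_hidden, hides_of_liked):
    """Down-weight 'hide' as a negative signal for suspected super-hiders.

    If a user hides nearly everything, including posts they liked or
    commented on, hiding is probably a 'mark as read' habit rather than
    genuine dislike, so it should count for much less. Thresholds are
    illustrative, not Facebook's.
    """
    if stories_seen == 0:
        return 1.0
    hide_rate = stories_hidden / stories_seen
    if hide_rate > 0.8 and hides_of_liked > 0:
        return 0.05   # near-universal hider: treat hides as archiving
    if hide_rate > 0.5:
        return 0.3    # heavy hider: partially discount the signal
    return 1.0        # typical user: a hide means what it says

print(hide_signal_weight(stories_seen=200, stories_hidden=190, hides_of_liked=12))
```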

Another way the algorithm and relevancy scores are constantly optimized is through the Feed Quality Program. The program uses a Feed Quality Panel, whose members manually rank the posts that interest them; those rankings are then compared to how the algorithm sorts the same content to see how accurately the algorithm is performing. Additionally, Facebook runs one-off surveys asking users to select the post (out of two options) that is most interesting to them and checks the results against what the algorithm would predict.
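A simple way to measure how closely the algorithm’s ordering tracks a panelist’s hand ranking is pairwise agreement, a close cousin of Kendall’s tau; the sketch below is illustrative, not Facebook’s evaluation code:

```python
from itertools import combinations

def ranking_agreement(panel_order, algorithm_order):
    """Share of post pairs the algorithm orders the same way as a panelist.

    Both arguments are lists of post IDs, best-first. 1.0 means the
    algorithm's sort matches the human ranking on every pair; 0.5 is no
    better than chance.
    """
    algo_pos = {post: i for i, post in enumerate(algorithm_order)}
    pairs = list(combinations(panel_order, 2))
    agree = sum(1 for a, b in pairs if algo_pos[a] < algo_pos[b])
    return agree / len(pairs) if pairs else 1.0

# The algorithm agrees with the panelist on 2 of the 3 possible pairs.
print(ranking_agreement(["p3", "p1", "p2"], ["p3", "p2", "p1"]))
```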

Timeliness and Authenticity

Two additional factors for Facebook’s algorithm worth noting are timeliness and authenticity.

It is in Facebook’s best interest to serve users content that is both relevant and recent. As such, posts that are viable candidates for relevancy scoring and sorting must have been published within a particular time range of when a user opens Facebook.

Earlier this year, Facebook also updated the algorithm to take into account how signals change in real time. Facebook data scientists explained how this works in a recent blog post: “if there is a lot of engagement from many people on Facebook about a topic, or if a post from a Page is getting a lot of engagement, we can understand in real-time that the topic or Page post might be temporarily more important to you, so we should show that content higher in your feed.”

The example they provide: if your favorite soccer team just won a game, Facebook may show posts about the game higher up in your News Feed because people are talking about it more broadly across Facebook.
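A toy version of combining recency with real-time engagement might look like the following; the half-life and boost sizes are assumptions for illustration only:

```python
import math

def timeliness_multiplier(post_age_hours, topic_engagement_rate, half_life_hours=24.0):
    """Boost or dampen a relevancy score based on freshness and live engagement.

    `post_age_hours` is how old the post is when the user opens the feed;
    `topic_engagement_rate` is a 0-1 measure of how much the topic or Page
    post is being engaged with right now across the network. The half-life
    and boost sizes are illustrative assumptions.
    """
    recency = math.exp(-math.log(2) * post_age_hours / half_life_hours)
    trending_boost = 1.0 + 0.5 * topic_engagement_rate
    return recency * trending_boost

# A fresh post about a match everyone is discussing gets pushed up;
# a two-day-old post on a quiet topic falls back.
print(timeliness_multiplier(post_age_hours=1.0, topic_engagement_rate=0.9))
print(timeliness_multiplier(post_age_hours=48.0, topic_engagement_rate=0.1))
```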

Within the same update, Facebook took steps to combat fake news and determine the authenticity of posts. Again, as the Facebook post notes, “To do this, we categorized Pages to identify whether or not they were posting spam or trying to game feed by doing things like asking for likes, comments or shares. We then used posts from these Pages to train a model that continuously identifies whether posts from other Pages are likely to be authentic. For example, if Page posts are often being hidden by people reading them, that’s a signal that it might not be authentic.”
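The approach described there, labeling posts by whether their Page was categorized as spammy and then training a model on behavioral signals such as hide rates, could be prototyped along these lines; the features, data, and use of scikit-learn here are illustrative assumptions, not Facebook’s pipeline:

```python
from sklearn.linear_model import LogisticRegression

# Each row: [hide_rate, like_bait_phrases, share_rate] for a Page's posts.
# The numbers are made up for illustration.
X_train = [
    [0.70, 3, 0.01],   # posts from Pages flagged as gaming the feed
    [0.55, 5, 0.02],
    [0.05, 0, 0.10],   # posts from Pages with ordinary behavior
    [0.08, 0, 0.07],
]
y_train = [0, 0, 1, 1]  # 0 = likely inauthentic, 1 = likely authentic

model = LogisticRegression().fit(X_train, y_train)

# Score a new Page post; a high hide rate is a strong inauthenticity cue.
print(model.predict_proba([[0.60, 2, 0.015]])[0][1])  # P(authentic)
```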

Some machine learning is also used to determine content authenticity. A model called World2Vec (not to be confused with Word2Vec) helps Facebook tag every piece of content with information like its origin and who has shared it. With this information, Facebook can understand the sharing patterns and qualities of fake news stories and use machine learning to identify them.

In the end, if a post is deemed likely to be authentic, it may also show up higher in a user’s Feed.

Thank you

Thanks for reading! Feel free to connect with me on Twitter and/or LinkedIn.

You can also join my newsletter to receive the latest posts, digital marketing news, a marketing book recommendation, and a custom playlist every month.