Machine learning for Facebook Newsfeed

Topic: Machine learning for Facebook Newsfeed

Interviewer: 张小南

Interviewee: 恰恰

Level: L5 (Senior)


Mock System Design Interview Summary

Interview Overview

Date: 4/17/2022

Target level: L5

Duration: 45 minutes

Topic covered: ML for FB newsfeed

Drawing tool used: excalidraw.com

Requirements

Functional requirements

FB newsfeed

1. No queries

2. No ads

Problem formulation

Improve relevance by recommending different content: images, videos, text, live streams

Candidate generation: return relevant items

Data and metrics

Labels: clicks, likes, comments, shares, reposts

De-noising of labels

Offline metrics: MAR@k, AUC, MAP@k, F1, MRR

Online metrics: number of clicks/comments/shares/reposts per user session
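To make two of the offline metrics above concrete, here is a minimal sketch of MAP@k and MRR. The function names and the toy item IDs are illustrative only, and AP@k is normalized by min(number of relevant items, k), which is one common convention among several.

```python
from typing import Iterable, Sequence, Set


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """AP@k for one user: sum of precision@i at each rank i holding a relevant item."""
    hits = 0
    precision_sum = 0.0
    for i, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / i
    # Assumed convention: normalize by min(|relevant|, k).
    denom = min(len(relevant), k)
    return precision_sum / denom if denom else 0.0


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is present."""
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / i
    return 0.0


def mean_over_users(per_user_values: Iterable[float]) -> float:
    """MAP@k or MRR is the mean of the per-user values."""
    values = list(per_user_values)
    return sum(values) / len(values) if values else 0.0
```

MAP@k is then `mean_over_users` applied to the per-user AP@k values, and MRR is the same mean applied to the per-user reciprocal ranks.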

Feature engineering + pre_processing + multi-modal database

User feature: user ID, preference/tags, user embedding

Author feature: author ID, preference/tags, embedding

Newsfeed feature: newsfeed id, content of newsfeed (text, image, video ), engagement of newsfeed over past 1 hour, 24 hours, 7 days

Context feature: time of the day, day of week, device, holidays

User newsfeed cross feature: similarity between user and newsfeed

User author cross feature

Two-tower model: dense model to create embeddings. Minimize a loss under which a positive user-item pair is ranked higher than unrelated pairs
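A minimal sketch of the pairwise objective described above, assuming NumPy. As a simplification, each "tower" here is only an ID embedding table (real towers would be networks over the user and item features), and the loss is the logistic pairwise loss log(1 + exp(-(s_pos - s_neg))). All sizes, IDs and the learning rate are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_USERS, NUM_ITEMS, DIM = 3, 6, 8
# Simplification: one embedding row per ID stands in for each tower.
user_emb = rng.normal(scale=0.1, size=(NUM_USERS, DIM))
item_emb = rng.normal(scale=0.1, size=(NUM_ITEMS, DIM))


def score(user: int, item: int) -> float:
    """Relevance score = dot product of the two tower outputs."""
    return float(user_emb[user] @ item_emb[item])


def pairwise_step(user: int, pos: int, neg: int, lr: float = 0.1) -> float:
    """One SGD step pushing score(user, pos) above score(user, neg)."""
    u, p, n = user_emb[user].copy(), item_emb[pos].copy(), item_emb[neg].copy()
    margin = u @ (p - n)
    loss = np.log1p(np.exp(-margin))
    grad = -1.0 / (1.0 + np.exp(margin))  # d loss / d margin
    user_emb[user] -= lr * grad * (p - n)
    item_emb[pos] -= lr * grad * u
    item_emb[neg] += lr * grad * u
    return float(loss)


# Toy (user, engaged item, unrelated item) triples.
triples = [(0, 1, 2), (1, 3, 4), (2, 5, 0)]
losses = []
for _ in range(200):
    losses.append(sum(pairwise_step(*t) for t in triples) / len(triples))
```

After training, the engaged item scores higher than the unrelated one for each toy user, which is the ranking property the loss is meant to enforce.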

Q: Why use ID as a feature?

A: IDs are treated as categorical features. A user ID carries implicit information; when you embed it, the model can learn a lot of semantic meaning.

Model architecture possibilities

Candidate generation: two-tower model

Ranking:

Logistic regression. Cannot process categorical features well.

GBDT or a neural net for preprocessing, with its output fed into logistic regression

Wide and deep model: loses sequence information

Deep interest model. Can improve on wide and deep, and preserves sequence information

Caveats: positional bias, diversity, cold start problem

Probability of clicking: depends on the relevance of the item and on the position at which it is ranked

We can include position as a feature, along with interaction terms with position, and device. During inference, we can set position to zero.

Or through inverse propensity weighting.
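Both ways of handling positional bias can be sketched briefly, assuming NumPy. The first helper appends the displayed position as a feature during training and zeroes it at serving time. The second computes inverse propensity weights, where the propensity is an estimate of how likely a position is to be examined. The helper names and the propensity values are hypothetical.

```python
import numpy as np


def with_position(features, position: int, serving: bool) -> np.ndarray:
    """Append position as a feature; use 0 at serving so scores are position-free."""
    pos = 0.0 if serving else float(position)
    return np.append(np.asarray(features, dtype=float), pos)


def ipw_weights(positions, propensity) -> np.ndarray:
    """Weight each logged click by 1 / P(position was examined)."""
    return np.array([1.0 / propensity[p] for p in positions])


# Hypothetical examination propensities per position (would be estimated from logs).
example_propensity = {1: 1.0, 2: 0.5, 3: 0.25}
train_row = with_position([0.2, 0.5], position=3, serving=False)
serve_row = with_position([0.2, 0.5], position=3, serving=True)
weights = ipw_weights([1, 2, 3], example_propensity)
```

Clicks at low positions get larger weights because they happened despite the position being rarely examined.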

Diversity. When there are multiple items from the same author, we can downrank some of them.

Maximal marginal relevance: if a candidate is highly relevant to the user, we give it a boost; if it is very similar to an already-displayed candidate, we downrank it
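A minimal sketch of the maximal marginal relevance re-ranking described above, assuming NumPy and L2-normalized item vectors so that a dot product is cosine similarity. The trade-off weight `lam` and the toy data are illustrative assumptions.

```python
import numpy as np


def mmr_rerank(relevance, item_vecs, k: int, lam: float = 0.7):
    """Greedily pick items, trading relevance against similarity to picked items."""
    vecs = np.asarray(item_vecs, dtype=float)
    sim = vecs @ vecs.T  # cosine similarity if rows are L2-normalized
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            # Redundancy = similarity to the closest already-selected item.
            redundancy = max((sim[i, j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Items 0 and 1 are near-duplicates; item 2 is different but less relevant.
order = mmr_rerank([0.9, 0.85, 0.5], [[1, 0], [1, 0], [0, 1]], k=3, lam=0.5)
```

With `lam=0.5` the near-duplicate item 1 is pushed below the less relevant but different item 2.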

Q: Why is diversity important?

A: It avoids the problem of popular items always being shown while unpopular items are never shown.

Q: Relevant items should come close to the top due to similarity calculation.

A: Popular items get more popular. Diversity also helps with the cold-start problem.

Train/test split, model evaluation

Training: one month of data. Use the first 3 weeks for training. Cross validation.
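A minimal sketch of a time-based split over one month of logged events. The notes only say to use the first 3 weeks; holding out the remaining days for evaluation is an assumption here, as are the event format and the field name `ts`.

```python
from datetime import datetime, timedelta


def time_split(events, start: datetime, train_days: int = 21):
    """Events before the cutoff go to training; later events are held out."""
    cutoff = start + timedelta(days=train_days)
    train = [e for e in events if e["ts"] < cutoff]
    held_out = [e for e in events if e["ts"] >= cutoff]
    return train, held_out


start = datetime(2022, 3, 1)
events = [{"id": d, "ts": start + timedelta(days=d)} for d in (0, 20, 21, 27)]
train, held_out = time_split(events, start)
```

Splitting by time rather than at random keeps future interactions out of the training set.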

Random sampling gives easy negative samples. Take 100-500 from the recommended list as negative cases. This can improve precision
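A minimal sketch of the two kinds of negatives. It takes one reading of "100-500 from recommended list", namely items at rank positions 100 to 500 of a previously recommended list, used as harder negatives. That reading, the function name, and the sample sizes are assumptions.

```python
import random


def sample_negatives(catalog, positives, ranked_list, n_easy, n_hard, rng):
    """Easy negatives: random catalog items. Hard negatives: ranks 100-500."""
    pos = set(positives)
    easy_pool = [i for i in catalog if i not in pos]
    # Assumed reading: mid-ranked recommended items the user did not engage with.
    hard_pool = [i for i in ranked_list[100:500] if i not in pos]
    easy = rng.sample(easy_pool, min(n_easy, len(easy_pool)))
    hard = rng.sample(hard_pool, min(n_hard, len(hard_pool)))
    return easy, hard


catalog = [f"item{n}" for n in range(1000)]
ranked = catalog[:600]
positives = ["item0", "item150"]
easy, hard = sample_negatives(catalog, positives, ranked, 5, 5, random.Random(0))
```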

A/B testing

Randomly assign users to control and treatment

Use a partition strategy to divide the social network.
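A minimal sketch of deterministic assignment for an A/B test. Hashing an ID gives a stable pseudo-random bucket. To respect the social graph, the hash can be taken over a cluster ID instead of a user ID, so connected users land in the same arm. The cluster mapping, the experiment name, and the 50/50 split are hypothetical.

```python
import hashlib


def assign_bucket(unit_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Map an ID to a stable fraction in [0, 1) and compare with the share."""
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode()).hexdigest()
    fraction = int(digest[:8], 16) / 0x100000000
    return "treatment" if fraction < treatment_share else "control"


def assign_user(user_id: str, cluster_of: dict, experiment: str) -> str:
    """Randomize at cluster level; fall back to the user ID if unclustered."""
    return assign_bucket(cluster_of.get(user_id, user_id), experiment)


# Hypothetical output of a graph partitioning step.
cluster_of = {"alice": "c1", "bob": "c1", "carol": "c2"}
arm_alice = assign_user("alice", cluster_of, "feed_ranker_v2")
arm_bob = assign_user("bob", cluster_of, "feed_ranker_v2")
```

Users in the same cluster always share an arm, which limits leakage between treatment and control through friend interactions.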

Monitoring and retraining

Online, offline, system metrics. How often to retrain the model depends on requirements and capacity.

Q: How do we deal with cold start? You don’t have labels.

A: Multi-armed bandit. Give the user some items to start with; we can reward the items that are used more.

Or train a model using the features that are shared between mature items and cold start items.

Distillation may help too, but we need to verify through measurement.
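A minimal sketch of the multi-armed bandit idea for cold-start items, using epsilon-greedy as one simple policy (the notes do not say which bandit algorithm was meant). The class name, the epsilon value, and the reward definition (1.0 for an engagement, 0.0 otherwise) are assumptions.

```python
import random


class EpsilonGreedy:
    """Mostly show the best-performing item so far; sometimes explore at random."""

    def __init__(self, items, epsilon: float = 0.1, seed: int = 0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {item: 0 for item in items}
        self.rewards = {item: 0.0 for item in items}

    def mean(self, item) -> float:
        shown = self.shows[item]
        return self.rewards[item] / shown if shown else 0.0

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))  # explore
        return max(self.shows, key=self.mean)  # exploit

    def update(self, item, reward: float) -> None:
        self.shows[item] += 1
        self.rewards[item] += reward


bandit = EpsilonGreedy(["new_post_a", "new_post_b"], epsilon=0.0)
bandit.update("new_post_a", 1.0)  # user engaged
bandit.update("new_post_b", 0.0)  # user skipped
choice = bandit.select()
```

With `epsilon=0.0` the policy is purely greedy; a small positive epsilon keeps giving new items some exposure.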

Q: What can be the bottleneck of the system in the real world?

1. Latency: return content as quickly as possible. To achieve low latency, we use a two-tower model. We can compute the embeddings of all items offline and store them. When a user comes online, we look up the stored embeddings. This trades storage for time.

2. Order of magnitude of items: billions of pieces of content, and millions of users create content. How do we store the content? The general idea is to store it by partition, run our model on the different partitions, then blend the results and rank the final list.
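The two points above can be sketched together, assuming NumPy: item embeddings are precomputed offline and stored per partition, each partition returns its own top-k by dot product with the user embedding, and the partial results are merged. Exact scoring is used here for brevity; at billions of items an approximate nearest neighbor index per partition would be the more likely choice. The shard layout and names are hypothetical.

```python
import heapq

import numpy as np


def top_k_in_shard(user_vec, item_ids, item_vecs, k: int):
    """Score every precomputed item embedding in one partition."""
    scores = np.asarray(item_vecs, dtype=float) @ np.asarray(user_vec, dtype=float)
    top = np.argsort(-scores)[:k]
    return [(float(scores[i]), item_ids[i]) for i in top]


def retrieve(user_vec, shards, k: int):
    """Blend per-partition results and keep the global top-k."""
    candidates = []
    for item_ids, item_vecs in shards:
        candidates.extend(top_k_in_shard(user_vec, item_ids, item_vecs, k))
    return [item for _, item in heapq.nlargest(k, candidates)]


# Two toy partitions of precomputed item embeddings.
shards = [
    (["a", "b"], [[0.9, 0.0], [0.1, 0.0]]),
    (["c", "d"], [[0.5, 0.0], [0.95, 0.0]]),
]
result = retrieve([1.0, 0.0], shards, k=2)
```

Each partition only needs to return k items, so the merge step stays cheap no matter how large the partitions are.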

Non functional requirements

System Design

Key algorithms

Collaborative filtering

Semantic based filtering

How to measure relevance:

Likes, comments, shares, reposts

We know about the meta data

Two-tower model to find which items among all candidates are most relevant to recommend

External APIs

System design

Interviewer and Audience Feedback

Interviewer:

Clear outline

Asked for feedback. It’s an open question, lots of solutions. As long as you present multiple options and discuss tradeoffs, it will be fine.

Checked whether the solution was applicable. It’s fine to ask if the direction is OK and whether to continue.

Improvement:

Some solutions are overly complex

Usually ML questions are phrased vaguely. Some ML questions need clarification

I started with vague, high level solution

Interviewee dived too deep into complex solutions.

Interviewer wants to hear a high level system design.

Need more research/homework on the product. Some products come up often.

Facebook newsfeed: what is special about it?

Sometimes it’s not as complex.

Interviewee:

I asked interviewer to mock interview me.

I need more practice.

===

Soft skill

Interviewer

Try to understand a simple scenario before listing the 1-7 outline

Try to discuss what a newsfeed is and which steps can leverage machine learning

Newsfeed, search: sometimes there are 2 steps.

How to gather relevant content - may not need ML

Ranking - needs ML

Hundreds of millions of users

Billions of users

So ML probably cannot be used on step 1.

We may just start with friends, and friends’ friends.
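A minimal sketch of this non-ML gathering step: walk the friend graph up to two hops and collect posts by those authors. The adjacency-dict and post-dict formats, and the function name, are assumptions for illustration.

```python
def gather_candidates(user, friends, posts_by_author, max_hops: int = 2):
    """Collect posts from friends (1 hop) and friends of friends (2 hops)."""
    seen = {user}
    frontier = {user}
    authors = set()
    for _ in range(max_hops):
        next_frontier = set()
        for person in frontier:
            for friend in friends.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    next_frontier.add(friend)
        authors |= next_frontier
        frontier = next_frontier
    # Sort authors only to make the output order deterministic.
    return [post for author in sorted(authors) for post in posts_by_author.get(author, [])]


friends = {"u": ["a"], "a": ["u", "b"], "b": ["a", "c"]}
posts_by_author = {"a": ["p1"], "b": ["p2"], "c": ["p3"]}
candidates = gather_candidates("u", friends, posts_by_author)
```

The post from "c" is three hops away and is left out, which shows how the graph alone narrows the candidate set before any ranking model runs.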

Hot topic: lots of text. Can pre-cluster text.

We can narrow down the candidates very quickly.

You don’t need to embed user, content or cross.

Should discuss first which steps require ML.

Some interviewers want to test you on reading papers

Some interviewers want to test you on practical knowledge.

Audience:

Different specialties have very different knowledge.

The interviewee may or may not have worked on relevance.

If the interviewer’s background is NLP (but not relevance), then today’s performance is good.

But if the interviewee has a specialty in relevance, then today the interviewee has not picked the most relevant content.

Today is about ML system design. Usually it’s not very deep.

Audience

Don’t pile too many models.

Audience

Cold-start problem

Social or shopping.

Cold start is not just at the level of algorithm

Online serving - new item, new news

How do we do retrieval? How do we combine cold start versus mature.

Cold start is a very big problem.

Interviewer

There is no perfect solution

Look for something that makes sense

“Let me think about it for a minute”

Today: some problems were proposed. Users with no history: recommend based on popularity

Items: multi-armed bandit experiment.

Solution 1: wait till you have enough data

Solution 2: label the data manually. Topic centric.

How to scale up. Depends on the background of interviewer.

Audience:

New item. Related to levels of friends. Can we use this?

Interviewer: yes

Search and ranking

Each product has different solution

There is no template

I probably will draw some block diagram:

Gather data, generate feature, generate candidate, ranking. Each step we may discuss if we need ML, what model we need.

Some models are too sparse.

Research scientist - don’t worry about online/offline

No standard solution: numbered points or a block diagram both work, as long as there is a good outline.

Audience:

ML system - usually solving part of a larger problem

45 minutes - how do we connect this system with larger goals (e.g. active users, click through rate)

If I am an interviewer:

How to develop model offline

Then how it goes online

In the end 5 more minutes.

Then what are the useful online metrics: this requires business sense.

Precision/recall may be a tradeoff. If we point this out we may get bonus points.
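The tradeoff can be shown with a small sketch that computes precision and recall at two score thresholds. The scores, labels, and the convention of reporting precision as 1.0 when nothing is predicted positive are illustrative assumptions.

```python
def precision_recall(scores, labels, threshold: float):
    """Precision and recall when items scoring >= threshold are shown."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 1.0  # assumed convention
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]
strict = precision_recall(scores, labels, threshold=0.85)
loose = precision_recall(scores, labels, threshold=0.3)
```

On this toy data the strict threshold gives higher precision and lower recall, and the loose threshold gives the reverse.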

Interviewer expectation:

Which models are commonly used in the industry?

What happens if big issues arise?

social recommendation

Data pipeline, model training, online, logging, AB testing

Then I asked about focus

Do I need to dive deeper? Or can I move on? Drive toward the aspect you are strong at.

Communicate more.

General ML design, what are the big categories? What reading material is easy/fast.

YouTube search

Recommendation system (ranking, search)

Anomaly analysis. Read InfoQ: Stripe, eBay

A list of 10 topics. Draw a diagram

Feature -> model -> result -> evaluation with benchmark

Most important is to set up a system.

I have not narrowed down the questions.

Diversity. Was it reasonable?

Relevancy - does not make sense

Hot topic - hard to keep diversity. Related to cold start

Always seeing the same content may bore the user

Cold start

Should be able to pass L4

L5: needs to be familiar with the work

Not as experienced in this part.

Overly complex: shows inexperience