Newsfeed Real Time Comments
Topic: Newsfeed Real Time Comments
Interviewer: Shawn
Interviewee: Anna
Level: L5 (Senior)
Mock System Design Interview Summary
Interview Overview
Date: 1/30/2022
Target level: L5
Duration: 45 minutes
Topic covered: Realtime news feed comments
Drawing tool used: Whimsical
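Thursday 2/3/2022, Career Advancement Club event:
“Many friends have told us they would like the Career Advancement Club to help them improve their résumés. The club has decided to hold a talk on résumé preparation on the evening of Thursday, February 3. Please use the link below to sign up and upload your résumé. To protect your privacy, you may hide personal information such as your name, phone number, and address on the uploaded résumé.”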
https://forms.gle/jYZykQCQksjEj2P97
Requirements
Functional requirements
Design live comment, real time
User can see comments/posts in real time
Real time: latency as low as possible, e.g. 100 ms
Any type of post, can be text or video, but let’s focus on comments
Can only friends comment on your posts?
Focus on the post you can see. When people
Focus on real time delivery
Deprioritize permission of comments
Scale estimate:
How many posts do we have?
100M posts
10s of millions of posts every second
Any posts can receive comments
100k/s comments can be delivered to potential readers
Every user can post x comments per second, but it’s out of scope
1M daily active users
Latency is the top priority
Don’t worry about calculation of scale too much
Non functional requirements
System Design
External APIs
Push model
Pull model
Push - can keep sending the messages
Pull model - will cause delays in message delivery
Experienced with Kafka as message queue
Q from interviewer: Why do you need to persist comments?
A from interviewee: Some scenarios are not real time. E.g. if you click on an old post, you should still see its comments
Each comment will be produced into one topic
What counts as a topic? Each post can count as one topic.
This may require 10M message queues
Audience chat: reference LinkedIn design https://www.infoq.com/presentations/linkedin-play-akka-distributed-systems/?useSponsorshipSuggestions=true
Consumer: subscribes only to the topics of their own posts, not every topic
Q from interviewer: why only the consumer’s own topic?
A user may be looking at one comment
Scroll down
The comments may grow
Subscribe to the topics of the posts in the user's current newsfeed window.
Size 5-10 posts / topics
Not every topic
Once a post has scrolled out of view, the consumer can unsubscribe from that topic
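The scrolling-window idea above can be sketched as a set difference between the old and new visible posts. This is only an illustration of the notes, not something drawn in the interview; the class name `WindowSubscriptions` and the post IDs are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


class WindowSubscriptions:
    """Tracks which post topics a client is subscribed to as it scrolls."""

    def __init__(self) -> None:
        self.subscribed: set[str] = set()

    def update_window(self, visible_post_ids: Iterable[str]) -> tuple[set[str], set[str]]:
        """Return (to_subscribe, to_unsubscribe) for the new visible window."""
        visible = set(visible_post_ids)
        to_subscribe = visible - self.subscribed    # posts that just scrolled into view
        to_unsubscribe = self.subscribed - visible  # posts that scrolled out of view
        self.subscribed = visible
        return to_subscribe, to_unsubscribe
```

With a window of 5-10 posts, each scroll step changes only a few subscriptions, so the client sends a small delta instead of resubscribing to everything.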
Q: What are the tradeoffs when a consumer subscribes to many topics vs fewer topics?
Q: The message queue cannot push messages directly to the mobile client
A: Add a frontend service for the comment receiver. It consumes from the message queue and translates each message into the right format
1M users: how do we fan out, for celebrity posts?
Let’s say there are millions of users looking at the same post.
How do we fan out
How do the frontend service and the comment receiver communicate?
A: A CDN service can help deliver the content to the user
Q: A CDN is not a push model
How do we let frontend service push the message to the user
Websocket can be a solution
Q: There are many frontend services and many comment receivers. How do we know which frontend service serves which comment receiver?
A: sharding, partition
Each comment receiver sends the frontend service the list of postIDs
Q: How does the frontend service know which post ID to send to which receiver?
A: It’s dynamic. The user keeps scrolling, and the receiver may have a scrolling window. It lets the frontend service know “here are the 10 comments I am going to read”
Q: The list of 5-10 postIDs keeps changing. How does the frontend know the up-to-date list?
A: you can add a cache to the frontend service.
Cache / fast storage
1M posts, 1M users
How does the frontend know which post to send to which comment receiver?
A: The comment receiver connects to the frontend service, and that request carries the list of posts it is interested in (5-10 posts). If there is no update, the user is still looking at these posts. If there is an update, the user has moved to a different post.
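A minimal sketch of that answer, assuming a single frontend process that keeps the mapping in memory: each connection reports its current list of posts, and a new comment is pushed only to connections interested in that post. `FrontendRegistry`, `set_interest`, `on_comment`, and the `send` callback (standing in for a WebSocket write) are hypothetical names, not part of the interview design.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable, Iterable


class FrontendRegistry:
    """In-memory map between connections and the posts they are viewing."""

    def __init__(self, send: Callable[[str, str, str], None]) -> None:
        self._send = send  # send(conn_id, post_id, comment); assumed transport hook
        self._posts_by_conn: dict[str, set[str]] = {}
        self._conns_by_post: defaultdict[str, set[str]] = defaultdict(set)

    def set_interest(self, conn_id: str, post_ids: Iterable[str]) -> None:
        """Replace the list of posts a connection is interested in."""
        new = set(post_ids)
        old = self._posts_by_conn.get(conn_id, set())
        for post_id in old - new:
            self._conns_by_post[post_id].discard(conn_id)
            if not self._conns_by_post[post_id]:
                del self._conns_by_post[post_id]  # nobody here reads this post now
        for post_id in new - old:
            self._conns_by_post[post_id].add(conn_id)
        self._posts_by_conn[conn_id] = new

    def disconnect(self, conn_id: str) -> None:
        """Drop all state for a closed connection."""
        self.set_interest(conn_id, [])
        self._posts_by_conn.pop(conn_id, None)

    def on_comment(self, post_id: str, comment: str) -> int:
        """Push a new comment to every interested connection; return the count."""
        conns = self._conns_by_post.get(post_id, ())
        for conn_id in sorted(conns):
            self._send(conn_id, post_id, comment)
        return len(conns)
```

Because the state is only in memory, a crashed frontend loses it; in this sketch the client would simply reconnect and resend its list of posts.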
Q:how does
A: Each comment sender sends to the frontend service. How do you communicate?
Websocket, or HTTP requests. For faster communication, can use websocket
Interviewer and Audience Feedback
Audience Scores
Soft skills
Hard Skills
Interviewer Feedback
Newsfeed, live feed: requirement gathering is difficult
Designing a low latency system is difficult
Should meet L4 bar
L5 has higher bar
Push vs poll
How to push which comment to which users
Push which posts to which comment receiver. Need more clarification
Use of Kafka/pubsub model for push. It may work, but there are a lot of topics. Every subscriber may subscribe to all posts. In the frontend service, each user may subscribe to many topics.
Kafka is acceptable.
Requirement gathering. It is satisfactory.
Asked the interviewee not to do calculation
Need more practice
Followed the interviewer’s suggestion
Hard Skill
Interviewee:
Based on research, not very workable solution
Facebook: Push vs pull
Push is fast enough
Write locally, read globally -> don’t quite understand
Write to local region. Read is global
Facebook post
Audience: world wide readable. Write to local
Facebook did not explain clearly
Write speed is high
Every comment is read, therefore
Read heavy: should “read”
Write - write relationship, not write comment
As the user continues to scroll, the post-to-user mapping is continuously updated. Each frontend service knows which posts its users are currently reading.
Currently there is a relationship between user and posts
Cache stores this relationship
Write intensive
Every friend sends a comment (like). There are a lot of writes
When many people like a post, there are a lot of broadcasts
Websocket - bidirectional client-server communication
Payload can be pushed from the server to the client (mobile app, client app)
Server-sent events (SSE, over HTTP): initiated by the server and sent to the client
Client gets the message then renders on mobile client
YouTube and LinkedIn built real-time systems for comments and likes
They have an in-memory local cache to manage the client-to-frontend relationship
Fanout is large. 1M clients. Many frontend servers.
Each frontend server hosts 1000 clients. A large set of machines is needed to handle connections
There is a dispatcher to manage the mapping between frontend nodes and posts
“Streaming a million likes/second: Real-time interactions on Live video” LinkedIn
https://www.youtube.com/watch?v=yqc3PPmHvrA
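The dispatcher idea can be sketched as a two-level fan-out: the dispatcher only tracks which frontend nodes have at least one viewer of a post, and each node then fans out to its own clients. This is an illustrative sketch under the figures in the notes (1M clients, 1000 clients per frontend server), not LinkedIn's implementation; `Dispatcher`, `frontends_needed`, and the node IDs are hypothetical.

```python
from __future__ import annotations

import math
from collections import defaultdict

CLIENTS_PER_FRONTEND = 1000  # figure taken from the notes


def frontends_needed(clients: int, per_node: int = CLIENTS_PER_FRONTEND) -> int:
    """Number of frontend servers needed to hold the given client connections."""
    return math.ceil(clients / per_node)


class Dispatcher:
    """Maps each post to the frontend nodes that currently have viewers of it."""

    def __init__(self) -> None:
        # post_id -> {node_id: number of viewers on that node}
        self._viewers: defaultdict[str, dict[str, int]] = defaultdict(dict)

    def subscribe(self, node_id: str, post_id: str) -> None:
        counts = self._viewers[post_id]
        counts[node_id] = counts.get(node_id, 0) + 1

    def unsubscribe(self, node_id: str, post_id: str) -> None:
        counts = self._viewers.get(post_id, {})
        if node_id not in counts:
            return
        counts[node_id] -= 1
        if counts[node_id] == 0:
            del counts[node_id]  # last viewer on this node left
        if not counts:
            self._viewers.pop(post_id, None)

    def nodes_for(self, post_id: str) -> list[str]:
        """Frontend nodes that should receive a new comment on this post."""
        return sorted(self._viewers.get(post_id, {}))
```

With these numbers, `frontends_needed(1_000_000)` is 1000, so a celebrity post is sent to at most about a thousand frontend nodes rather than to a million clients directly.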
Do they send the updates to frontend nodes at different data centers?
If there are multiple masters, it’s hard to
Write locally: write to a local data center
An East user comments on a post
A West user comments on a post
Write locally
Somewhere they need to merge the comments
There is probably an aggregator to merge the comments
Who manages the sequence?
Writing the relation: receiver is writing locally
Push center is reading globally (who is reading this post)
Write heavy - is writing the relation
Comment writing is a different issue.
The LinkedIn YouTube video addresses the writing of the relations
The sequence may not be the same
Comment sequence is a different question. I can first see the answer and not the question.
Facebook: you can reply to previous comment. Then there is “previous ID”
If there are no causal relations, just use timestamp
Distributed system, data center write, synchronization problem
In a distributed system, strict ordering can be guaranteed, but the solution is expensive
Facebook probably makes a tradeoff: choose low latency and sacrifice consistency
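The ordering rule discussed above (follow the “previous ID” when a comment is a reply, otherwise use the timestamp) can be sketched as follows. This is an assumption-laden illustration, not how Facebook does it: the `Comment` fields and `display_order` are hypothetical, comment IDs are assumed unique, and timestamps from different data centers may be skewed.

```python
from __future__ import annotations

import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    comment_id: str
    timestamp: float
    previous_id: str | None = None  # ID of the comment this one replies to, if any


def display_order(comments: list[Comment]) -> list[Comment]:
    """Order by timestamp, but never show a reply before the comment it replies to."""
    known = {c.comment_id for c in comments}
    children: dict[str, list[Comment]] = {}
    ready: list[tuple[float, str, Comment]] = []
    for c in comments:
        if c.previous_id in known:
            # Hold the reply back until its parent has been emitted.
            children.setdefault(c.previous_id, []).append(c)
        else:
            # No causal parent (or the parent is not in this batch): timestamp only.
            heapq.heappush(ready, (c.timestamp, c.comment_id, c))
    ordered: list[Comment] = []
    while ready:
        _, _, c = heapq.heappop(ready)
        ordered.append(c)
        for child in children.pop(c.comment_id, []):
            heapq.heappush(ready, (child.timestamp, child.comment_id, child))
    return ordered
```

This only fixes the “answer before the question” case on the reader's side; it does not give a single global order across data centers.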
Local: data center local. Not client
Frontend service, local cache. Local to the user (same area)
Global: same post’s replicated in different
A post is created
Users anywhere can see the post; it is readable globally
There is a new comment. The server needs to find the frontend servers that maintain the user->post mapping
Every time a new comment -> collect mapping from different data centers
Cassandra -> quorum to decide
Tolerate the error
Write transaction is fast
Read globally - not every data center gets the latest copy
Sync between global is expensive
Comments’ server will pull from different data centers’ aggregators
Global sync is expensive
A new post -> Will map the post to related users -> then push the post to the users’ database
They don’t need to sync the post globally
If there is a writer to write a comment
Is there a global service that knows the comment has been written?
Dispatch the comment to related readers?
Writer data center -> reader data center point to point connection
Writer -> reader is 1-to-N. There is not a global service in charge of all communications
Frontend server: after crashing the user can reconnect to a different frontend server
If there is no local data center disk copy
How does a new server know which posts a user has finished reading?
Interviewer: I don’t think this needs to be sticky
Data center 1: frontend server crashes
User can connect to data center 2
Which user is reading which post
Other people’s comments need to be fanned out to me
The system needs to maintain user -> post mapping
Poll vs push: need to use push to justify low latency
Reduce network cost.
If WebSocket is not available, use server-sent events.
Long poll? The client connects over HTTP. The server holds the connection open with keep-alive. When the server has an answer, it sends the response back, and the client reconnects to the server right away. Similar to WebSocket.
The more connections each server can hold, the better.
Long poll is less efficient than websocket
Long poll is different from push vs pull. Long poll is only a connection mechanism.
Websocket is much more efficient
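The long-poll behavior described above can be sketched with a blocking queue: the server-side handler holds the request until a message arrives or a timeout expires, and the client calls again immediately after each response. This is a minimal in-process sketch with no real HTTP layer; `LongPollMailbox`, `publish`, and `poll` are hypothetical names.

```python
from __future__ import annotations

import queue


class LongPollMailbox:
    """Per-client mailbox that a long-poll request handler would wait on."""

    def __init__(self) -> None:
        self._messages: queue.Queue[str] = queue.Queue()

    def publish(self, message: str) -> None:
        """Called when a new comment for this client arrives."""
        self._messages.put(message)

    def poll(self, timeout_s: float) -> list[str]:
        """Block until at least one message is available or the timeout expires.

        Returns every pending message, or an empty list on timeout, after
        which the client is expected to poll again right away.
        """
        try:
            batch = [self._messages.get(timeout=timeout_s)]
        except queue.Empty:
            return []
        while True:
            try:
                batch.append(self._messages.get_nowait())
            except queue.Empty:
                return batch
```

Each delivery costs a full request/response round trip plus a reconnect, which is one reason the notes rate long poll as less efficient than WebSocket.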
Traffic is too big for continuous pull
Push vs pull may still depend on the situation
E.g. a highly popular topic. Need to push to many clients. Say you need to push to 100M people
You can use multiple queues to push the message
Hot topic: pulling is better
Push: if only between colleagues and friends, no
Pull: depends on how faster
Push: fire and forget
Kafka:
consumer group
User can be partitioned. Different queues
Each server can go to Kafka to get only the messages it needs to subscribe to
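One way to read “user can be partitioned, different queues” is a stable hash from a key (a user ID or post ID) to a fixed number of queues, which would also avoid creating one topic per post. The sketch below is an assumption for illustration only: `partition_for` is a hypothetical helper, and SHA-256 is used just to get a stable hash. It is not Kafka's own partitioner.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to one of num_partitions queues, stable across processes."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a partition.
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Every server computes the same partition for the same key, so a frontend server only needs to consume the partitions that cover the users or posts it serves.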
Kafka:
Too much fan-out may crash Kafka
Newsfeed - only for online users
No need to support offline users
Justin Bieber: need to fan out to millions or tens of millions
We can add more machines to handle the fanout
Offline user: can pull
Online user: live chat, requires push
Realtime: we don’t need to use queue
Just keep a map in memory