notification system


Topic: notification system

Interviewer: 左茜

Interviewee: haha王国

Level: L5 (Senior)


Mock System Design Interview Summary

Interview Overview

Date: 5/8/2022

Target level: L5

Duration: 45 minutes

Topic covered: Notify

Drawing tool used: Whimsical

Starting time: 6:20

End: 7:05

Requirements

Functional requirements

Push notification: SMS message & email

Priority:

Supported devices: mobile, PC

Non-functional requirements

10M users per day

200 bytes per notification

Delay: soft requirement

Scalability, durability (must send)

System Design

External APIs

System design

How do you optimize design for 10M users?

Add some cache to frontend service

Database and processing service

We may need a queue system

Q: Notification service: alert the user of some events. E.g. after you buy from Amazon, the system triggers a notification.

A: move message queue

How to prevent data loss

Add partition, and log to database

Walk through data loss prevention

Add disks to design

How to handle large scale data?

Add Rate limiter
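The notes do not say which rate-limiting algorithm was discussed. As one possible illustration, a token bucket is sketched below; the class name `TokenBucket`, its parameters, and the injectable `clock` are all assumptions made for this example, not part of the interview design.

```python
import time


class TokenBucket:
    """Minimal token-bucket limiter (illustrative, single-process only)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, False if it should be throttled."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A limiter like this would typically sit in front of the frontend service, keyed per caller, so that one noisy producer cannot flood the queue. A multi-node deployment would need shared state, which this sketch does not cover.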

Q: 10M notifications per day

Total storage over 10 years: 10M/day * 365 * 10 * 200 B = 7,300 GB = 7.3 TB

QPS: 10M / 24 / 3600 ≈ 115.7 QPS average

Max: 20 * average ≈ 2,300 QPS
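The back-of-envelope estimates can be rechecked with a few lines of Python. This is only a sketch using the figures stated in the notes (10M notifications per day, 200 bytes each, 10 years of retention, a 20x peak factor); the constant and function names are illustrative.

```python
# Inputs taken from the notes; names are illustrative.
NOTIFICATIONS_PER_DAY = 10_000_000
BYTES_PER_NOTIFICATION = 200
RETENTION_YEARS = 10
PEAK_FACTOR = 20  # peak-to-average ratio used in the notes

SECONDS_PER_DAY = 24 * 3600


def storage_bytes() -> int:
    """Total bytes stored over the retention period."""
    return NOTIFICATIONS_PER_DAY * 365 * RETENTION_YEARS * BYTES_PER_NOTIFICATION


def average_qps() -> float:
    """Average notifications per second, assuming an even spread over the day."""
    return NOTIFICATIONS_PER_DAY / SECONDS_PER_DAY


def peak_qps() -> float:
    """Peak rate, assuming the peak is PEAK_FACTOR times the average."""
    return average_qps() * PEAK_FACTOR


if __name__ == "__main__":
    print(f"storage: {storage_bytes() / 1e12:.1f} TB")   # storage: 7.3 TB
    print(f"average: {average_qps():.1f} QPS")           # average: 115.7 QPS
    print(f"peak:    {peak_qps():.0f} QPS")              # peak:    2315 QPS
```

Ten million requests spread over 86,400 seconds come to roughly 116 per second, so even the 20x peak stays in the low thousands.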

Q: Check preference of the user?

A: connect user and publishing service

Q: How to confirm the user has read the message?

A: add ACK message

For SMS: when somebody opens SMS message, there will be acknowledgment through the carrier

Q: How do we track illegal emails?

A: email ID should be logged in database

Add authentication service

Why do you want to use 3rd party authorization?

For example, when you watch a video online that was produced by CNN, CNN may be able to validate it

Q: Further improvements?

A: same user on different devices

Consistency: can improve

Interviewer and Audience Feedback

Interviewer:

Overall the interviewee did well

It seems the interviewee was off track, making it difficult to ask questions

Soft skill is good

Design is sketchy. For example, the QPS estimate did not lead to any further improvement of the design

Not enough coverage for stability, reliability

Should clarify requirement

I hoped for a design of the high-level components

Notification system should be center of the picture

Notification should be connected to different device endpoints (email, mobile, etc)

I was hoping the interviewee would focus on one branch, such as the web: how to scale, how to deal with errors, how to retry.

Past midpoint of the interview, interviewee came back on track

When talking about the message queue, RAM, and Redis, the discussion was not deep enough.

Interviewee:

Felt just ok

Requirement gathering was not to the point

I was back on track when the interviewer told me about the functional requirements

Deep dive:

I was not at the same level as the interviewer

I wasn’t able to follow

Low QPS: did not require big scale

Publishing service may become a bottleneck

I could have compared Redis vs Cassandra

==

Notification system has multiple meanings

Publish/subscribe

Transaction notification

If the requirement is not clear, should we declare an assumption first and then correct it, or ask the interviewer?

==

Interviewee is too passive, asked the interviewer to drive

==

Whimsical drawing tool

==

Hard skill

Why do we add the message queue

Why not add message queue between all services?

Between process service -> database

Log-based, supports a high volume of writes

Redis can support 1M writes

Can add queue between frontend and processing service

How can you assume there are no duplicates?

You may need to de-dupe

Idempotency

When frontend service sends message to processing service

Processing service may fail after

Frontend service may need to keep on retrying

Does frontend service need to hold the HTTP connection from the client?

Processing service needs to write to the database, and check the database before sending
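The de-dupe and idempotency points above can be sketched as follows. This is a minimal illustration, assuming each notification carries a unique `notification_id` chosen by the caller; the `ProcessingService` class, its `handle` method, and the in-memory dictionary standing in for the database are all hypothetical.

```python
from typing import Callable, Dict


class ProcessingService:
    """Sends each notification at most once per notification_id (illustrative)."""

    def __init__(self, send: Callable[[str, str], None]):
        self._send = send
        # Stands in for a database table: notification_id -> status.
        self._status: Dict[str, str] = {}

    def handle(self, notification_id: str, recipient: str, body: str) -> str:
        # Check the stored status before sending, so retries are harmless.
        if self._status.get(notification_id) == "sent":
            return "duplicate"
        # Record the attempt first so it survives a crash and can be retried.
        self._status[notification_id] = "pending"
        self._send(recipient, body)
        self._status[notification_id] = "sent"
        return "sent"


if __name__ == "__main__":
    outbox = []
    service = ProcessingService(lambda to, text: outbox.append((to, text)))
    print(service.handle("n1", "user@example.com", "Your order shipped"))  # sent
    print(service.handle("n1", "user@example.com", "Your order shipped"))  # duplicate
```

With this shape the frontend service can keep retrying the same `notification_id` safely. A crash between the send and the final status write can still cause one duplicate on retry, so this gives at-least-once delivery with de-duplication in the common case rather than strict exactly-once.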

Retry is important for some high priority messages

Exponential backoff
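Retrying with exponentially growing delays could look roughly like the sketch below. The function name `send_with_retry`, the default limits, and the use of random jitter are assumptions for illustration; the notes only state that retry matters for high-priority messages.

```python
import random
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds; return the number of attempts used.

    Re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return attempt
        except Exception:  # a real service would catch only retryable errors
            if attempt == max_attempts:
                raise
            # The delay ceiling doubles each attempt: base, 2*base, 4*base, ...
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter spreads out retries from many senders.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

Retries like this are only safe when the receiving side is idempotent; otherwise each retry risks delivering the same notification twice.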

How do you know the user has turned off the phone?

APNs (Apple Push Notification service) can handle it