Design Calendar(used)
Topic: Design Calendar(used)
Interviewer: Becky
Interviewee: Cheng
Level: L5 (Senior)
Mock System Design Interview Summary
Interview Overview
Date: 11/21/2021
Target level: L4
Duration: 45 minutes
Topic covered: Calendar
Drawing tool used: Whimsical
Requirements
6:15pm
Functional requirements
View schedule
Create events, add guests to meetings
Respond to events, accept/reject/tentative
Bonus:
Auto-decline based on current schedule
Book recurring events
How the user can see their schedule for the week
Event created by this user (event that belongs to the calendar owned)
Event as attendees
How users create event
Non functional requirements
10k users version 1
1 million users in 1 year
System Design
Database choice: relational database, due to ease of use. Not due to scale.
External APIs:
System design diagram
Database schema
Added requirements:
How the user can see their schedule for the week
Event created by this user (event that belongs to the calendar owned)
Event as attendees
Given the userID, and a time range
Get all calendars for that user
For the calendar, get all event IDs
Search on event table with: select eventinfo from event where id in (...) and start_time > xxx and end_time < xxx
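The three lookup steps above can be sketched with an in-memory SQLite database. This is only an illustration: the table names user_calendar and calendar_event, the column names, and the sample rows are assumptions, since the notes do not record the actual schema. The time filter mirrors the query above, so it only returns events that lie entirely inside the range.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_calendar (user_id TEXT, calendar_id TEXT);
        CREATE TABLE calendar_event (calendar_id TEXT, event_id INTEGER);
        CREATE TABLE event (
            id INTEGER PRIMARY KEY, title TEXT,
            start_time TEXT, end_time TEXT
        );
        INSERT INTO user_calendar VALUES ('u1', 'c1'), ('u2', 'c2');
        INSERT INTO calendar_event VALUES ('c1', 1), ('c1', 2), ('c2', 3);
        INSERT INTO event VALUES
            (1, 'Standup', '2021-11-22T09:00', '2021-11-22T09:15'),
            (2, 'Offsite', '2021-12-01T09:00', '2021-12-01T17:00'),
            (3, 'Other user', '2021-11-23T10:00', '2021-11-23T11:00');
    """)
    return conn


def events_in_range(conn, user_id, range_start, range_end):
    """Return the user's events that fall entirely inside the time range."""
    # Step 1: get all calendars for that user.
    calendar_ids = [row[0] for row in conn.execute(
        "SELECT calendar_id FROM user_calendar WHERE user_id = ?", (user_id,))]
    if not calendar_ids:
        return []
    # Step 2: for those calendars, get all event IDs.
    marks = ",".join("?" * len(calendar_ids))
    event_ids = [row[0] for row in conn.execute(
        f"SELECT event_id FROM calendar_event WHERE calendar_id IN ({marks})",
        calendar_ids)]
    if not event_ids:
        return []
    # Step 3: search the event table by ID and time window.
    marks = ",".join("?" * len(event_ids))
    return conn.execute(
        f"SELECT id, title, start_time, end_time FROM event "
        f"WHERE id IN ({marks}) AND start_time > ? AND end_time < ? "
        f"ORDER BY start_time",
        (*event_ids, range_start, range_end)).fetchall()
```

If events that only partly overlap the week should also show up, the filter would instead be start_time < range_end and end_time > range_start.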
Q: When the user hovers over an event, how do they see the details?
A: Issue another query
Q: How to create an event
A: POST createEvent with UserID (from auth), Calendar_Id, random_key_for_duplicate_request
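One way the random key could be used to drop duplicate requests is sketched below. It is an in-memory illustration only: the class name EventService, the title parameter, and the choice to scope the key per user are assumptions, not something stated in the interview.

```python
import uuid


class EventService:
    """In-memory sketch of an idempotent createEvent handler (hypothetical)."""

    def __init__(self):
        self.events = {}      # event_id -> event record
        self.seen_keys = {}   # (user_id, request_key) -> event_id

    def create_event(self, user_id, calendar_id, request_key, title):
        # A retried request carries the same key, so return the event
        # that was already created instead of inserting a duplicate.
        dedup_key = (user_id, request_key)
        if dedup_key in self.seen_keys:
            return self.seen_keys[dedup_key]

        event_id = str(uuid.uuid4())
        self.events[event_id] = {
            "owner": user_id,
            "calendar_id": calendar_id,
            "title": title,
        }
        self.seen_keys[dedup_key] = event_id
        return event_id
```

In a real service the key would need to live in durable storage (for example a unique column in the event table) so that a retry hitting a different server is still recognized.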
Q: How would guests get notified?
A: Add a notification service
Q: How does the attendee get the event and respond to it?
A: The attendee gets a link to our app that contains the event_id they need to respond to
Q: how to scale the system to:
1M users, 10M calendars
Read traffic 100k QPS
Write traffic 10k QPS
A: break down by operations
Create event
Respond to event
A: Recent 3 months have more reads
Store events of recent 3 months in cache
Q: What is the replication strategy when a user creates an event?
A: not familiar with Cache
Q: e.g. a user creates or deletes an event
A: The app can write to the database first, then write to the cache
Q: With 2000 events for each user, that is 2B events for 1M users
A: database is for recent events, and separate storage for historical events
Additional design
Candidate:
Requirement: a user can set an event to public, and anyone who can see that user will see the user's public events.
Interviewer:
out of scope
Interviewer and Audience Feedback after the Interview
Interviewer:
Basic functionality
Ran out of time for scalability and fault tolerance
Read/write pattern, high QPS vs low QPS
Performance, consistency
For this interview:
Book meeting, write heavy: NoSQL
Need some time: for scale and fault tolerance
Initially 10k users, stateless, infinite
DB sharding, horizontal scale
Fault tolerance, high availability
Cache - is good
Key is replication strategy, write-through or cache aside
Show knowledge
Audience:
SQL: lots of relations
Without SQL, difficult to query
Very strong reason
Audience:
Ever growing
Sharding, join across shards, difficult to design
Audience:
Hot and cold database
Hot database, smaller data size
Audience:
What database for cold database
You can use SQL database
Audience:
Then there is no difference between cold and hot database
You can separate into a different table, with a different index
Hot and cold, what’s the difference?
Audience:
Hot database: use more replicas
Cold database: use fewer replicas
Use the same index and table schema
Audience:
Notification: can be locally computed on client device
2nd level buffer: can be in the cloud
Normally cache aside is fine
First read from cache
Cache hit: return
Cache miss: go to database
When we write to database, we can delete from cache.
Source of truth is database
The database write is more likely to fail than the cache operation
Therefore, update the database first, then delete from the cache
We should add a message queue in front of the database for writes
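The cache-aside flow described above can be sketched as follows. Plain dictionaries stand in for the database and the cache, and the class name and db_reads counter are made up for illustration.

```python
class CacheAsideStore:
    """Minimal cache-aside sketch: dicts stand in for the DB and the cache."""

    def __init__(self):
        self.db = {}        # source of truth
        self.cache = {}     # read-through copy, may be missing entries
        self.db_reads = 0   # counts how often a read fell through to the DB

    def read(self, key):
        # First read from the cache; on a hit, return immediately.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: go to the database, then populate the cache.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Update the database first (the step more likely to fail),
        # then delete the cached entry so the next read reloads it.
        self.db[key] = value
        self.cache.pop(key, None)
```

Deleting rather than updating the cache entry on write keeps the database as the only place where new values are produced.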
Audience:
Do you care more about database schema?
Or care more about scalability?
Interviewer:
Basic, high level sense for scalability
Schema: may change based on requirement. We can accept a small % of error in database schema.
Schema serves performance and sharding
Audience:
I feel data model is key
Queue, sharding, NoSQL: normal process, does not differentiate
Interviewer:
Read performance is not good when a query has to join a lot of tables.
Should talk about the tradeoff of schema
Audience:
SQL database vs NoSQL database is important
Sharding: schema should conform to query needs so the same query goes to same shard
Audience:
Event: puts into user calendar event
Event can be NoSQL
Audience:
User should have its own table
User calendar: shard
Audience:
Based on query
For example, where userID = …
Sharding key is userID
Another team:
Where calendarID=…, or eventID=…
Sharding key can be calendarID or eventID
They may duplicate the data
But use different sharding keys
Audience:
Different product line
Can use same schema but use different sharding keys
Can I just maintain 2 sharding keys with 1 table?
Audience:
Sharding is difficult
If a single shard, can just add an index
Use 2 tables => primary reason is for sharding and scaling
Trade space for time
Audience:
We need to use user ID
Most people want to see their own calendar
Shard by userID, use SQL DB
Attendee: cascade write into other users’ tables
Event ID can be used as a sharding key
User ID can use a different sharding key
If 2 different types of queries on the same table, the table can be written twice
If 2 different types of queries on different tables, then sharding key can be different
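A minimal sketch of the "write twice with different sharding keys" idea: the event record is stored once under a shard chosen by event ID, and the event ID is cascaded into the per-user shard of the owner and every attendee. The shard count, the hash choice, and all names here are assumptions for illustration.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for the sketch


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a sharding key to a shard index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedCalendar:
    def __init__(self, num_shards=NUM_SHARDS):
        self.num_shards = num_shards
        # Sharded by userID: user_id -> list of event IDs.
        self.user_shards = [dict() for _ in range(num_shards)]
        # Sharded by eventID: event_id -> event details.
        self.event_shards = [dict() for _ in range(num_shards)]

    def create_event(self, event_id, owner_id, attendee_ids, details):
        # Write 1: the event itself, located by event ID.
        shard = self.event_shards[shard_for(event_id, self.num_shards)]
        shard[event_id] = details
        # Write 2: cascade the event ID into each participant's user shard.
        for user_id in {owner_id, *attendee_ids}:
            shard = self.user_shards[shard_for(user_id, self.num_shards)]
            shard.setdefault(user_id, []).append(event_id)

    def events_for_user(self, user_id):
        # "where userID = ..." touches exactly one shard.
        shard = self.user_shards[shard_for(user_id, self.num_shards)]
        return list(shard.get(user_id, []))

    def get_event(self, event_id):
        # "where eventID = ..." also touches exactly one shard.
        shard = self.event_shards[shard_for(event_id, self.num_shards)]
        return shard.get(event_id)
```

The extra writes are the space-for-time trade mentioned above: each query type stays on one shard at the cost of duplicated data.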
Audience:
10k versus 1M users
Audience:
Why is there a user-event join table?
User and calendar have a many-to-many relationship
Shared calendar
Interviewee:
Seeing other people’s calendar
Audience:
Maybe the interviewer did not have requirement, but interviewee expanded too much
Interviewer:
If there is a good answer, then you may not want to dig a hole
But if there is no good answer, then may not
Audience:
3 requirements -> expects 3 answers
If meets additional requirements, then
Audience:
Higher level sense is required for higher level people
L4 & L5 need more guidance
The interviewer can control the flow by saying some points are out of scope
Audience:
Should we start from simple cases first?
Then we can discuss a bit more scalable
Audience:
It’s good if the interviewee can control the flow
Interviewer may provide a big scope, but interviewee can break down into simple scope first
Audience:
If I say too little, then I don’t show enough knowledge
If I say too much, then I may start too many threads without enough time to cover it
Audience:
Interviewer had good pacing
Started from simple requirements
Then add more clarification in the middle
Audience:
Project life cycle
Monitoring, and logging are proper extensions of the interview
Analytics
Core functionality may take too much time
Audience:
After implementing core feature
Extend to CloudWatch
Audience:
10 minute reminder
There should be a timer to find out which event is about to start
Audience:
RocketMQ (RMQ) can take a delayed message
We can send the event to the client side. The client can send the notification itself
If an event changes, then save and cascade to other users, then inform the client. The client can handle the notification.
Audience:
Currently it’s service design, should we push to the client?
Audience:
Move business logic forward to the client; edge computing
If the interviewer does not accept
We can also build a separate table with reminder time, then we can scan the reminder table
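The reminder-table alternative could look roughly like the sketch below, using an in-memory SQLite table. The table layout, the sent flag, and the use of integer timestamps are assumptions; a periodic job would call the scan function with the current time.

```python
import sqlite3


def make_reminder_table():
    """Create a hypothetical reminder table keyed by reminder time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reminder ("
        " event_id TEXT, user_id TEXT,"
        " remind_at INTEGER, sent INTEGER DEFAULT 0)")
    # Index so each scan only touches unsent rows that are already due.
    conn.execute("CREATE INDEX idx_reminder_due ON reminder (sent, remind_at)")
    return conn


def scan_due_reminders(conn, now):
    """Return (event_id, user_id) pairs that are due, marking them as sent."""
    rows = conn.execute(
        "SELECT rowid, event_id, user_id FROM reminder "
        "WHERE sent = 0 AND remind_at <= ? ORDER BY remind_at, rowid",
        (now,)).fetchall()
    # Mark the rows so the next scan does not send them again.
    conn.executemany(
        "UPDATE reminder SET sent = 1 WHERE rowid = ?",
        [(rowid,) for rowid, _, _ in rows])
    conn.commit()
    return [(event_id, user_id) for _, event_id, user_id in rows]
```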
Audience:
Long-lived connection
WebSocket
Poll every 1-2 seconds
Audience:
Delayed queue: can reduce query to the database
Put it in the queue.
When the event is created, then put the event in a delayed queue
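A rough in-process sketch of the delayed-queue idea, using a heap ordered by delivery time. A real deployment would rely on a message broker's delayed delivery; the class and function names are made up, and the 10-minute lead comes from the reminder example discussed earlier.

```python
import heapq

REMINDER_LEAD_SECONDS = 10 * 60  # the 10-minute reminder


class DelayedQueue:
    """Messages become visible only once their delivery time has passed."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal times keep insertion order

    def put(self, deliver_at, message):
        heapq.heappush(self._heap, (deliver_at, self._seq, message))
        self._seq += 1

    def pop_due(self, now):
        """Remove and return every message whose delivery time is <= now."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


def schedule_reminder(queue, event_id, start_time):
    # When the event is created, enqueue its reminder for later delivery.
    queue.put(start_time - REMINDER_LEAD_SECONDS, event_id)
```

A consumer that calls pop_due only sees events whose reminder time has arrived, so the database does not have to be scanned for upcoming events.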
Audience:
Dropbox, async queue
They still continuously scan
Audience:
All events need to be put in the queue