Design Uber
Topic: Design Uber
Interviewer: ken
Interviewee: 2fet - Jun Shao
Level: L5 (Senior)
Supporting Documents:
Design Diagram: https://whimsical.com/mingdao-KfCG71EnhDk4jtFzerK8f4
Audience Survey: https://forms.gle/y1tH4RGKxJEGa8Tg7
System design requirements
Uber driver assignment
Reference: Design Uber mock interview https://docs.google.com/document/d/1NFwNTupod7jF-nQ0VG0nvoKwtDxndxS4fNAClvjJg4U/edit
Key engineering stats
https://investor.uber.com/financials/default.aspx
https://s23.q4cdn.com/407969754/files/doc_presentations/2022/Uber-Investor-Day-2022.pdf
$25.9B gross bookings
1.8B trips
5 billion trips in 2020 https://www.businessofapps.com/data/uber-statistics/
100M users in a quarter https://www.businessofapps.com/data/uber-statistics/
4M drivers https://therideshareguy.com/how-many-uber-drivers-are-there/
Functional requirements
Customer should be able to request a ride
Customer should see all drivers nearby
Driver assignment
Driver should send their location to the server
After the driver accepts a ride, the driver and rider can see each other
Functional requirements
The driver picks up the rider at the pickup address
The rider places an order to a destination
A driver can pick up an order
Non-functional requirements
PA/EL model (PACELC): under a partition, favor availability; else, favor latency (quick request/response)
Availability:
Scalable: big events, rush hours
Latency
Security
Resilience: adapts to different traffic volumes
[39:20]
[my requirements]
Scaling requirements:
Driver assignment
5 billion trips in 2020; each trip requires a driver assignment
5 billion / 365 days / 100k seconds per day ~= 150 assignments/second (a day has 86,400 seconds, rounded up to 100k)
Traffic likely spikes during rush hours and holidays: assume 10x the average, about 1500 assignments/second
Driver location tracking
4M drivers, each sending a location update every 30 seconds
4M location updates / 30 seconds ~= 130k location updates per second
Customers
300M users
93M monthly active users
[Interviewee should clarify the scaling requirements with the interviewer]
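The scaling arithmetic above can be re-run as a short script. This is only a sketch of the same back-of-envelope math: the constant and function names are made up for illustration, and the 10x peak factor is an assumption inferred from the jump from 150 to 1500 assignments/second.

```python
# Back-of-envelope traffic estimates from the figures quoted in the notes.
TRIPS_PER_YEAR = 5_000_000_000    # 5 billion trips in 2020
SECONDS_PER_DAY = 100_000         # 86,400 rounded up, as in the notes
PEAK_FACTOR = 10                  # assumption: peak is about 10x the average
DRIVERS = 4_000_000               # 4M drivers
UPDATE_INTERVAL_SECONDS = 30      # one location update per driver every 30 s


def average_assignments_per_second() -> float:
    """Trips per year spread evenly over every second of the year."""
    return TRIPS_PER_YEAR / 365 / SECONDS_PER_DAY


def peak_assignments_per_second() -> float:
    """Average rate scaled by the assumed rush-hour/holiday factor."""
    return average_assignments_per_second() * PEAK_FACTOR


def location_updates_per_second() -> float:
    """All drivers reporting once per update interval."""
    return DRIVERS / UPDATE_INTERVAL_SECONDS


print(f"average assignments/s: {average_assignments_per_second():.0f}")
print(f"peak assignments/s:    {peak_assignments_per_second():.0f}")
print(f"location updates/s:    {location_updates_per_second():.0f}")
```

The exact values come out slightly below the rounded figures in the notes for assignments and slightly above for location updates, which is fine for sizing purposes.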
API design
For rider:
CreateOrder(userToken, fromLocation, toLocation, type) -> orderId
For driver:
PickUpOrder(userToken, orderId)
GetOrderList(userToken, driverLocation … fromOrderId) -> OrderList
Stream function: the server sends order events in a stream and holds the connection open
DriverLocation is a stream as well
fromOrderId: lets the driver skip old orders
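The three calls above could be sketched in memory as follows. Everything here is illustrative: the class name `RideApi`, the `Order` fields, and the choice to return only orders with an ID strictly greater than `from_order_id` are assumptions, and a Python generator stands in for the long-lived stream. The driver location is accepted but not used for filtering in this sketch.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Iterator, Optional, Tuple

Location = Tuple[float, float]  # (latitude, longitude)


@dataclass
class Order:
    order_id: int
    rider_token: str
    from_location: Location
    to_location: Location
    ride_type: str
    driver_token: Optional[str] = None  # None while the order is still open


class RideApi:
    """Hypothetical in-memory stand-in for the rider and driver APIs."""

    def __init__(self) -> None:
        self._orders: Dict[int, Order] = {}
        self._ids = count(1)

    def create_order(self, user_token: str, from_location: Location,
                     to_location: Location, ride_type: str) -> int:
        order_id = next(self._ids)
        self._orders[order_id] = Order(order_id, user_token, from_location,
                                       to_location, ride_type)
        return order_id

    def pick_up_order(self, user_token: str, order_id: int) -> bool:
        """Return True only for the first driver to claim an open order."""
        order = self._orders.get(order_id)
        if order is None or order.driver_token is not None:
            return False
        order.driver_token = user_token
        return True

    def get_order_list(self, user_token: str, driver_location: Location,
                       from_order_id: int) -> Iterator[Order]:
        """Yield open orders newer than from_order_id.

        A real server would keep the connection open and push new events;
        this generator only walks a snapshot of the current orders.
        """
        for order_id in sorted(self._orders):
            order = self._orders[order_id]
            if order_id > from_order_id and order.driver_token is None:
                yield order
```

A driver client would remember the last order ID it saw and pass it back as `from_order_id` when it reconnects, so old orders are not replayed.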
[31:33]
Schema
UserTable (riders and drivers): userId, email, address, name, phone number
100M riders; each user row takes about 1 KB
[where did this 100M come from?]
100M users * 1 KB = 100 GB; 10 million active users (10% of 100M)
Drivers: 500K
100M riders + 500K drivers ~= 100 GB in total
Car information (userId, carId, type, desc, maker)
Order table: orderId, riderId, driverId, from, to, timestamp, status
10M orders per day (10% of users active, 1 order each per day) ~= 3.6 billion orders per year
200 bytes for each order
3.6 billion orders * 200 bytes ~= 0.72 TB, rounded up to 1 TB / year. 5 years = 5 TB
10M orders per day / (2 rush hours * 3600 seconds) ~= 1.4k, rounded to 1k create QPS. 1k * 5 (order modifications) = 5k write QPS
NoSQL: chosen for its higher write throughput
Create order
Modify order = 5x create orders
Also need to change order status
[Some concern that the interviewee did not ask the interviewer for the scaling requirements]
[QPS calculation: we may partition by geography to reduce QPS]
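The storage and write-QPS estimates in this section can be checked with a few lines of arithmetic. This is a sketch using the numbers from the notes; the variable names are made up, and squeezing a full day of orders into two rush hours is a worst-case assumption.

```python
# Storage and write-throughput estimates from the schema discussion.
USERS = 100_000_000            # 100M riders (the 500K drivers are negligible)
BYTES_PER_USER = 1_000         # about 1 KB per user row
ORDERS_PER_DAY = 10_000_000    # 10% of users active, 1 order each per day
BYTES_PER_ORDER = 200
RUSH_SECONDS = 2 * 3600        # assumption: all orders land in 2 rush hours
MODIFY_FACTOR = 5              # modifications = 5x create orders

user_storage_gb = USERS * BYTES_PER_USER / 1e9
orders_per_year = ORDERS_PER_DAY * 365
order_storage_tb_per_year = orders_per_year * BYTES_PER_ORDER / 1e12
create_qps = ORDERS_PER_DAY / RUSH_SECONDS
write_qps = create_qps * MODIFY_FACTOR

print(f"user table:       {user_storage_gb:.0f} GB")
print(f"orders per year:  {orders_per_year / 1e9:.2f} billion")
print(f"order storage:    {order_storage_tb_per_year:.2f} TB/year")
print(f"create QPS:       {create_qps:.0f}")
print(f"total write QPS:  {write_qps:.0f}")
```

The unrounded values are a little different from the 1 TB/year and 5k write QPS in the notes, but they stay in the same order of magnitude, which is what matters for choosing a storage engine.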
[21:10]
High level design
[We likely need a queue consumer]
Add location service
[why is location service connected from LB? ]
[13:04]
[what does location service do?]
Use Google's S2 geometry library to assign a geo ID
[the system interaction flow is not very clear]
[08:08]
Q: why are there 2 arrows LB -> location service?
[location service contains driver location or orders?]
[a bit confused about the websocket servers]
Drivers need to find the websocket server based on their locations
[does the driver switch to a different websocket server if they drive to a different location?]
Each geoID covers about 1 square kilometer
Need a quadtree to find nearby geoIDs
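To make the geoID idea concrete, here is a minimal sketch that uses a plain fixed-size latitude/longitude grid instead of S2 cells or a quadtree. The cell size of 0.01 degrees (about 1.1 km of latitude) and all names (`geo_id`, `nearby_geo_ids`, `DriverIndex`) are assumptions for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

GeoId = Tuple[int, int]
CELL_DEG = 0.01  # about 1.1 km of latitude; longitude cells narrow toward the poles


def geo_id(lat: float, lng: float) -> GeoId:
    """Map a coordinate to the (row, col) of its grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def nearby_geo_ids(lat: float, lng: float, ring: int = 1) -> List[GeoId]:
    """Return the cell containing the point plus `ring` layers of neighbors."""
    row, col = geo_id(lat, lng)
    return [(row + dr, col + dc)
            for dr in range(-ring, ring + 1)
            for dc in range(-ring, ring + 1)]


class DriverIndex:
    """Tracks which drivers are currently in which cell."""

    def __init__(self) -> None:
        self._cells: Dict[GeoId, Set[str]] = defaultdict(set)
        self._driver_cell: Dict[str, GeoId] = {}

    def update(self, driver_id: str, lat: float, lng: float) -> None:
        new_cell = geo_id(lat, lng)
        old_cell = self._driver_cell.get(driver_id)
        if old_cell is not None and old_cell != new_cell:
            self._cells[old_cell].discard(driver_id)  # driver left the old cell
        self._cells[new_cell].add(driver_id)
        self._driver_cell[driver_id] = new_cell

    def nearby(self, lat: float, lng: float, ring: int = 1) -> Set[str]:
        found: Set[str] = set()
        for cell in nearby_geo_ids(lat, lng, ring):
            found |= self._cells.get(cell, set())
        return found
```

A fixed grid is the simplest option but has uneven cell areas and no way to adapt to dense downtown areas, which is why the notes point at S2 and quadtrees.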
Q: how to resolve conflicts between drivers
Optimistic locking
First come, first served
Updating the order increases its version number
Use the version number to detect conflicting updates
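The optimistic-locking answer can be sketched as a compare-and-set on the order's version number. This is an in-memory illustration, not the interviewee's actual design: `OrderRow`, `OrderStore`, and `assign_if_version` are hypothetical names, and the `threading.Lock` stands in for the atomic conditional update that a database would provide.

```python
import threading
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class OrderRow:
    order_id: int
    status: str = "OPEN"
    driver_id: Optional[str] = None
    version: int = 0


class OrderStore:
    def __init__(self) -> None:
        self._rows: Dict[int, OrderRow] = {}
        self._lock = threading.Lock()  # stands in for the database's atomic update

    def put(self, row: OrderRow) -> None:
        with self._lock:
            self._rows[row.order_id] = row

    def read(self, order_id: int) -> OrderRow:
        """Return a copy so callers cannot mutate the stored row directly."""
        with self._lock:
            return replace(self._rows[order_id])

    def assign_if_version(self, order_id: int, driver_id: str,
                          expected_version: int) -> bool:
        """Assign the order only if nobody has updated it since it was read."""
        with self._lock:
            row = self._rows[order_id]
            if row.version != expected_version or row.status != "OPEN":
                return False  # another driver won the race
            row.driver_id = driver_id
            row.status = "ASSIGNED"
            row.version += 1  # every successful update bumps the version
            return True


store = OrderStore()
store.put(OrderRow(order_id=1))
seen_by_a = store.read(1)
seen_by_b = store.read(1)
print(store.assign_if_version(1, "driver-A", seen_by_a.version))  # first writer wins
print(store.assign_if_version(1, "driver-B", seen_by_b.version))  # stale version, rejected
```

The losing driver's client would see the rejection and go back to the order list rather than retrying the same order.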
Q: what’s in the quad tree index
Location ID
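For reference, a small point quadtree that stores location IDs might look like the sketch below. It is a generic textbook-style structure under assumed parameters (node capacity, maximum depth, half-open bounding boxes), not the index discussed in the interview.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Box:
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x: float, y: float) -> bool:
        # Half-open on the upper edges so sibling boxes never overlap.
        return self.min_x <= x < self.max_x and self.min_y <= y < self.max_y

    def intersects(self, other: "Box") -> bool:
        return not (other.min_x >= self.max_x or other.max_x <= self.min_x or
                    other.min_y >= self.max_y or other.max_y <= self.min_y)


class QuadTree:
    def __init__(self, box: Box, capacity: int = 4,
                 depth: int = 0, max_depth: int = 16) -> None:
        self.box = box
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points: List[Tuple[float, float, str]] = []
        self.children: Optional[List["QuadTree"]] = None

    def insert(self, x: float, y: float, location_id: str) -> bool:
        if not self.box.contains(x, y):
            return False
        if self.children is None:
            # Leaf: store here unless it is full and may still be split.
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y, location_id))
                return True
            self._split()
        return any(c.insert(x, y, location_id) for c in self.children)

    def _split(self) -> None:
        b = self.box
        mx, my = (b.min_x + b.max_x) / 2, (b.min_y + b.max_y) / 2
        quadrants = [Box(b.min_x, b.min_y, mx, my), Box(mx, b.min_y, b.max_x, my),
                     Box(b.min_x, my, mx, b.max_y), Box(mx, my, b.max_x, b.max_y)]
        self.children = [QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quadrants]
        for px, py, pid in self.points:  # push existing points down one level
            any(c.insert(px, py, pid) for c in self.children)
        self.points = []

    def query(self, area: Box) -> List[str]:
        """Return the location IDs of all points inside `area`."""
        if not self.box.intersects(area):
            return []
        found = [pid for px, py, pid in self.points if area.contains(px, py)]
        for child in self.children or []:
            found.extend(child.query(area))
        return found
```

Dense areas end up with deeper subtrees and sparse areas with shallow ones, which is the property that makes a quadtree attractive compared with a fixed grid.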
===
===
Missing capacity
User places order
Clearer picture
Geo ID to distribute orders
Network: bidirectional
===
Driver throughput can be very large
Lots of drivers, updates every few seconds
Redis pubsub
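The pub/sub fan-out idea can be illustrated without a Redis server. The class below is an in-memory stand-in with one channel per geoID; it is not the Redis client API, and the channel naming scheme and handler signature are assumptions.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[str, Dict], None]


class InMemoryPubSub:
    """Toy publish/subscribe broker keyed by channel name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def unsubscribe(self, channel: str, handler: Handler) -> None:
        if handler in self._subscribers.get(channel, []):
            self._subscribers[channel].remove(handler)

    def publish(self, channel: str, message: Dict) -> int:
        """Deliver to every subscriber and return how many received it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(channel, message)
        return len(handlers)


def channel_for(geo_id: str) -> str:
    # Hypothetical naming scheme: one channel per geo cell.
    return f"geo:{geo_id}"


broker = InMemoryPubSub()
inbox: List[Dict] = []
broker.subscribe(channel_for("3777:-12242"), lambda ch, msg: inbox.append(msg))
delivered = broker.publish(channel_for("3777:-12242"),
                           {"driver": "d1", "lat": 37.7749, "lng": -122.4194})
print(delivered, inbox)
```

Partitioning channels by geoID keeps each subscriber's traffic proportional to the activity in its own area instead of the global update rate.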
===
===
Key problems to address (key interview points):
driver location tracking
driver assignment
state transition for trip
high availability
high throughput
Bonus:
Geographic specific data structure e.g. quadtree, geohash
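The "state transition for trip" item listed above could be modeled as a small state machine. The state names and allowed transitions below are assumptions chosen for illustration; the notes do not specify them.

```python
from enum import Enum
from typing import Dict, Set


class TripState(Enum):
    REQUESTED = "requested"      # rider placed the order
    ASSIGNED = "assigned"        # a driver accepted it
    IN_PROGRESS = "in_progress"  # rider picked up
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# Hypothetical transition table: each state maps to the states it may move to.
ALLOWED: Dict[TripState, Set[TripState]] = {
    TripState.REQUESTED: {TripState.ASSIGNED, TripState.CANCELLED},
    TripState.ASSIGNED: {TripState.IN_PROGRESS, TripState.CANCELLED},
    TripState.IN_PROGRESS: {TripState.COMPLETED},
    TripState.COMPLETED: set(),
    TripState.CANCELLED: set(),
}


def transition(current: TripState, target: TripState) -> TripState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


state = TripState.REQUESTED
for step in (TripState.ASSIGNED, TripState.IN_PROGRESS, TripState.COMPLETED):
    state = transition(state, step)
print(state.name)
```

Keeping the table explicit makes it easy to reject out-of-order updates, for example a late "assigned" event arriving after a trip was cancelled.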
Soft Skills:
gathering requirements
making decisions and justifying tradeoffs
describing the solution using clear presentation, concise language and accurate technical terms
Hard Skills:
design quality: scalability, reliability, efficiency, etc. (L4, L5)
basic facts about existing software solutions and hardware capabilities (L4 - partly, L5)
project/product lifecycle awareness, e.g. how a project is developed and maintained (L5)
====
Discussion after Interview