Design Uber
Topic: Design Uber
Interviewer: system
Level: L5 (Senior)
System Design Interview - Design Uber
Requirements:
[4:25]
[8:32]
[10:00]
[11:36]
API design
[15:00]
System design
Would like a walk-through of the APIs to understand what each module does
[18:56]
[Which service manages the location data?]
[20:00]
Design database schema
[Would like to understand which service owns each table]
User: ID, name, score
Request: ID, start loc, target loc
Driver: ID, name, score
Trip
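The tables above could be sketched as plain record types. This is only an illustration: the notes give no columns for Trip, so every Trip field below is an assumption, as are the `LatLng` alias and the status strings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (latitude, longitude) in degrees; the notes do not specify a location format.
LatLng = Tuple[float, float]


@dataclass
class User:
    id: str
    name: str
    score: float  # rider rating


@dataclass
class Driver:
    id: str
    name: str
    score: float  # driver rating


@dataclass
class Request:
    id: str
    start_loc: LatLng
    target_loc: LatLng


@dataclass
class Trip:
    # Assumed fields: the notes only name the table.
    id: str
    request_id: str
    rider_id: str
    driver_id: Optional[str] = None  # unset until a driver accepts
    status: str = "REQUESTED"        # hypothetical status value
```

Keeping Trip separate from Request lets the request manager own the trip lifecycle while the request stays an immutable record of what the rider asked for.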
Combine the request_ride service and the request manager
[24:00]
Walk through API calls
API get_nearby_drivers
Rider -> LB -> nearby driver service
request_ride
Rider -> LB -> request_ride service (does the match) -> LB -> rider
Rider -> LB -> request_ride service (does the match) -> nearby driver service
[would like to ask about DB design for location queries]
[the request could be async, to find multiple driver candidates]
get_nearby_requests
accept_request
Driver -> LB -> request manager
Cancel trip
Driver -> LB -> request manager
Rider -> LB -> request manager
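A minimal sketch of how the request manager might handle the calls walked through above. The class name, method signatures, and status strings are assumptions for illustration; the interview only named the API calls and their routing.

```python
class RequestManager:
    """In-memory stand-in for the request manager behind the LB."""

    def __init__(self):
        self._trips = {}  # request_id -> {"rider_id", "driver_id", "status"}

    def request_ride(self, request_id, rider_id):
        self._trips[request_id] = {
            "rider_id": rider_id,
            "driver_id": None,
            "status": "REQUESTED",
        }

    def accept_request(self, request_id, driver_id):
        """Driver accepts; only the first accept on an open request wins."""
        trip = self._trips[request_id]
        if trip["status"] != "REQUESTED":
            return False
        trip["driver_id"] = driver_id
        trip["status"] = "ACCEPTED"
        return True

    def cancel_trip(self, request_id, caller_id):
        """Either the rider or the assigned driver may cancel."""
        trip = self._trips[request_id]
        if trip["status"] == "CANCELLED":
            return False
        if caller_id is None or caller_id not in (trip["rider_id"], trip["driver_id"]):
            return False
        trip["status"] = "CANCELLED"
        return True
```

Rejecting a second accept is what makes it safe to offer one request to several driver candidates at once.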
[30:00]
Q: how to let drivers and riders see each other?
[How do drivers and riders update their location?]
A: add a location service
Rider -> location service
Driver -> location service
[Location service appears redundant with nearby driver service]
[33:00]
Async call to match rider with driver
Q: How to inform the rider when match is found?
A: polling or live connection
Would prefer a reverse proxy solution; polling generates too many requests
Q: how to keep a live connection?
A: WebSocket
Q: how to get the location of drivers?
A: drivers call the nearby driver service
Location is saved in memory
Q: how to handle a crash of the nearby driver service?
A: the driver can reconnect to another instance
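The "live connection instead of polling" answer above can be illustrated without a network. In this sketch an `asyncio.Queue` per rider stands in for an open WebSocket; `PushHub`, its methods, and the message shape are all hypothetical names, not part of the interview.

```python
import asyncio


class PushHub:
    """Keeps one outbound queue per connected rider (WebSocket stand-in)."""

    def __init__(self):
        self._connections = {}  # rider_id -> asyncio.Queue

    def connect(self, rider_id):
        queue = asyncio.Queue()
        self._connections[rider_id] = queue
        return queue

    def disconnect(self, rider_id):
        self._connections.pop(rider_id, None)

    def notify_match(self, rider_id, driver_id):
        """Push the match to the rider; return False if the rider is offline."""
        queue = self._connections.get(rider_id)
        if queue is None:
            return False
        queue.put_nowait({"type": "match_found", "driver_id": driver_id})
        return True


async def demo():
    hub = PushHub()
    inbox = hub.connect("rider-1")            # rider opens the connection once
    hub.notify_match("rider-1", "driver-9")   # async matcher finishes later
    return await inbox.get()                  # rider receives without polling
```

The rider waits on its own connection, so the server sends exactly one message per match rather than answering repeated "any match yet?" requests.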
[38:14]
[10 updates per second appears to be too high?]
40M updates per second
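The two numbers above imply a sender count that the notes do not state. The arithmetic below derives it and shows how the write rate changes with a longer update interval; the 4-second interval is a hypothetical value chosen only for illustration.

```python
def implied_senders(total_updates_per_second, updates_per_sender_per_second):
    """Number of clients needed to produce the total at the given per-client rate."""
    return total_updates_per_second // updates_per_sender_per_second


def writes_per_second(senders, seconds_between_updates):
    """Total location writes per second if each client reports every N seconds."""
    return senders // seconds_between_updates


# 40M updates/s at 10 updates/s per client implies 4M concurrent clients.
senders = implied_senders(40_000_000, 10)

# Assumed alternative: one update every 4 seconds per client.
relaxed = writes_per_second(senders, 4)
```

Under that assumed interval the same 4M clients produce 1M writes per second, a 40x reduction before any partitioning is applied.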
Q: How to handle high throughput?
[Trying to lead to quadtree data structure]
A: divide by geography
Q: read throughput?
A: estimate it from the number of active users and how often they take rides
Read request: 10k per second
[the division of responsibility between the location service and the nearby driver service is still a bit unclear]
A: Location service handles all of the location needs
Q: how to partition data in location service?
A: QuadTree or Geohash
[In addition, we may also need lookups by rider ID and driver ID]
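For the Geohash option, a short sketch of the standard encoding shows why it works as a partition key: nearby points usually share a prefix. The function name and default precision are choices made here, not something discussed in the interview.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lng, precision=6):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

A shorter hash is always a prefix of a longer one for the same point, so a service could shard on the first few characters and still index finer cells within each shard. Lookups by rider or driver ID would need a separate ID-to-cell map.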
[47:52]
===
Interviewee:
The key is probably the geo data structure
Time management
Clear presentation
Handling cross-cell cases
L5/L6 loops have about 2 rounds of system design
===
Audience:
May not need an LB at the beginning
How to handle cross-cell cases?
A driver update can find the old cell, delete the record there, and then insert a new record in the new cell
Finding drivers when the passenger is near the border of a cell:
QuadTree: neighboring cells share a parent or ancestor node
Geohash: adjacency is not easy to derive from the encoding, so save the geohashes of the bordering cells
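Both cross-cell cases above can be sketched with a uniform grid standing in for QuadTree leaves or Geohash cells. The grid, the class name, and the 0.01-degree cell size are assumptions; the sketch also ignores wrap-around at the poles and the antimeridian.

```python
import math
from collections import defaultdict


class DriverCellIndex:
    """Uniform lat/lng grid index of drivers, keyed by (row, col) cell."""

    def __init__(self, cell_size_deg=0.01):
        self.cell_size_deg = cell_size_deg
        self._cell_of = {}                   # driver_id -> (row, col)
        self._drivers_in = defaultdict(set)  # (row, col) -> driver ids

    def cell_for(self, lat, lng):
        size = self.cell_size_deg
        return (math.floor(lat / size), math.floor(lng / size))

    def update(self, driver_id, lat, lng):
        """Record a location update; return True if the driver's cell changed
        (a first sighting counts as a change)."""
        new_cell = self.cell_for(lat, lng)
        old_cell = self._cell_of.get(driver_id)
        if old_cell == new_cell:
            return False
        if old_cell is not None:
            # Cross-cell move: delete the record from the old cell first.
            self._drivers_in[old_cell].discard(driver_id)
        self._drivers_in[new_cell].add(driver_id)
        self._cell_of[driver_id] = new_cell
        return True

    def nearby(self, lat, lng):
        """Drivers in the rider's cell plus its 8 bordering cells."""
        row, col = self.cell_for(lat, lng)
        found = set()
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                found |= self._drivers_in.get((row + d_row, col + d_col), set())
        return found
```

Searching the 3x3 block means a rider standing right at a cell border still sees a driver a few meters away in the adjacent cell, at the cost of reading nine cells instead of one.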