Design Uber · commitway

Topic: Design Uber

Interviewer: system

Level: L5 (Senior)

System Design Interview - Design Uber


Requirements:

[4:25]

[8:32]

[10:00]

[11:36]

API design

[15:00]

System design

Would like to walk through the APIs to understand the function of each module

[18:56]

[Which service manages the location data?]

[20:00]

Design database schema

[Would like to understand which service owns which table]

User: ID, name, score

Request: ID, start loc, target loc

Driver: ID, name, score

Trip
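The tables above can be sketched as plain records. This is only an illustration: the notes list no columns for Trip, so every Trip field, the rider_id link on Request, and the (lat, lon) tuple representation are assumptions, not part of the interview.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed representation: (latitude, longitude) in degrees.
LatLon = Tuple[float, float]


@dataclass
class User:
    id: str
    name: str
    score: float


@dataclass
class Driver:
    id: str
    name: str
    score: float


@dataclass
class Request:
    id: str
    start_loc: LatLon
    target_loc: LatLon
    rider_id: str  # assumption: not in the notes, links a request to its rider


@dataclass
class Trip:
    # The notes give no columns for Trip; all fields below are hypothetical.
    id: str
    request_id: str
    driver_id: str
    status: str = "matched"  # hypothetical states: matched, in_progress, completed, cancelled


req = Request(id="r1", start_loc=(37.77, -122.42), target_loc=(37.80, -122.41), rider_id="u1")
trip = Trip(id="t1", request_id=req.id, driver_id="d1")
```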

Combine the request_ride service and the request manager

[24:00]

Walk through API calls

API get_nearby_drivers

Rider -> LB -> nearby driver service

API request_ride

Rider -> LB -> request_ride service (do the match) -> LB -> rider

Rider -> LB -> request_ride service (do the match) -> nearby driver service

[Would like to ask about the DB design for location queries]

[May be an async request to find multiple driver candidates]

get_nearby_requests

accept_request

Driver -> LB -> request manager

Cancel trip

Driver -> LB -> request manager

Rider -> LB -> request manager
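One way to summarize the call paths above is a small routing table from (caller, API) to the service behind the LB. The service identifiers are hypothetical spellings of the names used in the notes, and get_nearby_requests is left out because the notes do not say which service handles it.

```python
from typing import Dict, Optional, Tuple

# (caller, api) -> backend service, as walked through in the notes.
# Identifier spellings are assumptions made for this sketch.
ROUTES: Dict[Tuple[str, str], str] = {
    ("rider", "get_nearby_drivers"): "nearby_driver_service",
    ("rider", "request_ride"): "request_ride_service",
    ("driver", "accept_request"): "request_manager",
    ("driver", "cancel_trip"): "request_manager",
    ("rider", "cancel_trip"): "request_manager",
}


def route(caller: str, api: str) -> Optional[str]:
    """Return the service that handles this call, or None if the notes do not specify one."""
    return ROUTES.get((caller, api))
```

If the request_ride service and the request manager are merged, as suggested earlier in the notes, the two targets collapse into one entry value.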

[30:00]

Q: how to let drivers and riders see each other?

[how do drivers and riders update location?]

A: add location service

Rider -> location service

Driver -> location service

[Location service appears redundant with nearby driver service]

[33]

Async call to match rider with driver

Q: How to inform the rider when match is found?

A: polling or live connection

Would like a reverse proxy solution that holds the connection open; polling generates too many requests

Q: how to keep a live connection?

A: websocket
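The push model discussed here (async match, then notify the rider over a live connection) can be sketched without a real WebSocket. In the sketch below an asyncio future stands in for the open connection; MatchNotifier, find_driver, request_ride and the driver ID are all hypothetical names, and the sleep is a placeholder for real matching work.

```python
import asyncio
from typing import Dict


class MatchNotifier:
    """Keeps one pending future per waiting rider (stand-in for an open WebSocket)."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def wait_for_match(self, rider_id: str) -> asyncio.Future:
        # Register the rider; the future resolves when a driver is found.
        fut = asyncio.get_running_loop().create_future()
        self._pending[rider_id] = fut
        return fut

    def notify(self, rider_id: str, driver_id: str) -> bool:
        # Push the result to the waiting rider; False if nobody is waiting.
        fut = self._pending.pop(rider_id, None)
        if fut is None or fut.done():
            return False
        fut.set_result(driver_id)
        return True


async def find_driver(notifier: MatchNotifier, rider_id: str) -> None:
    await asyncio.sleep(0.01)  # placeholder for the asynchronous matching step
    notifier.notify(rider_id, "driver-42")  # hypothetical match result


async def request_ride(notifier: MatchNotifier, rider_id: str) -> str:
    fut = notifier.wait_for_match(rider_id)
    task = asyncio.create_task(find_driver(notifier, rider_id))
    # The rider does not poll; it simply waits until the server pushes a result.
    driver_id = await asyncio.wait_for(fut, timeout=1.0)
    await task
    return driver_id


matched = asyncio.run(request_ride(MatchNotifier(), "rider-1"))
```

Compared with polling, the rider issues one request and the server sends one message when the match exists, which is the traffic reduction the interviewer was asking for.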

Q: how to get the location of drivers?

A: drivers call the nearby driver service

Location is saved in memory

Q: how to handle service crash of nearby driver service?

A: the driver can reconnect to another instance

[38:14]

[10 updates per second appears to be too high?]

40M updates per second
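The 40M figure follows from multiplying the number of active clients by the per-client update rate. The notes do not record the client count; 4M below is inferred from 40M / 10 and is an assumption, as is the slower rate used for comparison.

```python
def updates_per_second(active_clients: int, updates_per_client_per_second: float) -> float:
    """Total location writes per second across all clients."""
    return active_clients * updates_per_client_per_second


# Assumption: inferred from the notes as 40M / 10; not stated in the interview.
ASSUMED_ACTIVE_CLIENTS = 4_000_000

# 10 updates per second per client, the rate the interviewer questioned.
high_rate = updates_per_second(ASSUMED_ACTIVE_CLIENTS, 10)

# Hypothetical alternative: one update every 4 seconds per client.
low_rate = updates_per_second(ASSUMED_ACTIVE_CLIENTS, 0.25)

print(f"{high_rate:,.0f} vs {low_rate:,.0f} writes per second")
```

Under these assumptions, lowering the update frequency cuts the write load by a factor of 40 before any partitioning is applied.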

Q: How to handle high throughput?

[Trying to lead to quadtree data structure]

A: divide by geography

Q: read throughput?

A: need to calculate based on active users and how many times they take rides

Read requests: 10k per second

[The division of responsibility between the location service and the nearby driver service is still a bit unclear]

A: Location service handles all of the location needs

Q: how to partition data in location service?

QuadTree or Geohash
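As an illustration of the Geohash option, here is a minimal sketch of the standard geohash encoding (alternating longitude and latitude bisection, 5 bits per base-32 character). Using a short prefix as the partition key is an assumption about how the location service might shard, not something settled in the interview.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a point as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # geohash bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


# Hypothetical partition key: a short prefix identifies a coarse cell.
partition_key = geohash_encode(57.64911, 10.40744, precision=3)
```

Points in the same cell share the same prefix, so a prefix can serve as a shard key; a longer prefix means a smaller cell.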

[In addition to location, we may also need to look up by rider ID and driver ID]

[47:52]

===

Interviewee:

Key may be the geo data structure

Time control

Clear presentation

Handling movement across cells

L5/L6: about 2 rounds of system design

===

Audience:

May not need LB at the beginning

How to handle cross cell cases?

A driver update can find the old cell, delete the record there, and then insert a new record in the new cell
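The delete-then-insert update can be sketched with two in-memory maps: driver ID to current cell, and cell to the set of drivers in it. The fixed-size grid (cell_of with 0.01-degree cells) is an assumption used to keep the example short; a quadtree leaf or geohash prefix could play the same role. DriverCellIndex is a hypothetical name.

```python
import math
from collections import defaultdict
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]


def cell_of(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Map a point to a grid cell (assumed fixed-size grid in degrees)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


class DriverCellIndex:
    def __init__(self, cell_deg: float = 0.01) -> None:
        self.cell_deg = cell_deg
        self._cell_by_driver: Dict[str, Cell] = {}  # lookup by driver ID
        self._drivers_by_cell: Dict[Cell, Set[str]] = defaultdict(set)  # lookup by location

    def update(self, driver_id: str, lat: float, lon: float) -> Cell:
        new_cell = cell_of(lat, lon, self.cell_deg)
        old_cell = self._cell_by_driver.get(driver_id)
        if old_cell is not None and old_cell != new_cell:
            # Crossed a border: delete the record from the old cell first.
            members = self._drivers_by_cell[old_cell]
            members.discard(driver_id)
            if not members:
                del self._drivers_by_cell[old_cell]
        # Insert (or keep) the record in the new cell.
        self._drivers_by_cell[new_cell].add(driver_id)
        self._cell_by_driver[driver_id] = new_cell
        return new_cell

    def drivers_in(self, cell: Cell) -> Set[str]:
        return set(self._drivers_by_cell.get(cell, ()))


index = DriverCellIndex()
first = index.update("d1", 37.7749, -122.4194)
second = index.update("d1", 37.7851, -122.4194)  # moves one cell north
```

Keeping the driver-to-cell map is what makes the old cell cheap to find; without it, every update would need a search over cells.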

Finding drivers when the passenger is near the border of a cell:

Quad tree: Neighboring cells share parent / ancestor nodes

Geohash: adjacent cells do not necessarily share a prefix, so the encoding alone does not handle this easily. Save the geohashes of the bordering cells
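A common way to handle the border case is to search the passenger's cell together with its bordering cells and then filter by true distance. The sketch below uses an assumed fixed grid (0.01-degree cells) instead of a real quadtree or geohash, and all function names are hypothetical. The 3x3 block only covers search radii up to roughly one cell width.

```python
import math
from typing import Dict, List, Tuple

Cell = Tuple[int, int]
CELL_DEG = 0.01  # assumed cell size in degrees


def cell_of(lat: float, lon: float) -> Cell:
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def neighborhood(cell: Cell) -> List[Cell]:
    """The cell itself plus its 8 bordering cells."""
    row, col = cell
    return [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres (mean Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearby_drivers(drivers: Dict[str, Tuple[float, float]],
                   lat: float, lon: float, radius_km: float) -> List[str]:
    # Build the cell index on the fly; a real service would maintain it incrementally.
    by_cell: Dict[Cell, List[str]] = {}
    for driver_id, (dlat, dlon) in drivers.items():
        by_cell.setdefault(cell_of(dlat, dlon), []).append(driver_id)
    hits = []
    for cell in neighborhood(cell_of(lat, lon)):
        for driver_id in by_cell.get(cell, []):
            dlat, dlon = drivers[driver_id]
            if haversine_km(lat, lon, dlat, dlon) <= radius_km:
                hits.append(driver_id)
    return sorted(hits)


drivers = {"d1": (37.7801, -122.4194), "d2": (37.9000, -122.4194)}
rider = (37.7799, -122.4194)  # just south of a cell border
found = nearby_drivers(drivers, rider[0], rider[1], radius_km=0.5)
```

Here d1 sits a few metres away but in the next cell north, so a search limited to the rider's own cell would miss it.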
