Ride Sharing System
Topic: Ride Sharing System
Interviewer: traveller(旅行者) (traveller2050)
Interviewee: readZhu (yanbin)
Level: L5 (Senior)
QRCode
Topic
Mock System Design Interview Summary
Interview Overview
Date: 6/26/2022
Target level: L5
Duration: 45 minutes
Topic covered: Ride sharing system
Drawing tool used: excalidraw
Requirements
Functional requirements
Customer: driver
Customer should be able to request a ride
Customer should see all drivers nearby
Driver should send their location to the server
After accepting ride, driver and riders can see each other
Scaling requirements
300M customer, 1M drivers
DAU: 1M active riders: 1M / 24 / 60 = 2K TPS
DAU: 500k active drivers: 500k * (24 * 60 / 3) / 24 / 60 = 1K TPS (every 3 seconds we need to push notifications)
System Design
External APIs
Restful API vs GRPC. RestAPI for external. GRPC for internal
Restful API:
domain/ReidRequest
Post request
Payload {
customerId,
Latitude,
longitude,
}
domain/nearByCars?longitude={}&latitude={}
GET
domain/driverLocation
POST
payLoad {
driverId,
Latitude,
Longitude,
}
Response:
[
{ latitude, longitude }
]
Interviewer: Do we need to return a driver ID?
Interviewee: we don’t need it because we just need to display cars on the map.
domain/getMyDriverLocation/{customerId}
response: {
Latitude,
Longitude
}
System design
Database models
TralTreeIndex
How to find the nearest driver?
We cut the map into grid, with small cells. From longitude and latitude, we can find the grid cell quickly.
TraTree: gridId, startLat, endLat, startLong, endLong
LocationTable (this is in memory, but we still call it table)
driverId, curLongitude, curLatitude, preLongitude, preLatituden
driverInGrid(gridId, driverId)
Ride, customerId, driverId,
Interviewer: how do you know which driver is nearby
Interviewee: we will get there.
.
We need to support a lot of updates. Traditional DB cannot handle it
We can use an in-memory db. If it crashes it will be re-filled very quickly.
Interviewer: can you show me the whole flow?
Let’s go to the critical flow first. We can start from requesting ride.
Client -> LB
LB -> request ride API
Request ride API -> driverInGrid, get drivers in the grid
Send notification to drivers that are best matched
Based on rating
Based on preference (e.g. car type)
Interviewer: How do you know which grid we should look up driver from?
Interviewee: We can use a binary search. For every grid there is a start and end long/lat
For every location we can get the grid in log(N) time
We can dive the world into 4 piece.
Leaf node will be stored in “driverInGrid” ID.
Interviewer: what if the leaf node doesn’t have enough drivers? E.g. only 1 driver.
Interviewee: If we have more than 500 drivers in a cell, we can break into cells. We can also merge cells back.
Interviewer: what’s the message queue about
Interviewee: we can choose push or pull to deliver the requests. We will use push to increase efficiency.
Interviewer: what if no driver responds?
Interviewee: we can wait for the driver to respond. We can have a timeout of 10 seconds-1 minutes.
We can also send to multiple drivers.
Interviewee: it’s easier to send 1 by 1; this avoids concurrency issues.
Interviewer: what happens if push failed?
Interviewee:
Audience feedback
Interviewer Feedback
Functional requirement is good
Scaling requirement should be improved. Reliability, latency
Scalability, 3 minute. Should add 60 seconds in division
API: nearByCar. String arguments
Schema of QuadTree: grids with grid. Interviewee talked about binary search
There is some problem with pushing. Usually we should use websocket.
It’s not necessary to add notification service
Choosing Redis is fine. I haven’t heard about tradeoffs.
Overall, a workable solution
Need more tradeoffs. Time control. We didn’t have time to discuss sharding,
We should notify multiple drivers.
Interviewee self reflection
Time control could be better
Estimate/API:
Tradeoff, sharding, message queue: ran out of time.
2 difficulties:
Find nearby drivers
Quadtree
Wanted to know which points are the problem specific rubrics.
Audience Feedback
Audience: at 17th minute, we have not started high level design yet. We can start from high-level design
Interviewer:
GeoHashing, using SQL to query based on radius
List of drivers. Prefix of region.
Quadtree is a bit more complex
Every large grid is broken into smaller grids
If we need more accuracy, we go into smaller grids.
If we don’t have enough drivers, we can backout to larger grids
We can use radius as arguments
Alex Xu: LBS, yelp.
Moving across borders is complex to handle
Interviewee:
DriverLongLat is fine
DirverInGrid has more latency, about 15-20 seconds
Microbatch aggregation
DriverInGrid: for sending requests to nearby drivers
DriverLongLat: to know exact location of matched driver
Database is for definition of grids
“DriverInGrid” only has driver -> gridId
Audience: how to select drivers?
Interviewer: based on rating, location, etc. we send to 3 drivers. In 10-15 seconds we can send it to another batch of 3 drivers. Tradeoff of how many drivers vs how long to wait.
“grokking”
Interviewer: no time for single-point of failure
Wanted: go through end-to-end high level. Drill down too quickly.
Can draw a high level diagram first
Interviewee: Can we assume that we can use multiple machines?
Interviewer: some other interviewers may stop you and direct you to the interesting points. My style is I hope to see how you control the time.
Audience: we usually will use mobile app. Can use GRPC
Interviewer: every 3 seconds we push location, so we can piggy bag the notification for request