eBay Auction

System Design / E-commerce & Payments

Topic: eBay Auction

Interviewer: Tom (慢慢失败)

Interviewee: shihao (沉)

Level: L5 (Senior)



Mock System Design Interview Summary

Interview Overview

Date: 1/9/2022

Target level: L5

Duration: 1 hour

Topic covered: eBay Auction

Drawing tool used: Draw.io/Diagram

Requirements

Functional requirements

Can create products

Can have auctions / bid

Keywords and products

No cheating

Seller report lower PR

User and order service available

Interviewer:

Focus on auction

Interviewee:

Changes to an auction's begin and end times should be prevented within 1 hour of the auction's end

Interviewer:

Assume auction is immutable

Non-functional requirements

DAU: 100M; 10% -> 10 million/day

Read/write ratio: 88:2:10 (read product / create product / make bid)

QPS: 10k/s -> 30k/s

Machines: 1 container handles ~1k QPS, so 30 replicas

(QPS based on experience)

Data storage: 2 million products/day × 10 KB each -> 20 GB/day
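
The estimates above can be checked with a few lines of arithmetic. This is only a sketch: it assumes "10k/s -> 30k/s" means average to peak QPS, and it uses decimal units (1 KB = 1000 bytes). Note that 2 million products at 10 KB each works out to about 20 GB/day. The function names are made up for illustration.

```python
import math

# Numbers taken from the notes; the "peak" reading of 30k/s is an assumption.
PEAK_QPS = 30_000
QPS_PER_CONTAINER = 1_000          # "1 container ~ 1k QPS"
PRODUCTS_PER_DAY = 2_000_000
BYTES_PER_PRODUCT = 10 * 1000      # 10 KB, decimal units


def replicas_needed(peak_qps, qps_per_container):
    """Smallest number of containers that can absorb the peak load."""
    return math.ceil(peak_qps / qps_per_container)


def daily_storage_gb(items_per_day, bytes_per_item):
    """New data written per day, in gigabytes (1 GB = 1e9 bytes)."""
    return items_per_day * bytes_per_item / 1e9


replicas = replicas_needed(PEAK_QPS, QPS_PER_CONTAINER)
storage_gb = daily_storage_gb(PRODUCTS_PER_DAY, BYTES_PER_PRODUCT)
print(f"replicas: {replicas}, storage: {storage_gb} GB/day")
```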

System Design

System design diagram

External APIs

/products ->

GET /products/

DELETE /products/

POST /products

GET /products?keyword=<key_word>&page_size=…

/auctions

POST /auctions?product_id=<product_id>
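
To make the API surface concrete, here is a minimal in-memory sketch of the endpoints listed above. It is not the design from the interview: the class `AuctionApi`, its method names, the `starting_price` parameter, and the idea that `GET /products/` and `DELETE /products/` take a product id are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class AuctionApi:
    """In-memory stand-in for the product and auction endpoints (hypothetical)."""
    products: dict = field(default_factory=dict)
    auctions: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def post_product(self, title):
        # POST /products
        product_id = next(self._ids)
        self.products[product_id] = {"id": product_id, "title": title}
        return self.products[product_id]

    def get_product(self, product_id):
        # GET /products/ (assumed to be addressed by product id)
        return self.products.get(product_id)

    def delete_product(self, product_id):
        # DELETE /products/ (assumed to be addressed by product id)
        return self.products.pop(product_id, None) is not None

    def search_products(self, keyword, page_size=10):
        # GET /products?keyword=<key_word>&page_size=...
        needle = keyword.lower()
        hits = [p for p in self.products.values() if needle in p["title"].lower()]
        return hits[:page_size]

    def post_auction(self, product_id, starting_price):
        # POST /auctions?product_id=<product_id>; starting_price is hypothetical
        if product_id not in self.products:
            raise KeyError(f"unknown product {product_id}")
        auction_id = next(self._ids)
        self.auctions[auction_id] = {
            "id": auction_id,
            "product_id": product_id,
            "starting_price": starting_price,
        }
        return self.auctions[auction_id]


api = AuctionApi()
camera = api.post_product("Vintage camera")
api.post_product("Camera lens")
bicycle = api.post_product("Bicycle")
hits = api.search_products("camera", page_size=1)
auction = api.post_auction(camera["id"], starting_price=10)
```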

System Design Diagram

3 layers:

Stateless service layer. Service mesh architecture (Istio and Envoy):

Metrics, logging, tracing, rate limiting

Caching layer

Persistence layer

Database schema

Auction

System design diagram

Interviewee: will design search next.

Interviewer: please move on to how a user makes a bid

GET /auctions?product_id=<product_id>

Schema

Database choice

Consistency

HBase

How to handle the auction in more detail?

Key is to handle high throughput

“Local cache”: holds the cached max bid. Can be used to reject bids that are not higher

Redis cluster: global max

Interviewer: who updates the Redis cluster?

Redis-cluster:

What happens if lots of requests arrive with the same $10 bid?

A reconciliation service runs at the end and takes the earliest $10 bid
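
The tie-break rule above (highest amount wins, and among equal amounts the earliest bid wins) can be sketched as a single ordering. The `Bid` fields and the `pick_winner` name are assumptions for illustration; the notes do not specify a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    bidder: str
    amount_cents: int      # integer cents avoid floating-point comparisons
    received_at_ms: int    # server-side receive time, assumed to be recorded


def pick_winner(bids):
    """Return the highest bid; ties on amount go to the earliest bid."""
    if not bids:
        return None
    # Sort key: larger amount first, then smaller timestamp first.
    return min(bids, key=lambda b: (-b.amount_cents, b.received_at_ms))


bids = [
    Bid("bob", 1000, received_at_ms=205),
    Bid("alice", 1000, received_at_ms=200),
    Bid("carol", 900, received_at_ms=100),
    Bid("dave", 1000, received_at_ms=210),
]
winner = pick_winner(bids)
print(winner.bidder)
```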

Mutable? Immutable? How to present to the user?

Discussions during the Interview

1. User makes a bid

2. The bid is compared against the local cache to ensure it’s higher than the cached max

3. The bid is inserted into HBase

4. An async “auction reconciliation” process updates the global max in Redis
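
The four steps above can be sketched with in-memory stand-ins: a shared list plays the role of HBase and a shared dict plays the role of the Redis global max. This is a simplified illustration under those assumptions, not the interviewee's implementation; `AuctionNode`, `reconcile`, and `refresh` are hypothetical names.

```python
class AuctionNode:
    """One stateless service replica with its own local max cache."""

    def __init__(self, bid_log, global_max):
        self.bid_log = bid_log          # shared append-only list (stand-in for HBase)
        self.global_max = global_max    # shared dict (stand-in for Redis)
        self.local_max = {}             # per-node cache, may be stale

    def place_bid(self, auction_id, bidder, amount):
        # Step 2: cheap local rejection of bids that cannot win.
        if amount <= self.local_max.get(auction_id, 0):
            return False
        # Step 3: append the bid to the durable store.
        self.bid_log.append((auction_id, bidder, amount))
        self.local_max[auction_id] = amount
        return True

    def refresh(self, auction_id):
        # Pull the global max into the local cache.
        self.local_max[auction_id] = max(
            self.local_max.get(auction_id, 0),
            self.global_max.get(auction_id, 0),
        )


def reconcile(bid_log, global_max):
    """Step 4: async job that recomputes the global max from the bid log."""
    for auction_id, _bidder, amount in bid_log:
        if amount > global_max.get(auction_id, 0):
            global_max[auction_id] = amount


bid_log, global_max = [], {}
node_a = AuctionNode(bid_log, global_max)
node_b = AuctionNode(bid_log, global_max)

first = node_a.place_bid("a1", "alice", 10)    # accepted
second = node_b.place_bid("a1", "bob", 10)     # accepted: node_b's cache is stale
third = node_a.place_bid("a1", "carol", 5)     # rejected locally
reconcile(bid_log, global_max)
node_b.refresh("a1")
fourth = node_b.place_bid("a1", "dave", 10)    # rejected after refresh
```

The stale-cache case (two nodes both accepting $10) is why a later reconciliation step is still needed to decide the winner.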

Flow for auction:

Atomic operation:

Compare and swap
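
As an illustration of compare-and-swap, the sketch below raises a shared max only if nobody else changed it in between, and retries otherwise. The notes do not say where the CAS would live, so this uses a lock-backed in-process cell as a stand-in for whatever atomic primitive the store provides; `AtomicMax` and `raise_max` are hypothetical names.

```python
import threading


class AtomicMax:
    """In-process cell offering compare-and-swap (simulated with a lock)."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        # Write `new` only if the value is still `expected`.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def raise_max(cell, bid):
    """Try to make `bid` the new max; retry if another writer got in first."""
    while True:
        current = cell.get()
        if bid <= current:
            return False            # bid is not high enough, give up
        if cell.compare_and_swap(current, bid):
            return True             # our write won
        # Otherwise the value changed under us: re-read and retry.


cell = AtomicMax()
threads = [threading.Thread(target=raise_max, args=(cell, amount))
           for amount in range(1, 101)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cell.get())
```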

Interviewer and Audience Feedback after the Interview

Interviewer:

Strong skills, strong knowledge

Showing off technical skills

We may not need to guarantee strong consistency

The solution is not very convincing

Improve the diagrams: present the workflow

Hard to see

Audience

Auction - an eBay product

Some auctions are different

Sealed-bid (closed envelope) pricing

Bidding vs. auction

Audience

Requirements gathering

Got lost in the functional requirements

Design eBay platform

Suggestion: ask the interviewer which requirement is most important

Requirements may drift off track

Looks like the interviewee knows the question ahead of time

The interviewee didn’t clarify which feature is the main feature

Did not highlight the key points

For example, how to handle multiple bids at the same time

Ideally:

Start with user,

search engine, create product,

Interviewer

Functional requirement is fine

Don’t understand why we went so deep into search

Audience:

Basic is better

Implement the basic

Try to find key flow

“If we have more time, we can discuss other flows”

Can ask about the requirements in more detail

User can create a product with a minimum bid price

What happens if 2 users have the same price

When searching: should I specify a time range?

QPS:

Where we want to use a message queue, where we want to use cache. Tradeoff

Detailed design: which one to use among Cassandra, DynamoDB, and an in-memory cache? Cassandra

Interviewee

Description, dynamic schema: NoSQL (instead of SQL)

Update rate is rather low: Cassandra or DynamoDB

Single-row insert

LSM tree

HBase

Audience: sql. Rock-database

Hard skills

Redis, NoSQL database

Redis: sorted set

Cassandra: price can be a key

Consistency: not in requirement gathering

Is there a consistency issue?

Different people/geography may see different highest price

Multiple db globally

Vs single db?

US vs Australia

Is it global service, or regional service?

Audience

Redis: sorted set -> can it solve consistency?

Audience

No. It depends on where you put Redis. Usually there are multiple instances of Redis.

Redis cluster is usually in one availability zone

Redis cluster: globally. New York, Australia, Paris. There is a long delay.

Strong consistency globally: Redis may not be able to achieve it.

Global consistency is difficult

May clarify in requirement whether consistency is required

Which are the key test points?

How to ensure buyers see the latest price? Pull vs. push

Concurrent bids: how to handle them?

Bid legitimacy

How to aggregate the highest bid

High concurrency

===

There is no workable solution?

Share a workable solution.

===

Interviewee:

I initially thought the key test point was to handle high throughput

===

Alternative design:

Put a message queue in front of HBase to buffer the incoming bids

Check the current highest price.

Can use Redis to check against the highest bid. Bids that are not higher don’t enter the message queue

Handle redis crash through backup - may

Then send to message queue

Append only - doesn’t require lock

Leaderboard is only in Cassandra

Why do we need to guarantee FIFO?

Two people are likely to bid the same price; FIFO decides who was first

Message queue is fine. Redis does not need to be most up to date.

#Bidding product -> global max -> message queue -> streaming process ->

bid -> MQ -> Stream process -> MQ -> Worker -> DB
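
The pipeline above can be sketched with two in-process queues. This is a single-threaded toy, assuming the stream stage forwards only bids that raise the running max (the notes do not spell out what it forwards); `stream_process` and `worker` are hypothetical names, and a list stands in for the DB. Because the queue is FIFO and the comparison is strict, the earlier of two equal bids is the one that survives.

```python
from queue import Queue


def stream_process(inbox, outbox):
    """Track a running max per auction; forward only bids that raise it."""
    running_max = {}
    while not inbox.empty():
        auction_id, bidder, amount = inbox.get()
        # Strict '>' means a later bid with an equal amount is dropped.
        if amount > running_max.get(auction_id, 0):
            running_max[auction_id] = amount
            outbox.put((auction_id, bidder, amount))
    return running_max


def worker(inbox, db):
    """Drain the second queue and persist each bid (list as a stand-in DB)."""
    while not inbox.empty():
        db.append(inbox.get())


bids_mq, winners_mq, db = Queue(), Queue(), []
for bid in [("a1", "alice", 10), ("a1", "bob", 10),
            ("a1", "carol", 12), ("a1", "dave", 11)]:
    bids_mq.put(bid)                       # bid -> MQ

maxes = stream_process(bids_mq, winners_mq)  # MQ -> stream process -> MQ
worker(winners_mq, db)                       # MQ -> worker -> DB
print(db)
```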

===

Workable solution:

Message queue

To update the cached value, just delete (invalidate) it.

Every 2 seconds, update the max in Redis

HBase: source of truth

Eventual consistency
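
A minimal sketch of the cache behavior in this workable solution: readers may see a max that is up to 2 seconds old, and deleting the cached value forces a reload from the source of truth. As a simplification, this version refreshes lazily on read rather than with a background job, and it takes the current time as an argument so the behavior is easy to follow. `PeriodicMaxCache` is a hypothetical name, and a plain list stands in for HBase.

```python
class PeriodicMaxCache:
    """Cached max that is reloaded from the source of truth when stale."""

    def __init__(self, load_max, interval_s=2.0):
        self._load_max = load_max        # callable reading the source of truth
        self._interval_s = interval_s
        self._value = None
        self._loaded_at = None           # None means "nothing cached"

    def get(self, now_s):
        stale = (self._loaded_at is None
                 or now_s - self._loaded_at >= self._interval_s)
        if stale:
            self._value = self._load_max()
            self._loaded_at = now_s
        return self._value

    def invalidate(self):
        # "Just delete the cache value": the next read reloads it.
        self._loaded_at = None


stored_bids = [10]                                   # stand-in for HBase
cache = PeriodicMaxCache(lambda: max(stored_bids, default=0))

at_0 = cache.get(now_s=0.0)      # loads 10
stored_bids.append(15)
at_1 = cache.get(now_s=1.0)      # still 10: stale read is allowed
at_2 = cache.get(now_s=2.0)      # interval elapsed, reloads 15
stored_bids.append(20)
cache.invalidate()
at_2_1 = cache.get(now_s=2.1)    # reload forced by invalidation
```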

Audience

Key improvements

  1. WebSocket to push the latest price

  2. Flink, message queue, get global max, send to message queue

  3. Worker

P50 1 second: bridge

Stream processing -> distributed; can spin up more instances