Inventory Management

System DesignE-commerce & Payments

Topic: Inventory Management

Interviewer: desmond

Interviewee: guanyao

Level: L5 (Senior)


Mock System Design Interview Summary

Interview Overview

Date: 3/27

Target level: L5

Duration: 45 minutes

Topic covered: Inventory management like instacart

Drawing tool used: excalidraw

Requirements

Functional requirements

Inventory management like instacart

Make sure we don’t overshop

Shipment coming in

Browse, add to cart, at the same time update availability, no over selling

Out of scope

No payment, shipment

Inventory service, we already have item in

Neighboring team - catalog team. Already handle catalog

Manage inventory and basket

  1. Incoming shipment to add to inventory

  2. User can browse, search. Search and browse for bare minimum

  3. Add to cart, keep something (lock/reserve)

(Really want to focus on inventory management)

Non functional requirements

Availability

High throughput

Service reliability

System Design

10:00

External APIs

API:

add_product(product, number)

Add_to_cart(user, product, number)

checkout(user, product, number)

11:00

Database design

User_purchse_table

User_id, product_id, number, status

Status: in_cart, bought

Product_table

Product_id, sku_number, total_number

Single service logic:

Add_to_cart: do db txn with 2 modification, and commit

For transaction: mysql. May have problem, the size may be too big

15:00

Are all APIs on the same server? Who is calling those APIs?

A truck coming in with a box of banana. There is a person with scanner.

What does “add product” do? Yes. we can call add product for the above situation

If we have never sold iPad, now we start to sell iPad.

Maybe register product. But not sure distinction

User_purchase_table

product_table

Sharding with MySQL can be hard

Instead of one transaction to deal with 2 tables, we can do something different.

Treat as idemponent key: unique id: lock N products when number of products equal>=N. CAS

Server logic:

Modify product_table if products number >= N

If server logic succeed:

Modify user_purchase_table, then return success

Else:

Return true

Q: When you shard it. If it’s same db is fine to use transaction support.

Now we are using distributed transaction.

A: Break transaction into 2 phases.

Various crash scenarios, because data relevant to the transaction are separate in 2 databases

We have idempotent key for user behavior. If first step succeed, second step may fail

Idempotent key has table: idempotent_key, status

If we need to give up after we reserve banana. How do we rollback?

Have global key for idempotent key

Chron job to periodically check: consistency between idempotent_table and product_table

If there are many stores, and they are busy. Talk me through sharding

Most important is the product_table

How do you shard?

We can shard based on category and product name

Downside: read-write heavy. Not good for hot products

We can cache the product

Interviewer and Audience Feedback

Audience feedback

Interviewer:

Inventory management for grocery

Instacart: cannot sell more than the inventory.

Consistent.

Senior:

Leading the discussion

Which points are most important

Important points:

Managing inventory

Abandoned cart, expiry. A hidden requirement

Transaction - consistency

Possible:

Distributed transaction

2PC, saga

Idempotent

Can proactively provide the possible solutions

Roll-forward, rollback.

Touched on the points

Most of these are based on data. Today, we are missing data estimation

Everybody has reservation. How many update. What’s the granularity update? Can we use in-memory database?

Which one is better?

Relational database. Every store, every item, may not have a lot of entries.

We can shard on store.

Using business requirement to optimize

Shopper fulfillment and inventory

Shopper: more API.

Should use picture

Shipment acceptance:

spike

There are multiple solutions

Some may be very specific. But big companies just look at big picture but not specific skills

Requirement gathering. We can simplify the solution, senior should drive simplification

Finding the key points of discussion. QPS. storage. Hit rate, locality. Prove your own design.

Present the big picture. Search can be separated from this system.

Weak pass for intermediate level

===

Interviewee

Q: Difference between adding a new item, vs adding a new product.

A: Add item to category is separate from add item to inventory.

Q: sharding. We can use SQL. For distributed transaction, how do we do?

A: NoSQL.

We can do distributed transaction. E.g.

Orchestrator for distributed transaction

Complexity in team.

2-phase commit is relatively slow

QPS is not going to be very high

SAGA: transaction log, decide where we are. Roll-forward or roll-backward.

Need to write transaction log.

Inventory table - need versioning.

Need orchestrator, or transaction log.

Justify relation. But we should do some calculation

Otherise, we can just subtract number from everyone’s cart

扬长避短

Distributed transaction, 2PC, SAGA

There may be service frontend for two services.

Product requirement

Why do we need to lock

Sometimes we may use substitute

Black store: we have more control. Substitution - product impact

This is more related to black store / warehouse

Add cart, reserve.

Interviewee:

A: How to do cache?

B: traffic pattern, read/write ratio, locality, hot/cold data

Justification.

Write-aside, write through

Which system writes to cache.

Invalidate. Can provide more details

Interviewee:

Biggest impression: distributed transaction is hard. We can take out some framework

We can provide some transaction methods

E.g. dynamoDB can provide distributed transactions

Most popular:

Saga

Assume distributed transaction can solve this.

If we don’t use RDBMS, what can we discuss?

We can discuss expiry. In-memory. We don’t have to compute the total number.

You can lead the discussion as a senior engineer.

2PC, spanner.

Before diving to details,draw the big picture

Need to draw UI

Should draw the UI

Bookkeeping are important

Dive too deep into transactions

Interviewer: thinks mysql can solve the problem

Sharding, replication, index.

Q: distributed transaction

2PC vs saga

It’s a design pattern. Roll forward, rollback.

It does not guarantee isolation. Temporarily the data is not right.

When we buy things, do we save this to backend cache?

Load balance: backend server. How do we memorize these?

We ideally want to set up stateless server

Redis, elastic cache.

Redis: single instance is the strongest. If you need durability, you can use replicated/distributed redis.

Single master, single instance, quorum, paxos, raft. Leader election mechanism.

Is it cache or storage? Redis is quite durable.

If we use RDBMS, then things are simple.

If there is no locality and miss rate is high, then cache is not useful.

Sticky routing: fairly rare.

Complexity, hardware cost, team cost.

We can use long polling. We don’t have to use realtime feedback. Over-engineering

Cache or database.

High speed cache.

All carts have expiration.

In-memory database + replication (providing durability)

Every shop does not have a lot of people, so we can calculate in realtime.

Cache -> roadmap with transaction

=

Put all order in cart

userID + cart

Cart may have item A

Cart -> split into user, item, store, cart, reservation number. Multiple entries.

Reservation count storage

We don’t need to worry about expiration.

TTL. How to add the reserved number?

If we see TTL expired, we can remove the item, and not add the count

Alternatively, we can use a sleeper to poll. Add the things back.

Reservation table, primary key?

Store, SKU, cart ID, product ID, reservation count

Primary key is concatenation of the 4

Cart: should add based on store, SKU. Range scan.

Black warehouse. Popular items. Say apple is most popular.

Can we set store, sku as primary key, the rest are repeated.

It makes things more complex: Normalization, denomalization. Locking, optimistic locking.

DynamoDB is expensive: you may as well use RDBMS. RDBMS can be tuned as well.

==

If we store and shard by store, then retrieval is hard

You need to go through the shards.

We can materialize view. Enrichment. Precalculate the additional information.

Can we use cart as primary key?

Cart + map

Cart + reservation number

Secondary index: equivalent of more tables

Global secondary index: internally there are 2 tables. It’s eventual consistency. It uses distributed transaction.

如何准备?

DDIA - backburner

Alex Xu - 2 books

Tech Dummies - it gives a brief introduction

Come to mock

以面代考

Pattern of distributed system

DDIA

New book - stock exchange, payment, google map

If we hit a product that you have not used.

If you don’t know the design at all:

Start from user, end with user

Big picture design

Find the things that I am familiar with, and discuss about it. E.g. how to split the microservice

15 companies

Mock, listen, read, real interview

Grokking: relatively shallower

Alex Xu: deeper, but still some area not deep enough

Mock with each other: can get real architecture

==

Excalidraw, googledocs

Multiple cameras

Open tethering

Excel

Work from alpha

Open your IDE

Debugging, coding

Coding more

After SDE II, SDE III

Example: organization level. Executive. Data point

Cross functional. L6

Top music chat

Cross functional

Situation of conflict, example requirements of the person from the other side, how we pull the other side back.