Inventory management system
Topic: Inventory management system
Interviewer: desmond
Interviewee: guanyao
Level: L5 (Senior)
Additional Resources:
System Design Presentation - Azure Cosmos Database
===
Fully managed
High resiliency: 99.999%
Low latency: <10 ms read/write latency at P99
Elastic scale:
Throughput: 100 rps to trillion+ rps
Storage: 50 GB to PB+
Geo: a single region to all Azure regions
Tunable consistency
Consistency
Eventual, consistent prefix, session, bounded staleness, strong
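To make the difference between two of these levels concrete, here is a minimal sketch of session (read-your-writes) consistency versus an eventual read. The `Replica` class, the `lsn` counter, and `session_read` are hypothetical names for illustration and are not Cosmos DB APIs; the sketch only assumes the client remembers the sequence number of its last write and refuses to read from a replica that is behind it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica: a key-value map plus the last applied sequence number."""
    lsn: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, lsn, key, value):
        self.data[key] = value
        self.lsn = lsn


def session_read(replicas, key, session_lsn):
    """Read from the first replica that has caught up to the client's session token."""
    for replica in replicas:
        if replica.lsn >= session_lsn:
            return replica.data.get(key)
    raise RuntimeError("no replica has caught up to the session token")


if __name__ == "__main__":
    leader, follower = Replica(), Replica()
    leader.apply(1, "sku-1", 10)   # the write reached the leader only
    session_lsn = 1                # the client remembers its last write

    # Session read skips the lagging follower and sees the client's own write.
    print(session_read([follower, leader], "sku-1", session_lsn))
    # A read with no session token (eventual) may hit the stale follower.
    print(session_read([follower, leader], "sku-1", 0))
```

The trade-off illustrated here is that a stricter level narrows which replicas can serve a read, which is the usual reason stronger consistency costs latency or availability.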
Level
Azure regions, data centers, stamps, fault domain, clusters, machines
Machine, replica, cosmos database engine
Direct connection vs gateway mode
Service Fabric: similar to Kubernetes
MasterService: metadata (similar to etcd)
ServiceService: real data
[Is there a benefit to gateway mode?]
Horizontal partitioning
50GB max per partition
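A minimal sketch of horizontal partitioning by hashing a partition key, with the 50 GB cap from the note above used as a split trigger. The function names, the choice of SHA-256, and the simple modulo mapping are assumptions for illustration; real systems typically assign hash ranges to partitions so that a split does not remap every key.

```python
import hashlib

MAX_PARTITION_GB = 50  # per-partition cap taken from the notes above


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a partition key to a partition index using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def needs_split(partition_size_gb: float) -> bool:
    """A partition at or above the cap has to be split into smaller ones."""
    return partition_size_gb >= MAX_PARTITION_GB


if __name__ == "__main__":
    for key in ("warehouse-1", "warehouse-2", "warehouse-3"):
        print(key, "->", partition_for(key, num_partitions=4))
    print("split at 50 GB?", needs_split(50.0))
```

All items sharing a partition key land on the same partition, so a key with too much data or traffic becomes a hot spot regardless of how many partitions exist.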
How to improve resiliency
Partition set.
Each replica set contains 4 replicas with 1 leader
The forwarder propagates changes to the leader
[Can any follower be a forwarder? At a given time, is there only one forwarder?]
Partition under a container
Global distribution
Peer-to-peer links can be established between any two nodes in the system
Database engine
Unique data structures for indexing
B+ tree: battle-tested choice for low write throughput
Problem: when many indices must be supported, writes are slow.
Bw-tree: resembles an LSM tree in some ways (writes are added as deltas rather than applied in place) but is a different structure
Update in lock-free manner with compare and swap
How to handle write conflicts?
If we support strong consistency, we won’t have write conflicts
Classic B-tree requires locking to protect against data corruption. However, this slows down performance.
To improve write throughput, we write a delta record on top of the base page
Conflict resolution: we will detect conflict during “compare and swap”
Delta records are written on SSD
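The delta-plus-compare-and-swap idea above can be sketched as follows. This is a simplified in-memory model and not the actual Cosmos DB engine: `MappingTable`, `Delta`, `BasePage`, `upsert`, and `lookup` are hypothetical names, and because Python has no hardware CAS primitive, a small lock inside `compare_and_swap` stands in for the atomic instruction.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BasePage:
    records: tuple  # immutable (key, value) pairs


@dataclass(frozen=True)
class Delta:
    key: str
    value: int
    next: Optional[object]  # older delta or the base page


class MappingTable:
    """Maps a page id to the head of its delta chain."""

    def __init__(self):
        self._slots = {}
        self._lock = threading.Lock()  # assumption: simulates an atomic CAS

    def load(self, page_id):
        return self._slots.get(page_id)

    def compare_and_swap(self, page_id, expected, new):
        with self._lock:
            if self._slots.get(page_id) is expected:
                self._slots[page_id] = new
                return True
            return False


def upsert(table, page_id, key, value):
    """Prepend a delta; retry if another writer changed the head first."""
    while True:
        head = table.load(page_id)
        delta = Delta(key, value, head)
        if table.compare_and_swap(page_id, head, delta):
            return
        # CAS failed: a conflicting write was detected, so re-read and retry.


def lookup(table, page_id, key):
    """Walk the delta chain newest-first, then fall back to the base page."""
    node = table.load(page_id)
    while isinstance(node, Delta):
        if node.key == key:
            return node.value
        node = node.next
    if node is not None:
        return dict(node.records).get(key)
    return None


if __name__ == "__main__":
    table = MappingTable()
    table.compare_and_swap(0, None, BasePage((("sku-1", 5),)))
    upsert(table, 0, "sku-1", 7)
    print(lookup(table, 0, "sku-1"))
```

The base page is never modified in place; writers only race on a single pointer swap, which is where a conflict shows up as a failed compare-and-swap.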
Efficient index updates with Bw-tree
[Are the indices separate Bw-trees from the main tree?]
Conflict if 2 writes to the
“That's definitely the case. You can't cluster the data on multiple indexes anyway.”
Comparison of Cosmos DB with DynamoDB
DynamoDB offers an optional sort key. Data stored in a partition is sorted by it.
DynamoDB supports global secondary indexes (the count per table may be limited, possibly to 6). Cosmos DB supports indexing on all JSON fields.
Cosmos DB has more types of queries (different database types)
Supported use cases are very similar
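To illustrate the sort key point from the comparison above, here is a minimal in-memory sketch of items kept sorted by sort key within each partition, which is what makes range queries inside one partition cheap. `SortedPartitionStore` and its methods are hypothetical and are not the DynamoDB API.

```python
import bisect
from collections import defaultdict


class SortedPartitionStore:
    """Hypothetical store: each partition keeps its items ordered by sort key."""

    def __init__(self):
        self._keys = defaultdict(list)   # partition key -> sorted sort keys
        self._items = defaultdict(list)  # partition key -> items, same order

    def put(self, partition_key, sort_key, item):
        keys, items = self._keys[partition_key], self._items[partition_key]
        i = bisect.bisect_left(keys, sort_key)
        if i < len(keys) and keys[i] == sort_key:
            items[i] = item              # overwrite the existing item
        else:
            keys.insert(i, sort_key)
            items.insert(i, item)

    def query_range(self, partition_key, low, high):
        """Return items with low <= sort key <= high, in sort-key order."""
        keys = self._keys.get(partition_key, [])
        items = self._items.get(partition_key, [])
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return items[start:end]


if __name__ == "__main__":
    store = SortedPartitionStore()
    store.put("warehouse-1", "sku-030", {"qty": 3})
    store.put("warehouse-1", "sku-010", {"qty": 1})
    store.put("warehouse-1", "sku-020", {"qty": 2})
    print(store.query_range("warehouse-1", "sku-010", "sku-020"))
```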
Pros and cons of NoSQL database:
Flexible schema changes
Horizontal scaling. Can scale out writes
Performance can be tuned based on specific workloads
CAP: often favors availability over consistency
Fewer query capabilities (e.g. no join support)
Pros and cons of SQL database:
ACID
Structured schema
Query language (can support join)
Performance bottleneck for large data
Use cases for SQL/NoSQL
SQL:
e-commerce, financial systems
Content management system
NoSQL
Social media.
Graph
Chat application
Big data
Cassandra for Netflix big data
Internet of things
High write throughput. E.g. Philips Hue uses DynamoDB
Hybrid case
Gaming industry
Redis for leaderboard
E-commerce with personal recommendations
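For the Redis leaderboard item in the hybrid list above, the idea is a member-to-score map that can return the top N quickly. The sketch below is a pure-Python stand-in so it runs without a server; the `Leaderboard` class is hypothetical and only approximates what Redis sorted-set commands such as ZINCRBY and ZREVRANGE provide.

```python
import heapq


class Leaderboard:
    """In-memory stand-in for a sorted-set leaderboard (player -> score)."""

    def __init__(self):
        self._scores = {}

    def add_score(self, player, points):
        """Increment a player's score and return the new total."""
        self._scores[player] = self._scores.get(player, 0) + points
        return self._scores[player]

    def top(self, n):
        """Return the n highest (player, score) pairs, best first."""
        return heapq.nlargest(n, self._scores.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    board = Leaderboard()
    board.add_score("alice", 50)
    board.add_score("bob", 70)
    board.add_score("alice", 30)
    print(board.top(2))
```

In a hybrid setup like this, the durable record of matches would typically live in the primary database, with the leaderboard rebuilt from it if the cache is lost.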
===