Ads Targeting System

Topic: Ads Targeting System

Interviewer: LW(gulingwei852)

Interviewee: 刘君洁

Level: L4 (Experienced Individual Contributor)

Mock System Design Interview Summary

QRCode for activities and future coaching

Interview Overview

Date: 8/7/2022

Target level: L4

Duration: 45 minutes

Topic covered: Ads targeting system

Drawing tool used: ?

Requirements

Functional requirements

Targeting users

Need to tag the user; they get different behavior

Hundred of tags for users

Millions of users

APIs:

AddTag(tag_info, user_id)

tags_infos get(user_id)

User_id get(tag_info)

Attach tags to users

For specific user: return the list of tags, and property of tags

Q: Do we need anything else on user?

A: User ID & tags on them. No other

Q: what is on the tag?

A: different tags have different metadata information. No overlap

Non functional requirements

Upstream services have different level of requirements for latencies:

Some need milisecond fast response: e.g. subscription for paid service

Some does not need fast response.

Reads should all be very fast

Writes doesn’t need very fast response

QPS: for add and get

For reads: offpeak 5k QPS. peak hours 25k QPS

For writes: 2K QPS. 1 request may contain millions of users

Minimum consistency is enough

Add MQ and cache

When adding tag, we need to invalidate cache

Interviewer: MQ is still needed because there may be spikes

Interviewer: what’s in the cache?

A: user_id -> tag_ids. Tag_id -> user_ids

Q: If we need a very fast response, do we need to go through the queue?

A: We need to fanout; if we don’t go through the queue, there may be timeouts

Q: some requests may be smaller and need faster response. Do we have the same message queue for smaller and larger requests?

A: we can have 2 message queues.

We can use relational database

Q: there may be spikes in the writes

A: the message queue can be the buffer the spikes

Interviewer: examples:

A user subscribes to a service; a request may just contain one user with one tag

A campaign may tag millions of users with one tag

Interviewee: are there multiple tags for a campaign?

Interviewer: one tag

Interviewer: why not noSQL?

Interviewee: SQL, blobStore, noSQL

Interviewer: what is a blob store?

Interviewee: can store very large objects like gigabytes

Interviewer: what type of NoSQL? What schema to use?

Interviewee: key value store

Key: Tag_name/User_id

You can shard by the hash of the tag name or user ID.

Value: can be an array of IDs, or a string which points to a blob store

Q: why do you still use Blob store?

A: one tag may have millions of users

Tag_id, tag_info

We still need to retrieve users by tag. In this case we need to have a large value that contains

Interviewee: Key value store: I only know react database

Interviewer: How do we implement consistency

Interviewee: compared to write, the reads are more frequent. After the update is successful, we will invalidate the cache. This ensures the cache has the most up-to-date data

If we have a very large request, the system can be slow.

Interviewer and Audience Feedback

Audience:

User goes to website

Javascript -> give me an ad

It is the most commonly used API

Feedback

Interviewer:

Reasonable flow

Needed a lots of hints

I asked about QPS and traffic for databases. Interviewee was distracted.

Interviewee didn’t estimate the traffic

Interviewee

I didn’t understand the requirements. Spent a lot of time on requirement clarification

If there were a set of clear APIs, it could be smoother

After more than 10 minutes of

===

Interviewer:

Ideal design

Some requests need fast response - can directly to backend

Some requests doesn’t but they are very large.

NoSQL probably makes more sense than SQL

Wanted to discuss more

NoSQL database:

Tag to user

User to tag

From tag to user: the key is the tag, the value is a huge list of users?

Interviewee:

It feels SQL database can handle XXk QPS

Audience

NoSQL, SQL are both fine

SQL: key and user ID can be combined into a key

SQL can support Prefix_scan

Partition:

You can scan by prefix within a partition

If we partition by tag. Once we find the partition, then we will find the tag including all users

===

In real work getUserByTag is rarely used

But getTagsByUser is more often used

SQL: read-heavy

Some SQLs don’t use B+ tree but use SSTable

1 million records has very small storage size

SQL:

User_id, tag_id

Sharded SQL is fine.

Don’t use noSQL.

2 tables

===

Audience:

If the interviewer is sure SQL is sufficient, why not express it with confidence in the interviewer? What are the tradeoffs?

Audience:

You can define a schema. This can clarify.

Interviewee:

Many to many table

SQL and NoSQL are both fine. I was more confident for SQL.

Interviewer:

NoSQL: if there are bigger write traffic, then NoSQL can handle it more easily

I agree that many by many relation is better handled by SQL

===

Design tradeoff:

Tag meta

User meta

Tag_id, user_id

Difficult to maintain consistency across 2 tables

Cassandra?

Redis is fine as it’s single thread. LUA script

NoSQL: cannot handle this user scenario. Hard to handle many-to-many

Can use Redis.

NoSQL:

tag_id + user_id can become a key

Ads: tag may include “sports”, “basketball”

Key = sports, value = basketball

How do we find tag by users? And user by tag?

Then we need 2 tables.

NoSQL hard to handle consistency

We can handle this using the application.

It’s all code. Move it to application.

Database cannot ensure commit across partitions or machines.

We can solve with application

Before you write, we need to write the state in a different database.

Application can implement 2PC.

Go across microservice.

Let’s say there is a tag service, user service.

Audience:

Using SQL is the simplest. NoSQL is much harder

===

Audience:

NoSQL: all NoSQL are easy to partition

Document DB, graph

Audience:

LSM tree doesn’t support reading very well. It needs to scan through segments. Cassandra may have some examples

Optimize for write

Audience:

Read heavy btree + rdbms=postgre

Write heavy lsmtree + rdbms=mysql+myrocks

Read heavy btree + nosql = mongodb

Write heavy lsmtree + nosql = cassandra

Audience

Cassandra, dynamo DB, are both based on LSM. Reads are slow.

Audience:

To partition SQL, people need to manually partition.

NoSQL handles partitioning natively.