chapter 13 design a search autocomplete system

Table of Contents

chapter 13 design a search autocomplete system

Prev: chapter-12-design-a-chat-system Next: chapter-14-design-youtube

This chapter outlines autocomplete, which is used at Google and Amazon.

Make sure to ask questions about the problem and scope: Q: Is the matching only supported at the beginning of a search query or in the middle as well? A: Only at the beginning of a search query. Q: How many autocomplete suggestions should the system return? A: 5 Q: How does the system know which 5 suggestions to return? A: This is determined by popularity, decided by the historical query frequency. Q: Does the system support spell check? A: No, spell check or autocorrect is not supported. Q: Are search queries in English? A: Yes. If time allows at the end, we can discuss multi-language support. Q: Do we allow capitalization and special characters? A: No, we assume all search queries have lowercase alphabetic characters. Q: How many users use the product? A: 10 million DAU.

Back of the Envelope Estimation

High level design

Split the system into two parts:

Data gathering service

As we query a new keyword, we add it into the database, along with its frequency (imagine a hashmap of key -> hit count).

use std::collections::HashMap;

let mut map = HashMap::new();
let count = map.entry(&str).or_insert(0);
*count += 1;

We can query the DB with SQL:

SELECT * FROM queries
where query like `prefix%`
ORDER BY frequency DESC
LIMIT 5;

This works fine for small datasets but hits a bottleneck in larger datasets.

Prev: chapter-12-design-a-chat-system Next: chapter-14-design-youtube