Hash Tables

A hash table stores key-value pairs and uses a hash function to choose where a key should live.

The basic flow is:

Hash the key.
Map the hash to a bucket.
Store or search within that bucket.

With a good hash function and a controlled load factor, lookup, insertion, and deletion are $O (1)$ on average.

Collisions

A collision happens when two keys map to the same bucket. Common strategies:

Separate chaining: each bucket stores a small list, tree, or vector of entries.
Open addressing: entries live directly in the table, and collisions probe for another slot.

Collisions are why hash tables have a worst case of $O (n)$ for lookup and insertion. Some implementations use balanced trees inside large buckets, which can make collision-heavy lookup $O (lo g n)$ instead.

Load Factor

The load factor is roughly:

\frac{number of entries}{number of buckets}

When the load factor gets too high, the table is resized and entries are rehashed into a larger table. This resize costs $O (n)$ , but if growth is geometric, insertion remains $O (1)$ amortized.

When to Use

Use a hash table when exact-match lookup matters more than ordering.

Good fits:

caches
dictionaries
sets
deduplication
counting frequencies
indexes for exact keys

Poor fits:

sorted iteration
range queries
prefix search
nearest-neighbor search

Takashi's Notes

Explorer

Hash Tables

Hash Tables

Collisions

Load Factor

When to Use

Graph View

Table of Contents