Hash Tables
A hash table stores key-value pairs and uses a hash function to choose where a key should live.
The basic flow is:
- Hash the key.
- Map the hash to a bucket.
- Store or search within that bucket.
With a good hash function and a controlled load factor, lookup, insertion, and deletion are on average.
Collisions
A collision happens when two keys map to the same bucket. Common strategies:
- Separate chaining: each bucket stores a small list, tree, or vector of entries.
- Open addressing: entries live directly in the table, and collisions probe for another slot.
Collisions are why hash tables have a worst case of for lookup and insertion. Some implementations use balanced trees inside large buckets, which can make collision-heavy lookup instead.
Load Factor
The load factor is roughly:
When the load factor gets too high, the table is resized and entries are rehashed into a larger table. This resize costs , but if growth is geometric, insertion remains amortized.
When to Use
Use a hash table when exact-match lookup matters more than ordering.
Good fits:
- caches
- dictionaries
- sets
- deduplication
- counting frequencies
- indexes for exact keys
Poor fits:
- sorted iteration
- range queries
- prefix search
- nearest-neighbor search