Hash Tables

A hash table stores key-value pairs and uses a hash function to choose where a key should live.

The basic flow is:

  1. Hash the key.
  2. Map the hash to a bucket.
  3. Store or search within that bucket.

With a good hash function and a controlled load factor, lookup, insertion, and deletion are on average.

Collisions

A collision happens when two keys map to the same bucket. Common strategies:

  • Separate chaining: each bucket stores a small list, tree, or vector of entries.
  • Open addressing: entries live directly in the table, and collisions probe for another slot.

Collisions are why hash tables have a worst case of for lookup and insertion. Some implementations use balanced trees inside large buckets, which can make collision-heavy lookup instead.

Load Factor

The load factor is roughly:

When the load factor gets too high, the table is resized and entries are rehashed into a larger table. This resize costs , but if growth is geometric, insertion remains amortized.

When to Use

Use a hash table when exact-match lookup matters more than ordering.

Good fits:

  • caches
  • dictionaries
  • sets
  • deduplication
  • counting frequencies
  • indexes for exact keys

Poor fits:

  • sorted iteration
  • range queries
  • prefix search
  • nearest-neighbor search