Index

Hash Indexes

Hash Indexes are generally good in key-value stores, where you want to check if a value exists in a set. For almost any other case, you’ll probably want to use a B-Tree.

Hash Indexes are an index that hashes some key which produces a 32-bit integer. This integer is then placed inside a bucket, where it resides until it is queried for.

At some point, a bucket may be split into two, as it has too many entries. Additional storage is requested by the index, and a bucket is split into two.

Performance vs B-Tree Index

Size of
Indexes

The B-Tree index is smooth, as it increases more incrementally in size.

The Hash Index is like a staircase, it allocates more in size and then stays at that size until it needs to allocate more.

The size of the indexes before reindexing:

db= \di+ shorturl*
          NameTypeTableSize
─────────────────────────┼───────┼──────────┼────────
shorturl_key_btree_index │ index │ shorturl │ 73 MB
shorturl_key_hash_index  │ index │ shorturl │ 37 MB
shorturl_url_btree_index │ index │ shorturl │ 53 MB
shorturl_url_hash_index  │ index │ shorturl │ 36 MB

Size after reindexing

db= REINDEX TABLE shorturl;
REINDEX
 
db= \di+ shorturl*
          NameTypeTableSize
─────────────────────────┼───────┼──────────┼───────
shorturl_key_btree_index │ index │ shorturl │ 56 MB
shorturl_key_hash_index  │ index │ shorturl │ 32 MB
shorturl_url_btree_index │ index │ shorturl │ 41 MB
shorturl_url_hash_index  │ index │ shorturl │ 32 MB

Inserting rows to a table indexed by a hash index is about 10% faster on postgres.

Reads were about 500% faster, though.

That being said, here are some limitations:

Hash indexes cannot be used for unique constraints:

sdb= CREATE UNIQUE INDEX shorturl_unique_key ON shorturl USING hash(key);
ERROR:  access method "hash" does not support unique indexes

Or multiple columns:

db= CREATE INDEX shorturl_key_and_url ON shorturl USING hash(key, url);
ERROR:  access method "hash" does not support multicolumn indexes

Nor sorted indexes:

db= CREATE INDEX shorturl_sorted_key ON shorturl USING hash(key desc);
ERROR:  access method "hash" does not support ASC/DESC options

or cluster a table:

db= CLUSTER shorturl USING shorturl_url_hash_index;
ERROR:  cannot cluster on index "shorturl_url_hash_index" because access method does not support clustering

or range lookups:

db= EXPLAIN (COSTS OFF) SELECT * FROM shorturl WHERE key BETWEEN '1' AND '2';
                                   QUERY PLAN
────────────────────────────────────────────────────────────────────────────────
Gather
  -> Seq Scan on shorturl
        Filter: ((key >= '1'::text) AND (key <= '2'::text))

or ORDER BY queries:

db= EXPLAIN (COSTS OFF) SELECT * FROM shorturl ORDER BY key LIMIT 10;
                   QUERY PLAN
─────────────────────────────────────────────────
 Limit
   ->  Gather Merge
         Workers Planned: 2
         ->  Sort
               Sort Key: key
               ->  Parallel Seq Scan on shorturl
(6 rows)