Trees Indexes I
Prev: hash-tables Next: trees-indexes-ii
Indexes
A table index is a replica of a subset of a table’s columns organized in a way that allows the DBMS to find tuples more quickly than performing a sequential scan.
If there’s an index on a column, it’s the DBMS’ job to figure out the best indexes to use to execute queries.
B+ Trees
A B+Tree is a self-balancing tree data structure that keeps data sorted, and allows for access operations in . It is optimized for disk based storage that read/write large blocks of data.
A B+ Tree is an -way search tree with the following properties:
- Perfectly balanced (every leaf node is at the same depth)
- Every inner node other than the root is at least half full.
- Every inner node with keys has non-null children.
Every node in a B+ Tree contains an array of key/value pairs:
- Arrays at every node are (almost) sorted by the keys.
- The value array for inner nodes will contain pointers to other nodes.
- Leaf node values can either contain:
- Record IDs: A pointer to the location of the tuple.
- Tuple Data: The actual contents of the tuples.
Insertion
To Insert:
- Find correct leaf .
- Add new entry into in sorted order:
- If has enough space, finish.
- Else, split into two nodes and . Redistributed entries evenly and copy the middle key and put in . Insert index entry point to into parent of .
- To split an inner node, redistributed entries evenly, but push up the middle key.
Deletion
- Find correct leaf .
- Remove the entry:
- If is still at least half full, finish.
- Otherwise, redistribute, borrowing from sibling.
- If that fails, merge and sibling.
- If merge occurs, delete entry in parent pointing to .
B+ Tree Design Decisions
Node Size:
- The optimal size for a B+ Tree depends on the speed of the disk. The slower the disk, the larger the ideal node size.
- Workloads also affect the node size. Scan-heavy workloads like larger nodes, whereas single key lookups prefer smaller node sizes.
Merge Threshold:
- Delaying merging operations reduces the overhead of reorganization.
- For example, underflows might be ignored, and the tree can be rebuilt periodically to deal with this.
Variable Length Keys:
- Pointers can store keys as pointers to the tuples attribute (rarely used)
- Variable Length Nodes: The size of each node in the B+ Tree can vary, which complicates implementation.
- Key Map: Embed an array of pointers that map to the key+value list within the node.
Non-Unique Indexes:
- Duplicate Keys use the same leaf node layout, but store duplicate keys multiple times
- Value Lists store each key only once and maintain a linked list of unique values.
Intra-Node Search:
- Linear scanning: Scan key-value nodes from beginning to end. This doesn’t require entries to be sorted, which is an advantage.
- Binary: to the size of the entry list, but requires presorting the key-value entries.
- Interpolation: Approximate the starting position of the search key based on the known low/high values in the node, and linear searching from there. This requires keeping statistics in the node.
B+ Tree Optimizations
Prefix Compression:
- Sorted keys in the same leaf node are likely to have the same prefix.
- Extract that prefix and only store unique suffixes.
Suffix Truncation:
- The keys in the inner nodes are only used to direct traffic, so we can store just a prefix.
Bulk Inserts:
- The fastest way to build a B+ Tree from scratch is to first sort the keys and then build the index from the bottom up.
- This is faster, since there are no splits or merges.
Prev: hash-tables Next: trees-indexes-ii