Binary Search Trees (BSTs)
Problems
15.1
Write a program that takes as input a binary tree and checks whether it satisfies the BST property.
Hint: It is not enough to compare each node only with its immediate children.
15.2
Write a program that takes as input a BST and a value, and returns the first key that would appear in an inorder traversal which is greater than the input value. For example, for the BST in Figure 15.1 on Page 251, the answer for input 23 is 29.
Hint: Perform binary search while keeping additional state.
Variant: Return the node whose key equals the input value and appears first in inorder traversal. For the BST in Figure 15.2, the answer for 108 is node F, and for 143 it is nullptr.
15.3
A BST is a sorted data structure, so it should be possible to find the k largest keys easily.
Write a program that takes as input a BST and an integer k, and returns the k largest elements in the BST in decreasing order. For example, for the BST in Figure 15.1 and k = 3, the answer is (53, 47, 43).
Hint: What does an inorder traversal yield?
15.4
Design an algorithm that takes as input a BST and two nodes, and returns their LCA. For example, for the BST in Figure 15.1, the LCA of nodes C and G is B.
Assume all keys are distinct, and the nodes do not have references to their parents.
Hint: Take advantage of the BST property.
15.5
Suppose you are given the sequence in which keys are visited in an inorder traversal of a BST, and all keys are distinct. Can you reconstruct the BST? If so, write a program to do so. Solve the same problem for preorder and postorder traversal sequences.
Hint: Draw the five BSTs on the keys 1, 2, 3 and compare their traversal orders.
15.6
Design an algorithm that takes three sorted arrays and returns one entry from each such that the minimum interval containing those three entries is as small as possible. For example, if the arrays are (5,10,15), (3,6,9,12,15), and (8,16,24), then 15,15,16 lie in the smallest possible interval.
Hint: How would you proceed if you needed to pick the three entries from one single sorted array?
15.7
Numbers of the form a + b\sqrt{2}, where a and b are nonnegative integers, can be ordered. Design an algorithm for efficiently computing the k smallest numbers of this form.
Hint: Systematically enumerate points.
15.8
You are given a server log file containing billions of lines. Each line contains a number of fields. For this problem, the relevant field is an ID denoting a page that was accessed.
Write a function to read the next line from a log file, and a function to find the k most visited pages, where k is an input to the function. Optimize performance for the situation where calls to the two functions are interleaved. You may assume the set of distinct pages is small enough to fit in RAM.
As a concrete example, if page IDs appear as g, a, f, t, a, a, a, g, t, c, t, a, t, i, e, then after the first 10 lines the most common page is a with count 4, and the next most common is f with count 3.
Hint: For each page, count the number of times it has been visited.
Variant: Achieve O(1) time for read_next_line and O(k) time for find.
15.9
Given a sorted array, the number of BSTs that can be built on its entries grows rapidly, and some are much more balanced than others.
How would you build a BST of minimum possible height from a sorted array?
Hint: Which element should be the root?
15.10
BSTs support dynamic updates.
Design efficient functions for inserting and removing keys in a BST. Assume all elements are unique, and your insertion method must preserve the BST property.
Hint: Deleting leaves is easy. Pay attention to the children of internal nodes when you delete them.
Variant: Solve the problem with the added constraint that you can only change links. In particular, you cannot change the key stored at any node.
15.11
Write a program that takes two nodes in a BST and a third node, the “middle” node, and determines if one of the first two nodes is a proper ancestor of the middle and the other is a proper descendant of the middle. A proper ancestor of a node is an ancestor that is not equal to the node; a proper descendant is defined similarly. For example, in Figure 15.1, if the middle is node J, the answer is true for (A, K) and (J, M), and false for (I, P) or (J, K).
You may assume all keys are unique, and the nodes do not have pointers to their parents.
Hint: For what relative placements of the three nodes can the test succeed?
15.12
Consider a web-service that takes a geographical location and returns the nearest restaurant. The service starts with a set of restaurant locations, each with X and Y coordinates. One approach is to build two BSTs on the restaurant locations: one ordered by X and one ordered by Y.
Write a program that takes as input a BST and an interval, and returns the BST keys that lie in the interval. For example, for the tree in Figure 15.1 and interval [16,31], the answer is 17, 19, 23, 29, 31.
Hint: How many edges are traversed when the successor function is repeatedly called m times?
15.13
Consider a server that many clients connect to. Each client is identified by a string. Each client has a “credit”, which is a nonnegative integer value. The server must maintain a data structure to which clients can be added, removed, queried, or updated. In addition, the server must be able to add a specified number of credits to all clients simultaneously.
Design a data structure that implements the following methods:
Insert: add a client with specified credit, replacing any existing entry for that client.Remove: delete the specified client.Lookup: return the number of credits associated with the specified client.Add-all: increment the credit count for all current clients by the specified amount.Max: return a client with the highest number of credits.
Hint: Use additional global state.
Answers
15.1
Carry lower and upper bounds down the tree. A node is valid only if its key lies within the range implied by all its ancestors; recurse left with an updated upper bound and right with an updated lower bound. This checks the global BST property in O(n) time. An inorder traversal or BFS with bounds also works.
15.2
Search downward as in ordinary BST lookup while keeping the best candidate seen so far whose key is greater than the target. Move left when the current key is greater than the target and update the candidate; otherwise move right. The final candidate is the first inorder key greater than the input, in O(h) time.
15.3
Perform a reverse inorder traversal: visit right subtree, then root, then left subtree. Stop once k keys have been collected. This returns the k largest keys in descending order in O(h + k) time.
15.4
Starting at the root, compare both target keys with the current key. If both targets are smaller, go left; if both are larger, go right; otherwise the current node is the split point and hence the LCA. This uses the BST ordering directly and runs in O(h).
15.5
An inorder traversal alone is not enough: many different BSTs can have the same inorder sequence. Preorder and postorder do determine a unique BST when keys are distinct. Reconstruct by taking the root from the traversal, partitioning remaining keys into those less than the root and those greater than the root, and recursing. A bound-based single-pass reconstruction gives O(n) time.
15.6
Maintain one current element from each array in a balanced BST or other ordered structure. At each step, the current interval is from the minimum selected value to the maximum selected value. Record the best interval, then advance the iterator belonging to the current minimum. Stop when one array is exhausted. For three arrays this runs in linear time overall; more generally it is O(n log k).
15.7
One approach is best-first generation with a BST or set. Start from 0 + 0\sqrt{2}, repeatedly extract the smallest value, and insert the two successors formed by incrementing a or b, deduplicating with the set. This yields O(k log k). The chapter also gives an O(k) method that merges the next candidates from the already-generated sequence.
15.8
Store visit counts in a hash table, and maintain a BST ordered by (count, page) so updates and top-k queries remain efficient across interleaved operations. On each new log line, use the hash table to find the page’s old count, remove the old pair from the BST, increment the count, and insert the new pair. Then find k walks the BST backward from the maximum. The variant uses buckets by count plus hash-table indirection to get O(1) reads and O(k) reporting.
15.9
Choose the middle element of the sorted array as the root, then recursively build the left subtree from the left half and the right subtree from the right half. This keeps the subtree heights as balanced as possible and constructs a minimum-height BST in O(n) time.
15.10
Insertion is standard BST search followed by attaching a new leaf at the first null child position. For deletion, first find the node. If it has at most one child, splice that child into its place. If it has two children, replace it with its inorder successor or predecessor and then remove that successor/predecessor from its old location. The work is O(h).
15.11
Search from the two outer nodes toward the middle in an interleaved way. If one reaches the other outer node first, or both searches terminate without reaching the middle, the answer is false. Once one outer node reaches the middle, do one more BST search from the middle to the other outer node. This avoids unnecessary full-height traversals and takes O(d) when the successful ancestor/descendant relationship is close, and O(h) in the worst case.
15.12
Use the BST property to prune recursion. If the root key is less than the left endpoint, only the right subtree can contribute. If the root key is greater than the right endpoint, only the left subtree can contribute. Otherwise output the root and recurse into both sides. The running time is O(m + h) for m reported keys in a balanced tree.
15.13
Use lazy propagation for the global increment. Keep a single offset that represents the total of all Add-all operations. Store each client’s credit minus that offset in a hash table for Lookup, and also group clients by adjusted credit in a BST or ordered map so Max is easy. Insert and Remove update both structures, Add-all only changes the global offset, and Max returns any client in the largest adjusted-credit bucket.