Sorting
Prev: Hash Tables Next: Binary Search Trees (BSTs)
Problems
14.1
Write a program that takes two sorted arrays, possibly containing duplicates, and returns their intersection. The result itself should contain no duplicates. For example, if the input arrays are <2,3,3,5,5,6,7,7,8,12> and <5,5,6,8,8,9,10,10>, the answer is <5,6,8>.
Hint: Think separately about the cases where the arrays have similar sizes and where one array is much shorter than the other.
14.2
Two sorted arrays can be merged into one sorted array in linear time if there is enough space for the result. Instead of allocating a third array, suppose the first array has enough empty entries at its end to hold the second array.
Write a program that merges the second sorted array into the first sorted array in place.
Hint: Do not repeatedly shift large prefixes of the first array.
14.3
Suppose a collection of people is represented by an array of name records, where each record has a first name and a last name. Write a program that removes all entries with duplicate first names, leaving exactly one record for each distinct first name. For example, from
[(Ian, Botham), (David, Gower), (Ian, Bell), (Ian, Chappell), (Jeff, Dodger), (David, Bowie), (David, Gilmour)]
you may return any array that keeps one Ian, one David, and one Jeff.
Hint: Bring equal first names close together.
14.4
Suppose a calendar is represented as an array of events, where each event consists of a start time and a finish time. Write a program that returns the maximum number of events that occur simultaneously.
Hint: Focus on what happens at event boundaries.
Variant: Suppose user i consumes bandwidth b_i from time s_i through time f_i, inclusive. Compute the maximum total bandwidth in use at any time.
14.5
You are given a sorted array of disjoint closed intervals and one new closed interval. Write a program that inserts the new interval and returns the union. For example, if the input intervals are [-4,-1], [0,2], [3,6], [7,9], [11,12], [14,17] and the new interval is [1,8], the output should be [-4,-1], [0,9], [11,12], [14,17].
Hint: Characterize when two closed intervals overlap.
14.6
Write a program that takes a collection of intervals whose endpoints are integers and whose left and right endpoints may each be open or closed, and returns their union as a set of disjoint intervals.
Hint: A careful case analysis at equal or touching endpoints is enough.
14.7
You are given an array of student records, each of which includes an age. Rearrange the array so that students of the same age appear together. The relative order of the age groups does not matter.
Hint: Count how many students have each age.
Variant: How would your solution change if the age groups had to appear in sorted order by age?
14.8
Two teams must be arranged for a photograph, one in a front row and the other in a back row. Every player in the back row must be strictly taller than the corresponding player in the front row.
Given the heights of the players on the two teams, determine whether such an arrangement is possible.
Hint: Try sorting the heights and comparing the only arrangement that could work.
14.9
Arrays can be sorted efficiently with algorithms such as heapsort and quicksort, but these rely on random access. Write a program that sorts a singly linked list efficiently. The algorithm should be stable.
Hint: Exploit the operations that linked lists do support efficiently.
14.10
A company wants to reduce payroll by imposing a salary cap: anyone earning more than the cap is paid exactly the cap, while everyone else keeps their original salary. Given an array of salaries and a target payroll, compute the cap that makes the resulting total payroll equal to the target. For example, for salaries 90, 30, 100, 40, 20 and target payroll 210, the correct cap is 60.
Hint: How does the total payroll change as the cap increases?
Variant: Solve the problem using only O(1) additional space.
Answers
14.1
If the arrays are of comparable size, scan them with two pointers, advancing the pointer at the smaller value and recording matches only when they differ from the previous output. This is O(m + n). If one array is much shorter, iterate through its distinct values and binary-search the longer array, giving O(m log n).
14.2
Merge from right to left. Keep indices at the last valid entry of the first array, the last entry of the second array, and the last slot of the destination. Repeatedly place the larger of the two candidate elements into the destination slot. This avoids shifting and runs in O(m + n) time with O(1) extra space.
14.3
Sort the records by first name, then do a linear compaction pass that keeps only the first record from each run of equal first names. The sort dominates, so the running time is O(n log n).
14.4
Convert each event into two endpoints, sort the endpoints by time, and break ties so starts are processed before finishes when events sharing that time should count as overlapping. Sweep through the sorted endpoints, incrementing on a start and decrementing on a finish, while tracking the maximum active count. The bandwidth variant uses the same sweep, except each endpoint adds or subtracts b_i.
14.5
Walk through the intervals in three phases: copy all intervals that end before the new one starts, merge every interval that overlaps the new one by expanding its endpoints, then copy the remaining intervals. Because the input is already sorted and disjoint, this takes O(n) time.
14.6
Sort intervals by left endpoint, breaking ties so closed left endpoints come before open left endpoints. Then scan from left to right, merging each interval into the current union interval whenever they overlap or touch in a way that the endpoint openness still makes them connected. Otherwise, start a new output interval. The cost is O(n log n) for the sort.
14.7
Count how many students have each age, then compute the start offset of each age bucket. While some bucket is not yet filled, swap the student currently sitting at that bucket’s next open slot into the correct bucket using the offset tables, and update counts or next-free positions as buckets fill. This is O(n) time and O(m) extra space for m distinct ages. If age groups must be sorted, maintain the counts in age order before assigning offsets.
14.8
Sort both teams’ heights. The shorter team’s sorted order must match against the taller team’s sorted order elementwise; if every height on one side is strictly less than the corresponding height on the other, the photo is possible, otherwise it is not. Sorting gives O(n log n) time.
14.9
Use mergesort on the linked list. Split the list into two halves with slow and fast pointers, recursively sort the halves, then merge the two sorted lists with the standard stable list-merge routine. This gives O(n log n) time and O(log n) stack space.
14.10
Sort the salaries and scan from smallest to largest while maintaining the sum of uncapped salaries already fixed. At position i, if setting the cap somewhere between salaries i - 1 and i would make the total payroll match the target, solve
prefix_sum + cap * remaining_count = target
for cap and return it. If not, absorb the next salary into the prefix sum and continue. The running time is O(n log n) because of the sort.