Greedy Algorithms

Prev: Dynamic Programming Next: Basic Graph Algorithms

Exercises

Caveat lector: Some of these exercises cannot be solved using greedy algorithms! Whenever you describe and analyze a greedy algorithm, you must also include a proof that your algorithm is correct; this proof will typically take the form of an exchange argument. These proofs are especially important in classes (like mine) that do not normally require proofs of correctness.

1.

The GreedySchedule algorithm we described for the class scheduling problem is not the only greedy strategy we could have tried. For each of the following alternative greedy strategies, either prove that the resulting algorithm always constructs an optimal schedule, or describe a small input example for which the algorithm does not produce an optimal schedule. Assume that all algorithms break ties arbitrarily (that is, in a manner that is completely out of your control). [Hint: Three of these algorithms are actually correct.]

(a) Choose the course x that ends last, discard classes that conflict with x, and recurse.

(b) Choose the course x that starts first, discard all classes that conflict with x, and recurse.

(c) Choose the course x that starts last, discard all classes that conflict with x, and recurse.

(d) Choose the course x with shortest duration, discard all classes that conflict with x, and recurse.

(e) Choose a course x that conflicts with the fewest other courses, discard all classes that conflict with x, and recurse.

(f) If no classes conflict, choose them all. Otherwise, discard the course with longest duration and recurse.

(g) If no classes conflict, choose them all. Otherwise, discard a course that conflicts with the most other courses and recurse.

(h) Let x be the class with the earliest start time, and let y be the class with the second earliest start time. • If x and y are disjoint, choose x and recurse on everything but x. • If x completely contains y, discard x and recurse. • Otherwise, discard y and recurse.

(i) If any course x completely contains another course, discard x and recurse. Otherwise, choose the course y that ends last, discard all classes that conflict with y, and recurse.

2.

Now consider a weighted version of the class scheduling problem, where different classes offer different number of credit hours (totally unrelated to the duration of the class lectures). Your goal is now to choose a set of non-conflicting classes that give you the largest possible number of credit hours, given arrays of start times, end times, and credit hours as input.

(a) Prove that the greedy algorithm described at the beginning of this chapter—Choose the class that ends first and recurse—does not always return an optimal schedule.

(b) Prove that none of the greedy algorithms described in Exercise 1 always return an optimal schedule. [Hint: Solve Exercise 1 first; the algorithms that don’t work there don’t work here, either.]

(c) Describe and analyze an algorithm that always computes an optimal schedule. [Hint: Your algorithm will not be greedy.]

3.

Let X be a set of n intervals on the real line. We say that a subset of intervals covers if the union of all intervals in is equal to the union of all intervals in . The size of a cover is just the number of intervals. Describe and analyze an efficient algorithm to compute the smallest cover of X. Assume that your input consists of two arrays and , representing the left and right endpoints of the intervals in X. If you use a greedy algorithm, you must prove that it is correct.

A set of intervals, with a cover (shaded) of size 7.

4.

Let X be a set of n intervals on the real line. We say that a set P of points stabs X if every interval in X contains at least one point in P. Describe and analyze an efficient algorithm to compute the smallest set of points that stabs X. Assume that your input consists of two arrays and , representing the left and right endpoints of the intervals in X. As usual, If you use a greedy algorithm, you must prove that it is correct.

A set of intervals stabbed by four points (shown here as vertical segments)

5.

Let X be a set of n intervals on the real line. A proper coloring of X assigns a color to each interval, so that any two overlapping intervals are assigned different colors. Describe and analyze an efficient algorithm to compute the minimum number of colors needed to properly color X. Assume that your input consists of two arrays and , representing the left and right endpoints of the intervals in X. As usual, if you use a greedy algorithm, you must prove that it is correct.

A proper coloring of a set of intervals using five colors.

6.

(a) For every integer n, find a frequency array whose Huffman code tree has depth , such that the largest frequency is as small as possible.

(b) Suppose the total length N of the unencoded message is bounded by a polynomial in the alphabet size n. Prove that any Huffman tree for the frequencies has depth .

7.

Call a frequency array -heavy if it satisfies two conditions:

for all ; that is, 1 is the unique most frequent symbol. • ; that is, at least an fraction of the symbols are 1s.

Find the largest real number such that in every Huffman code for every -heavy frequency array, symbol 1 is represented by a single bit. [Hint: First prove that .]

8.

Describe and analyze an algorithm to compute an optimal ternary prefix-free code for a given array of frequencies . Don’t forget to prove that your algorithm is correct for all n.

9.

Describe in detail how to implement the Gale-Shapley stable matching algorithm, so that the worst-case running time is , as claimed earlier in this chapter.

10.

(a) Prove that it is possible for the Gale-Shapley algorithm to perform offers before termination. (You need to describe both a suitable input and a sequence of valid offers.)

(b) Describe for any integer n a set of preferences for n doctors and n hospitals that forces the Gale-Shapley algorithm to execute rounds, no matter which valid proposal is made in each round. [Hint: Part (b) implies part (a).]

11.

Describe and analyze an efficient algorithm to determine whether a given set of hospital and doctor preferences has to a unique stable matching.

12.

Consider a generalization of the stable matching problem, where some doctors do not rank all hospitals and some hospitals do not rank all doctors, and a doctor can be assigned to a hospital only if each appears in the other’s preference list. In this case, there are three additional unstable situations: • A matched hospital prefers an unmatched doctor to its assigned match. • A matched doctor prefers an unmatched hospital to her assigned match. • An unmatched doctor and an unmatched hospital appear in each other’s preference lists.

A stable matching in this setting may leave some doctors and/or hospitals unmatched, even though their preference lists are non-empty. For example, if every doctor lists Harvard as their only acceptable hospital, and every hospital lists Dr. House as their only acceptable intern, then only House and Harvard will be matched. Describe and analyze an efficient algorithm that computes a stable matching in this more general setting. [Hint: Reduce to an instance where every doctor ranks every hospital and vice versa, and then invoke Gale-Shapley.]

13.

The Scandinavian furniture company Fürni has hired n drivers to deliver n identical orders to n different addresses in Wilmington, Delaware. Each driver has their own well-established delivery route through Wilmington that visits all n addresses. Assuming they follow their routes as they always do, two drivers never visit the same addresses at the same time. In principle, each of the n drivers can deliver their furniture to any of the n addresses, but there’s a complication. One of the drivers has secretly wired proximity sensors and explosives to the Johannshamn sofas (with the Strinne green stripe pattern). If two sofas are ever at the same address at the same time, both will explode, destroying both the delivery truck and the building at that address. This can only happen if one driver delivers an order to that address, and then later another driver visits that same address while the furniture is still on their truck. Your job as the Fürni dispatcher is to assign each driver to a delivery address. Describe an algorithm to assign addresses to drivers so that each of the n addresses receives their furniture order and there are no explosions. For example, suppose Jack’s route visits 537 Paper Street at 6pm and 1888 Franklin Street at 8pm, and Marla’s route visits 537 Paper at 7pm and 1888 Franklin at 9pm. Then Jack should deliver to 1888 Franklin, and Marla should deliver to 537 Paper; otherwise, there would be an explosion at 1888 Franklin at 8pm. (Cue the Pixies.) [Hint: Jack and Marla are a bit unstable.]

14.

Suppose you are a simple shopkeeper living in a country with n different types of coins, with values 1 = < < · · · < . (In the U.S., for example, n = 6 and the values are 1, 5, 10, 25, 50 and 100 cents.) Your beloved and benevolent dictator, El Generalissimo, has decreed that whenever you give a customer change, you must use the smallest possible number of coins, so as not to wear out the image of El Generalissimo lovingly engraved on each coin by servants of the Royal Treasury.

(a) In the United States, there is a simple greedy algorithm that always results in the smallest number of coins: subtract the largest coin and recursively give change for the remainder. El Generalissimo does not approve of American capitalist greed. Show that there is a set of coin values for which the greedy algorithm does not always give the smallest possible of coins.

(b) Now suppose El Generalissimo decides to impose a currency system where the coin denominations are consecutive powers b0, b1, b2,…, b k of some integer . Prove that despite El Generalissimo’s disapproval, the greedy algorithm described in part (a) does make optimal change in this currency system.

(c) Describe and analyze an efficient algorithm to determine, given a target amount T and a sorted array of coin denominations, the smallest number of coins needed to make T cents in change. Assume that = 1, so that it is possible to make change for any amount T.

15.

Suppose you are given an array of integers, each of which may be positive, negative, or zero. A contiguous subarray is called a positive interval if the sum of its entries is greater than zero. Describe and analyze an algorithm to compute the minimum number of positive intervals that cover every positive entry in A. For example, given the following array as input, your algorithm should output 3. If every entry in the input array is negative, your algorithm should output 0.

16.

Consider the following process. At all times you have a single positive integer , which is initially equal to 1. In each step, you can either increment or double . Your goal is to produce a target value . For example, you can produce the integer 10 in four steps as follows:

Obviously you can produce any integer using exactly increments, but for almost all values of , this is horribly inefficient. Describe and analyze an algorithm to compute the minimum number of steps required to produce any given integer .

17.

Suppose we have skiers with heights given in an array , and skis with heights given in an array . Describe an efficient algorithm to assign a ski to each skier, so that the average difference between the height of a skier and her assigned ski is as small as possible. The algorithm should compute a permutation such that the expression

is as small as possible.

18.

Alice wants to throw a party and she is trying to decide who to invite. She has n people to choose from, and she knows which pairs of these people know each other. She wants to invite as many people as possible, subject to two constraints: • For each guest, there should be at least five other guests that they already know. • For each guest, there should be at least five other guests that they don’t already know.

Describe and analyze an algorithm that computes the largest possible number of guests Alice can invite, given a list of n people and the list of pairs who know each other.

19.

Suppose we are given two arrays and of positive integers. An matrix of 0s and 1s agrees with R and C if, for every index i, the ith row contains 1s, and the ith column contains 1s. Describe and analyze an algorithm that either constructs a matrix that agrees with R and C, or correctly reports that no such matrix exists.

20.

You’ve just accepted a job from Elon Musk, delivering burritos from San Francisco to New York City. You get to drive a Burrito-Delivery Vehicle through Elon’s new Transcontinental Underground Burrito-Delivery Tube, which runs in a direct line between these two cities.12 Your Burrito-Delivery Vehicle runs on single-use batteries, which must be replaced after at most 100 miles. The actual fuel is virtually free, but the batteries are expensive and fragile, and therefore must be installed only by official members of the Transcontinental Underground Burrito-Delivery Vehicle Battery-Replacement Technicians’ Union.13 Thus, even if you replace your battery early, you must still pay full price for each new battery to be installed. Moreover, your Vehicle is too small to carry more than one battery at a time.

… and which was clearly modeled after Maciej Cegłowski’s fictional “Alameda-Weehauken Burrito Tunnel” or as they call themselves in German, Die Transkontinentaluntergrundburritolieferfahrzeugbatteriewechseltechnikervereinigung.

There are several fueling stations along the Tube; each station charges a different price for installing a new battery. Before you start your trip, you carefully print the Wikipedia page listing the locations and prices of every fueling station along the Tube. Given this information, how do you decide the best places to stop for fuel? More formally, suppose you are given two arrays and , where is the distance from the start of the Tube to the ith station, and is the cost to replace your battery at the ith station. Assume that your trip starts and ends at fueling stations (so = 0 and is the total length of your trip), and that your car starts with an empty battery (so you must install a new battery at station 1).

(a) Describe and analyze a greedy algorithm to find the minimum number of refueling stops needed to complete your trip. Don’t forget to prove that your algorithm is correct.

(b) But what you really want to minimize is the total cost of travel. Show that your greedy algorithm in part (a) does not produce an optimal solution when extended to this setting.

(c) Describe an efficient algorithm to compute the locations of the fuel stations you should stop at to minimize the total cost of travel.

21.

You’ve been hired to store a sequence of n books on shelves in a library. The order of the books is fixed by the cataloging system and cannot be changed; each shelf must store a contiguous interval of the given sequence of books. You are given two arrays and , where and are respectively the height and thickness of the ith book in the sequence. All shelves in this library have the same length L; the total thickness of all books on any single shelf cannot exceed L.

(a) Suppose all the books have the same height h and the shelves have height larger than h, so every book fits on every shelf. Describe and analyze a greedy algorithm to store the books in as few shelves as possible. [Hint: The algorithm is obvious, but why is it correct?]

(b) That was a nice warmup, but now here’s the real problem. In fact the books have different heights, but you can adjust the height of each shelf to match the tallest book on that shelf. (In particular, you can change the height of any empty shelf to zero.) Now your task is to store the books so that the sum of the heights of the shelves is as small as possible. Show that your greedy algorithm from part (a) does not always give the best solution to this problem.

(c) Describe and analyze an algorithm to find the best matching between books and shelves as described in part (b).

22.

A string w of parentheses (and) is balanced if it satisfies one of the following conditions: • w is the empty string. • w = (x) for some balanced string x • w = x y for some balanced strings x and y For example, the string w = ((())()()) (()()) () is balanced, because w = x y, where x = ((()) () ())

and

y = (() ()) ().

(a) Describe and analyze an algorithm to determine whether a given string of parentheses is balanced.

(b) Describe and analyze a greedy algorithm to compute the length of a longest balanced subsequence of a given string of parentheses. As usual, don’t forget to prove your algorithm is correct. For both problems, your input is an array , where for each i, either = (or =). Both of your algorithms should run in time.

23.

One day Alex got tired of climbing in a gym and decided to take a large group of climber friends outside to climb. They went to a climbing area with a huge wide boulder, not very tall, with several marked hand and foot holds. Alex quickly determined an “allowed” set of moves that her group of friends can perform to get from one hold to another. The overall system of holds can be described by a rooted tree T with n vertices, where each vertex corresponds to a hold and each edge corresponds to an allowed move between holds. The climbing paths converge as they go up the boulder, leading to a unique hold at the summit, represented by the root of T. Alex and her friends (who are all excellent climbers) decided to play a game, where as many climbers as possible are simultaneously on the boulder and each climber needs to perform a sequence of exactly k moves. Each climber can choose an arbitrary hold to start from, and all moves must move away from the ground. Thus, each climber traces out a path of k edges in the tree T, all directed toward the root. However, no two climbers are allowed to touch the same hold; the paths followed by different climbers cannot intersect at all.

(a) Describe and analyze a greedy algorithm to compute the maximum number of climbers that can play this game. Your algorithm is given

a rooted tree T and an integer k as input, and it should compute the largest possible number of disjoint paths in T, where each path has length k. Do not assume that T is a binary tree. For example, given the tree below as input, your algorithm should return the integer 8.

(b) Now suppose each vertex in T has an associated reward, and your goal is to maximize the total reward of the vertices in your paths, instead of the total number of paths. Show that your greedy algorithm does not always return the optimal reward.

(c) Describe an efficient algorithm to compute the maximum possible reward, as described in part (b).

24.

Congratulations! You have successfully conquered Camelot, transforming the former battle-scarred hereditary monarchy into an anarcho-syndicalist commune, where citizens take turns to act as a sort of executive-officerfor-the-week, but with all the decisions of that officer ratified at a special bi-weekly meeting, by a simple majority in the case of purely internal affairs, but by a two-thirds majority in the case of more major… As a final symbolic act, you order the Round Table (surprisingly, an actual circular table) to be split into pizza-like wedges and distributed to the citizens of Camelot as trophies. Each citizen has submitted a request for an angular wedge of the table, specified by two angles—for example: Sir Robin the Brave might request the wedge from 17.23◦ to 42◦, and Sir Lancelot the Pure might request the 2◦ wedge from 359◦ to 1◦. Each citizen will be happy if and only if they receive precisely the wedge that they requested. Unfortunately, some of these ranges overlap, so satisfying all the citizens’ requests is simply impossible. Welcome to politics. Describe and analyze an algorithm to find the maximum number of requests that can be satisfied. [Hint: The output of your algorithm should not change if you rotate the table. Do not assume that angles are integers.]

25.

Suppose you are standing in a field surrounded by several large balloons. You want to use your brand new Acme Brand Zap-O-MaticTM to pop all the balloons, without moving from your current location. The Zap-O-MaticTM shoots a high-powered laser beam, which pops all the balloons it hits. Since each shot requires enough energy to power a small country for a year, you want to fire as few shots as possible.

The minimum zap problem can be stated more formally as follows. Given a set C of n circles in the plane, each specified by its radius and the (x, y) coordinates of its center, compute the minimum number of rays from the origin that intersect every circle in C. Your goal is to find an efficient algorithm for this problem.

(a) Suppose it is possible to shoot a ray that does not intersect any balloons. Describe and analyze a greedy algorithm that solves the minimum zap problem in this special case. [Hint: See Exercise 4.]

(b) Describe and analyze a greedy algorithm whose output is within 1 of optimal. That is, if m is the minimum number of rays required to hit every balloon, then your greedy algorithm must output either m or . (Of course, you must prove this fact.)

(c) Describe an algorithm that solves the minimum zap problem in time.

(d) Describe an algorithm that solves the minimum zap problem in time. Assume you have a subroutine that tells you the range of angles of rays that intersects an arbitrary circle c in time. This subroutine is not difficult to write, but it’s not the interesting part of the problem.

[T]he distributions and partitions of knowledge are not like several lines that meet in one angle, and so touch but in a point, but are like branches of a tree that meet in a stem, which hath a dimension and quantity of entireness and continuance before it come to discontinue and break itself into arms and boughs. — Francis Bacon, The Advancement of Learning (1605)

Prev: Dynamic Programming Next: Basic Graph Algorithms