Depth-First Search

Prev: Basic Graph Algorithms Next: Minimum Spanning Trees

Exercises

Depth-first search, topological sort, and strong components

0.

(a) Describe an algorithm to compute the reversal rev(G) of a directed graph in time.

(b) Prove that for every directed graph G, the strong component graph scc(G) is acyclic.

(c) Prove that scc(rev(G)) = rev(scc(G)) for every directed graph G.

(d) Fix an arbitrary directed graph G. For any vertex v of G, let S(v) denote the strong component of G that contains v. For all vertices u and v of G, prove that u can reach v in G if and only if S(u) can reach S(v) in scc(G).

1.

A directed graph G is semi-connected if, for every pair of vertices u and v, either u is reachable from v or v is reachable from u (or both).

(a) Give an example of a directed acyclic graph with a unique source that is not semi-connected.

(b) Describe and analyze an algorithm to determine whether a given directed acyclic graph is semi-connected.

(c) Describe and analyze an algorithm to determine whether an arbitrary directed graph is semi-connected.

2.

The police department in the city of Sham-Poobanana has made every street in the city one-way. Despite widespread complaints from confused motorists, the mayor claims that it is possible to legally drive from any intersection in Sham-Poobanana to any other intersection.

(a) The city needs to either verify or refute the mayor’s claim. Formalize this problem in terms of graphs, and then describe and analyze an algorithm to solve it.

(b) After running your algorithm from part (a), the mayor reluctantly admits that she was lying misinformed. Call an intersection x good if, for any intersection y that one can legally reach from x, it is possible to legally drive from y back to x. Now the mayor claims that over 95% of the intersections in Sham-Poobanana are good. Describe and analyze an efficient algorithm to verify or refute her claim. For full credit, both algorithms should run in linear time.

3.

Suppose we are given a directed acyclic graph G with a unique source s and a unique sink t. A vertex is called an -cut vertex if every path from s to t passes through v, or equivalently, if deleting v makes t unreachable from s. Describe and analyze an algorithm to find every (s, t)-cut vertex in G.

4.

A vertex v in a connected undirected graph G is called a cut vertex if the subgraph G - v (obtained by removing v from G) is disconnected.

(a) Describe a linear-time algorithm that determines, given an undirected graph G and a vertex v, whether v is a cut vertex in G. What is the running time to find all cut vertices by trying your algorithm for each vertex?

(b) Let T be a depth-first spanning tree of an undirected graph G. i. Prove that the root of T is a cut vertex of G if and only if it has more than one child in T. ii. Prove that a non-root vertex v is a cut vertex of G if and only if at least one descendant (in T) of each child of v (in T) is a neighbor (in G) of some proper ancestor of v (in T). [Hint: These claims no longer hold if T not a depth-first spanning tree and/or G is a directed graph.]

(c) Describe an algorithm that identifies every cut vertex in a given undirected graph in time.

5.

An edge e in a connected undirected graph G is called a bridge (or a cut edge) if the subgraph G - e (obtained by removing e from G) is disconnected.

(a) Given G and edge e describe a linear-time algorithm that determines whether e is a bridge or not. What is the running time to find all bridges by trying your algorithm for each edge?

(b) Let T be an arbitrary spanning tree of G. Prove that every bridges of G is also an edge in T. This claim implies that G has at most bridges. How does this information improve your algorithm from part (a) to find all bridges?

(c) Now suppose we root T at an arbitrary vertex r. For any vertex v, let Tv denote the subtree of T rooted at v; for example, Tr = T. Let uv be an arbitrary edge of T, where u is the parent of v. Prove that uv is a bridge of G if and only if uv is the only edge in G with exactly one endpoint in Tv.

(d) Describe a linear-time algorithm to identify every bridge in G. [Hint: Let T be a depth-first spanning tree of G.]

6.

The transitive closure G T of a directed graph G is a directed graph with the same vertices as G, that contains any edge if and only if there is a directed path from u to v in G. A transitive reduction of G is a graph with the smallest possible number of edges whose transitive closure is G T. The same graph may have several transitive reductions.

(a) Describe an efficient algorithm to compute the transitive closure of a given directed graph.

(b) Prove that a directed graph G has a unique transitive reduction if and only if G is acyclic.

(c) Describe an efficient algorithm to compute a transitive reduction of a given directed graph.

7.

One of the oldest algorithms for exploring arbitrary connected graphs was proposed by Gaston Tarry in 1895, as a systematic procedure for solving mazes.7 The input to Tarry’s algorithm is an undirected graph G; however, for ease of presentation, we formally split each undirected edge uv into two directed edges and . (In an actual implementation, this split is trivial; the algorithm simply uses the given adjacency list for G as though G were directed.)

Tarry(G):
unmark all vertices of G
color all edges of G white
s ← any vertex in G
RecTarry(s)
RecTarry(v):
mark v
<<"visit v">>
if there is a white arc v -> w
if w is unmarked
color w -> v green
color v -> w red
<<"traverse v -> w">>
RecTarry(w)
else if there is a green arc v -> w
color v -> w red
<<"traverse v -> w">>
RecTarry(w)

We informally say that Tarry’s algorithm “visits” vertex v every time it marks v, and it “traverses” edge when it colors that edge red and recursively calls RecTarry(w). Unlike our earlier graph traversal algorithm, Tarry’s algorithm can mark same vertex multiple times.

(a) Describe how to implement Tarry’s algorithm so that it runs in time.

(b) Prove that no directed edge in G is traversed more than once.

(c) When the algorithm visits a vertex v for the kth time, exactly how many edges into v are red, and exactly how many edges out of v are red? [Hint: Consider the starting vertex s separately from the other vertices.]

(d) Prove each vertex v is visited at most times, except the starting vertex s, which is visited at most times. This claim immediately implies that Tarry(G) terminates.

(e) Prove that the last vertex visited by Tarry(G) is the starting vertex s.

(f) For every vertex v that Tarry(G) visits, prove that all edges into v and out of v are red when Tarry(G) halts. [Hint: Consider the vertices in the order that they are marked for the first time, starting with s, and prove the claim by induction.]

(g) Prove that Tarry(G) visits every vertex of G. This claim and the previous claim imply that Tarry(G) traverses every edge of G exactly once.

8.

Consider the following variant of Tarry’s graph-traversal algorithm; this variant traverses green edges without recoloring them red and assigns two numerical labels to every vertex:

Tarry2(G):
unmark all vertices of G
color all edges of G white
s ← any vertex in G
RecTarry2(s, 1)
RecTarry2(v, clock):
if v is unmarked
v.pre ← clock; clock ← clock + 1
mark v
if there is a white arc v  -> w
if w is unmarked
color w ->  v green
color v  -> w red
RecTarry2(w, clock)
else if there is a green arc v  -> w
v.post ← clock; clock ← clock + 1
RecTarry2(w, clock)

Prove or disprove the following claim: When Tarry2(G) halts, the green edges define a spanning tree and the labels v.pre and v.post define a preorder and postorder labeling that are all consistent with a single depth-first search of G. In other words, prove or disprove that Tarry2 produces the same output as depth-first search, even though it visits the edges in a completely different order.

9.

You have a collection of n lock-boxes and m gold keys. Each key unlocks at most one box. However, each box might be unlocked by one key, by multiple keys, or by no keys at all. There are only two ways to open each box once it is locked: Unlock it properly (which requires having one matching key in your hand), or smash it to bits with a hammer. Your baby brother, who loves playing with shiny objects, has somehow managed to lock all your keys inside the boxes! Luckily, your home security system recorded everything, so you know exactly which keys (if any) are inside each box. You need to get all the keys back out of the boxes, because they are made of gold. Clearly you have to smash at least one box.

(a) Your baby brother has found the hammer and is eagerly eyeing one of the boxes. Describe and analyze an algorithm to determine if it is possible to retrieve all the keys without smashing any box except the one your brother has chosen.

(b) Describe and analyze an algorithm to compute the minimum number of boxes that must be smashed to retrieve all the keys.

10.

Suppose you are teaching an algorithms course. In your second midterm, you give your students a drawing of a graph and ask then to indicate a breadth-first search tree and a depth-first search tree rooted at a particular vertex. Unfortunately, once you start grading the exam, you realize that the graph you gave the students has several such spanning trees—far too many to list. Instead, you need a way to tell whether each student’s submission is correct! In each of the following problems, suppose you are given a connected graph G, a start vertex s, and a spanning tree T of G.

(a) Suppose G is undirected. Describe and analyze an algorithm to decide whether T is a depth-first spanning tree rooted at s.

(b) Suppose G is undirected. Describe and analyze an algorithm to decide whether T is a breadth-first spanning tree rooted at s. [Hint: It’s not enough for T to be an unweighted shortest-path tree. Yes, this is the right chapter for this problem!]

(c) Suppose G is directed. Describe and analyze an algorithm to decide whether T is a breadth-first spanning tree rooted at s. [Hint: Solve part (b) first.]

(d) Suppose G is directed. Describe and analyze an algorithm to decide whether T is a depth-first spanning tree rooted at s.

11.

Several modern programming languages, including JavaScript, Python, Perl, and Ruby, include a feature called parallel assignment, which allows multiple assignment operations to be encoded in a single line of code. For example, the Python code x,y = 0,1 simultaneously sets x to 0 and y to 1. The values of the right-hand side of the assignment are all determined by the old values of the variables. Thus, the Python code a,b = b,a swaps the values of a and b, and the following Python code computes the nth Fibonacci number:

def fib(n):
prev, curr = 1, 0
while n > 0:
prev, curr, n = curr, prev+curr, n-1
return curr

Suppose the interpreter you are writing needs to convert every parallel assignment into an equivalent sequence of individual assignments. For example, the parallel assignment a,b = 0,1 can be serialized in either order, but the parallel assignment x,y = x + 1, x + y can only be serialized as y = x + y; x = x + 1. Serialization may require one or more additional temporary variables; for example, serializing a,b = b,a requires one temporary variable, and serializing x,y = x + y, x - y requires two temporary variables.

(a) Describe an algorithm to determine whether a given parallel assignment can be serialized without additional temporary variables.

(b) Describe an algorithm to determine whether a given parallel assignment can be serialized with exactly one additional temporary variable. Assume that the given parallel assignment involves only simple integer variables (no indirection via pointers or arrays); no variable appears on the left side more than once; and expressions on the right side have no side effects. Don’t worry about the details of parsing the assignment statement; just assume (but describe!) an appropriate graph representation.

Dynamic Programming

12.

Suppose we are given a directed acyclic graph G whose nodes represent jobs and whose edges represent precedence constraints; that is, each edge indicates that job must be completed before job begins. Each node also has a weight indicating the time required to execute job .

(a) Describe an algorithm to determine the shortest interval of time in which all jobs in G can be executed.

(b) Suppose the first job starts at time 0. Describe an algorithm to determine, for each vertex v, the earliest time when job v can begin.

(c) Now describe an algorithm to determine, for each vertex v, the latest time when job v can begin without violating the precedence constraints or increasing the overall completion time (computed in part (a)), assuming that every job except v starts at its earliest start time (computed in part (b)).

13.

Let G be a directed acyclic graph with a unique source s and a unique sink t.

(a) A Hamiltonian path in G is a directed path in G that contains every vertex in G. Describe an algorithm to determine whether G has a Hamiltonian path.

(b) Suppose the vertices of G have weights. Describe an efficient algorithm to find the path from s to t with maximum total weight.

(c) Suppose we are also given an integer . Describe an efficient algorithm to find the maximum-weight path from s to t that contains at most edges. (Assume there is at least one such path.)

(d) Suppose some of the vertices of G are marked as important, and we are also given an integer k. Describe an efficient algorithm to find the maximum-weight path from s to t that visits at least k important vertices. (Assume there is at least one such path.)

(e) Describe an algorithm to compute the number of paths from s to t in G. (Assume that you can add arbitrarily large integers in time.)

14.

Let G be a directed acyclic graph whose vertices have labels from some fixed alphabet, and let be a string over the same alphabet. Any directed path in G has a label, which is a string obtained by concatenating the labels of its vertices.

(a) Describe an algorithm that either finds a path in G whose label is A or correctly reports that there is no such path.

(b) Describe an algorithm to find the number of paths in G whose label is A. (Assume that you can add arbitrarily large integers in time.)

(c) Describe an algorithm to find the longest path in G whose label is a subsequence of A.

(d) Describe an algorithm to find the shortest path in G whose label is a supersequence of A.

(e) Describe an algorithm to find a path in G whose label has minimum edit distance from A.

15.

A polygonal path is a sequence of line segments joined end-to-end; the endpoints of these line segments are called the vertices of the path. The length of a polygonal path is the sum of the lengths of its segments. A polygonal path with vertices (x 1, y1), (x 2, y2),…, (x k, yk) is monotonically increasing if x and +1 for every index i—informally, each vertex of the path is above and to the right of its predecessor.

Suppose you are given a set S of n points in the plane, represented as two arrays and . Describe and analyze an algorithm to compute the length of the longest monotonically increasing path with vertices in S. Assume you have a subroutine Length(x, y, x 0, y 0) that returns the length of the segment from (x, y) to (x 0, y 0).

16.

For any two nodes u and w in a directed acyclic graph G, the interval is the ∪ of all directed paths in G from u to v. Equivalently, consists of all vertices v such that v in reach(u) and w in reach(x), together with all the edges in G connecting those vertices. Suppose we are given a directed acyclic graph G, in which every vertex has a numerical weight, which may be positive, negative, or zero.

(a) Describe an efficient algorithm to find the maximum-weight interval in G, where the weight of each interval is the sum of the weights of its vertices.

(b) Describe an efficient algorithm to find the largest vertex weight in every interval in G. Your algorithm should compute a two-dimensional array where each entry is the maximum weight among all vertices in the interval . In particular, if is empty, then should be -∞.

17.

Let G be a directed acyclic graph whose vertices have labels from some fixed alphabet. Any directed path in G has a label, which is a string obtained by concatenating the labels of its vertices. Recall that a palindrome is a string that is equal to its reversal.

(a) Describe and analyze an algorithm to find the length of the longest palindrome that is the label of a path in G. For example, given the graph in Figure 6.23, your algorithm should return the integer 6, which is the length of the palindrome HANNAH. A H S

N E

N O

O T

N D

T H

A

(b) Describe an algorithm to find the longest palindrome that is a subsequence of the label of a path in G.

(c) Suppose G has a single source s and a single sink t. Describe an algorithm to find the shortest palindrome that is a supersequence of the label of a path in G from s to t.

18.

Suppose you are given two directed acyclic graphs G and H in which every node has a label from some finite alphabet; different nodes may have the same label. The label of a path in either dag is the string obtained by concatenating the labels of its vertices.

(a) Describe and analyze an algorithm to compute the length of the longest string that is both the label of a path in G and the label of a path in H.

(b) Describe and analyze an algorithm to compute the length of the longest string that is both a subsequence of the label of a path in G and a subsequence of the label of a path in H.

(c) Describe and analyze an algorithm to compute the length of the shortest string that is both a supersequence of the label of a path in G and a supersequence of the label of a path in H. [Hint: This is easier than it looks.]

19.

Let G be an arbitrary (not necessarily acyclic) directed graph in which every vertex v has an integer weight w(v).

(a) Describe an algorithm to find the longest directed path in G whose vertex weights define an increasing sequence.

(b) Describe and analyze an algorithm to determine the maximum-weight vertex reachable from each vertex in G. That is, for each vertex v, your algorithm needs to compute maxreach(v):= max{w(x) | x in reach(v)}.

20.

(a) Suppose you are given a directed acyclic graph G with n vertices and an integer . Describe an efficient algorithm to find a set of at most k vertex-disjoint paths that visit every vertex in G.

(b) Now suppose the edges of the input dag G have weights, which may be positive, negative, or zero. Describe an efficient algorithm to find a set of at most k vertex-disjoint paths with minimum total weight that visit every vertex in G. Your algorithms should run in time for some small constant c. A single vertex is a path with weight zero. (We will see a more efficient algorithm for part (a) in Chapter 11.)

21.

Kris is a professional rock climber who is competing in the U.S. climbing nationals. The competition requires Kris to use as many holds on the climbing wall as possible, using only transitions that have been explicitly allowed by the route-setter. The climbing wall has n holds. Kris is given a list of m pairs (x, y) of holds, each indicating that moving directly from hold x to hold y is allowed; however, moving directly from y to x is not allowed unless the list also includes the pair (y, x). Kris needs to figure out a sequence of allowed transitions that uses as many holds as possible, since each new hold increases his score by one point. The rules allow Kris to choose the first and last hold in his climbing route. The rules also allow him to use each hold as many times as he likes; however, only the first use of each hold increases Kris’s score.

(a) Define the natural graph representing the input. Describe and analyze an algorithm to solve Kris’s climbing problem if you are guaranteed that the input graph is a dag.

(b) Describe and analyze an algorithm to solve Kris’s climbing problem with no restrictions on the input graph. Both of your algorithms should output the maximum possible score that Kris can earn.

22.

There are n galaxies connected by m intergalactic teleport-ways. Each teleport-way joins two galaxies and can be traversed in both directions. However, the company that runs the teleport-ways has established an extremely lucrative cost structure: Anyone can teleport further from their home galaxy at no cost whatsoever, but teleporting toward their home galaxy is prohibitively expensive. Judy has decided to take a sabbatical tour of the universe by visiting as many galaxies as possible, starting at her home galaxy. To save on travel expenses, she wants to teleport away from her home galaxy at every step, except for the very last teleport home.

(a) Describe and analyze an algorithm to compute the maximum number of galaxies that Judy can visit. Your input consists of an undirected graph G with n vertices and m edges describing the teleport-way network, an integer identifying Judy’s home galaxy, and an array containing the distances of each galaxy from s.

(b) Just before embarking on her universal tour, Judy wins the space lottery, giving her just enough money to afford two teleports toward her home galaxy. Describe a new algorithm to compute the maximum number of distinct galaxies Judy can visit. She can visit the same galaxy more than once, but crucially, only the first visit counts toward her total.

23.

The Doctor and River Song decide to play a game on a directed acyclic graph G, which has one source s and one sink t.8 Each player has a token on one of the vertices of G. At the start of the game, The Doctor’s token is on the source vertex s, and River’s token is on the sink vertex t. The players alternate turns, with The Doctor moving first. On each of his turns, the Doctor moves his token forward along a directed edge; on each of her turns, River moves her token backward along a directed edge. If the two tokens ever meet on the same vertex, River wins the game. (“Hello, Sweetie!“) If the Doctor’s token reaches t or River’s token reaches s before the two tokens meet, then the Doctor wins the game. Describe and analyze an algorithm to determine who wins this game, assuming both players play perfectly. That is, if the Doctor can win no matter how River moves, then your algorithm should output “Doctor”, and if River can win no matter how the Doctor moves, your algorithm should output “River”. (Why are these the only two possibilities?) The input to your algorithm is the graph G.

24.

Let x = x 1 x 2… x n be a given n-character string over some finite alphabet Σ, and let A be a deterministic finite-state machine with m states over the same alphabet.

(a) Describe and analyze an algorithm to compute the length of the longest subsequence of x that is accepted by A. For example, if A accepts the language (AR)∗ and x = ABRACADABRA, your algorithm should output the number 4, which is the length of the subsequence ARAR.

(b) Describe and analyze an algorithm to compute the length of the shortest supersequence of x that is accepted by A. For example, if A accepts the language (ABCDR)∗ and x = ABRACADABRA, your algorithm should output the number 25, which is the length of the supersequence ABCDRABCDRABCDRABCDRABCDR. Analyze your algorithms in terms of the length n of the input string, the number m of states in the finite-state machine, and the size of the alphabet Σ.

25.

Not every dynamic programming algorithm can be modeled as finding an optimal path through a directed acyclic graph, but every dynamic programming algorithm does process some underlying dependency graph in postorder. The labels s and t are abbreviations for the Untempered Schism and the Time Vortex, or the Shining World of the Seven Systems (also known as Gallifrey) and Trenzalore, or Skaro and Telos, or Something else Timey-wimey. It’s all very complicated, never mind.

(a) Suppose we are given a directed acyclic graph G where every node stores a numerical search key. Describe and analyze an algorithm to find the largest binary search tree that is a subgraph of G.

(b) Suppose we are given a directed acyclic graph G and two vertices s and t. Describe an algorithm to compute the number of directed paths in G from s to t. (Assume that any arithmetic operation requires time.)

(c) Let G be a directed acyclic graph with the following features: • G has a single source s and several sinks t 1, t 2,…, t k. • Each edge has an associated weight p() between 0 and 1. • For each non-sink vertex v, the total weight of all edges leaving v P is 1; that is, w p() = 1.

The weights p() define a random walk in G from the source s to some sink t i; after reaching any non-sink vertex v, the walk follows edge with probability p(). All probabilities are mutually independent. Describe and analyze an algorithm to compute the probability that this random walk reaches sink t i, for every index i. (Assume that each arithmetic operation takes only time.)

Prev: Basic Graph Algorithms Next: Minimum Spanning Trees