Parallel Computing
Problems
20.1
The following program is part of an online spell-correction service. Clients send as input a string, and the service returns an array of strings in its dictionary that are closest to the input string. The service caches results to improve performance.
Critique the implementation and provide a solution that overcomes its limitations.
Hint: Look for races, and lock as little as possible to avoid reducing throughput.
Variant: Threads t1, t2, ..., tn execute a method called critical(). Before this, they execute a method called rendezvous(). The synchronization constraint is that only one thread can execute critical() at a time, and all threads must have completed executing rendezvous() before critical() can be called. You may assume n is stored in a variable that is accessible from all threads. Design a synchronization mechanism for the threads. All threads must execute the same code. Threads may call critical() multiple times, and you should ensure that a thread cannot call critical() a (k+1)th time until all other threads have completed their kth calls to critical().
20.2
Threads t1 and t2 each increment an integer variable N times. The program prints the final value, which is nondeterministic because the increment is unsynchronized.
What are the maximum and minimum values that could be printed by the program as a function of N?
Hint: Be as perverse as you can when scheduling the threads.
20.3
Thread t1 prints odd numbers from 1 to 100, and thread t2 prints even numbers from 1 to 100.
Write code in which the two threads, running concurrently, print the numbers from 1 to 100 in order.
Hint: The two threads need to notify each other when they are done.
20.4
The following program implements part of a simple HTTP server. Suppose you find that the program has poor performance because it frequently blocks on I/O.
What steps could you take to improve the program’s performance? Feel free to use any utilities from the standard library, including concurrency classes.
Hint: Use multithreading, but control the number of threads.
20.5
When threads need to acquire multiple locks to enter a critical section, deadlock can result. As an example, suppose threads t1 and t2 both need to acquire locks L and M. If t1 first acquires L, and t2 first acquires M, they may end up waiting on each other forever.
Identify a concurrency bug in the program below, and modify the code to resolve the issue.
20.6
Consider an object s which is read from and written to by many threads. You need to ensure that no thread may access s for reading or writing while another thread is writing to s. Two or more readers may access s at the same time.
Implement a synchronization mechanism for the first readers-writers problem, in which no reader is kept waiting if s is currently opened for reading.
Hint: Track the number of readers.
20.7
Suppose we have an object s as in Problem 20.6. In the solution to the first readers-writers problem, readers that already have access can cause a waiting writer to starve if more readers keep arriving.
Implement a synchronization mechanism for the second readers-writers problem, which gives write preference, i.e., no writer, once added to the queue, is kept waiting longer than absolutely necessary.
Hint: Force readers to acquire a write lock.
Variant: The specifications to Problems 20.6 and 20.7 allow starvation. The first may starve writers, the second may starve readers. The third readers-writers problem adds the constraint that neither readers nor writers should starve. Implement a synchronization mechanism for that version.
20.8
Consider a web-based calendar in which the server hosting the calendar has to perform a task when the next calendar event takes place. The task could be sending email or an SMS.
Develop a timer class that manages the execution of deferred tasks. The timer constructor takes an object which includes a run method and a string-valued name field. The class must support:
- starting a thread, identified by name, at a given time in the future
- canceling a thread, identified by name, where the cancel request is ignored if the thread has already started
Hint: There are two aspects: data structure design and concurrency.
20.9
In Problem 13.13 and its solution we introduced the Collatz conjecture and heuristics for checking it. Build a parallel checker for the Collatz conjecture.
Specifically, assume your program runs on a multicore machine, and its threads are distributed across the available cores. Your program should check the Collatz conjecture for every integer in [1, U], where U is an input. To keep the program from overloading the system, you should not have more than n threads running at a time.
Hint: Use multithreading for performance, but minimize threading overhead.
Answers
20.1
The bug is a race on the cache state: checking, reading, evicting, and updating cached entries are not atomic. Locking the entire service is correct but too coarse. The right fix is to lock only the cache access and cache update sections, allowing the expensive dictionary computation to proceed outside the lock. The variant is a barrier-plus-critical-section problem and is naturally solved with condition variables and counters or generation numbers.
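As a sketch of the fine-grained fix, assuming a single-entry cache (the names SpellCheckService and closest_in_dictionary are illustrative, not from the original program): the lock guards only the cache read and the cache write, so the slow dictionary search runs unlocked.

```python
import threading

class SpellCheckService:
    """Illustrative sketch: lock only the cache sections, not the
    expensive search. closest_in_dictionary stands in for the real
    dictionary lookup."""

    def __init__(self, closest_in_dictionary):
        self._closest = closest_in_dictionary
        self._lock = threading.Lock()
        self._cached_word = None
        self._cached_result = None

    def service(self, word):
        with self._lock:                 # short critical section: cache read
            if word == self._cached_word:
                return list(self._cached_result)
        result = self._closest(word)     # slow computation, no lock held
        with self._lock:                 # short critical section: cache write
            self._cached_word = word
            self._cached_result = list(result)
        return list(result)
```

Two threads may redundantly compute the same word's result, but the cache state itself can never be observed half-updated, and throughput is not limited by the search time.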
20.2
The maximum is 2N, achieved when one thread runs to completion and then the other. The minimum is 1 when N = 1, and 2 for all N >= 2, obtained by scheduling the reads and writes so that almost all increments are overwritten by stale writes from the other thread.
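The minimum-value schedule can be replayed deterministically by modeling each unsynchronized increment as a separate read followed by a write. This sketch (a simulation, not actual threading) interleaves the two threads' steps in the perverse order: t1 reads, t2 does N-1 full increments, t1's stale write destroys them, then the roles reverse.

```python
def perverse_schedule(n):
    """Replay the interleaving that drives two unsynchronized counters
    of n increments each down to a final value of 2 (requires n >= 2)."""
    assert n >= 2
    x = 0
    t1_read = x               # t1 begins its first increment: reads 0
    for _ in range(n - 1):    # t2 completes n-1 full increments
        x = x + 1
    x = t1_read + 1           # t1's stale write: x becomes 1
    t2_read = x               # t2 begins its last increment: reads 1
    for _ in range(n - 1):    # t1 completes its remaining n-1 increments
        x = x + 1
    x = t2_read + 1           # t2's stale write: x becomes 2
    return x
```

Regardless of how large N is, this schedule ends at 2, which is why the minimum does not grow with N.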
20.3
Use a shared turn variable protected by a mutex and a condition variable. The odd thread waits until it is the odd turn, prints, flips the turn, and signals; the even thread does the symmetric operation. This avoids busy waiting and guarantees the output order.
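A minimal sketch of this scheme (class and field names are illustrative; output is collected into a list rather than printed, to make checking easy):

```python
import threading

class OddEvenPrinter:
    """Two threads alternate via a condition variable: each waits for
    its turn, emits its number, flips the turn, and notifies."""

    def __init__(self, limit=100):
        self.limit = limit
        self.turn_odd = True
        self.cond = threading.Condition()
        self.output = []

    def _run(self, odd):
        for i in range(1 if odd else 2, self.limit + 1, 2):
            with self.cond:
                while self.turn_odd != odd:   # not our turn: sleep, no busy wait
                    self.cond.wait()
                self.output.append(i)
                self.turn_odd = not odd       # hand the turn to the other thread
                self.cond.notify_all()

    def run(self):
        t1 = threading.Thread(target=self._run, args=(True,))
        t2 = threading.Thread(target=self._run, args=(False,))
        t1.start(); t2.start()
        t1.join(); t2.join()
        return self.output
```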
20.4
Do not spawn an unbounded number of request threads. Instead, keep a bounded thread pool and place accepted sockets into a blocking work queue. Worker threads repeatedly remove requests from the queue and process them. This overlaps I/O and computation while capping resource usage.
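A sketch of the bounded pool, assuming requests can be represented as queued items and `handler` stands in for the per-request I/O and processing (the `serve` function and sentinel shutdown are illustrative choices, not the original server's API):

```python
import queue
import threading

def serve(requests, handler, num_workers=4):
    """Bounded worker pool: accepted requests go into a blocking queue;
    a fixed set of workers drains it. A None sentinel per worker
    signals shutdown."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            req = work.get()         # blocks until a request is available
            if req is None:          # sentinel: shut this worker down
                return
            res = handler(req)
            with results_lock:
                results.append(res)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for r in requests:
        work.put(r)
    for _ in workers:                # one sentinel per worker
        work.put(None)
    for w in workers:
        w.join()
    return results
```

While one worker blocks on I/O inside `handler`, the others keep draining the queue, yet the thread count never exceeds `num_workers`.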
20.5
The code can deadlock when concurrent transfers lock the same pair of accounts in opposite orders. Fix it by imposing a global order on lock acquisition, e.g., by account id, and always locking the two account mutexes in that order. This preserves concurrency between unrelated transfers without a global lock.
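The lock-ordering fix can be sketched as follows, assuming each account carries a unique numeric id (the Account class and transfer signature are illustrative):

```python
import threading

class Account:
    def __init__(self, id, balance):
        self.id = id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    """Deadlock-free: always acquire the two locks in increasing id
    order, regardless of transfer direction."""
    first, second = (src, dst) if src.id < dst.id else (dst, src)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount
```

Because every thread acquires locks in the same global order, no cycle of waiting threads can form, while transfers between unrelated account pairs still proceed in parallel.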
20.6
Keep a reader count plus separate synchronization for readers and writers. Readers increment the count before reading and decrement it afterward; writers wait until the reader count is zero before proceeding. The standard implementation uses a read lock protecting the count and a write lock protecting writers, with wait-notify to avoid busy waiting.
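A compact sketch of the first readers-writers lock (class and method names are illustrative); the first reader in locks writers out, and the last reader out lets them back in:

```python
import threading

class FirstReadersWriters:
    """Readers never wait while another reader holds the lock;
    a writer waits until the reader count drops to zero."""

    def __init__(self):
        self._read_count = 0
        self._count_lock = threading.Lock()   # protects _read_count
        self._write_lock = threading.Lock()   # held during writes, or by readers as a group

    def acquire_read(self):
        with self._count_lock:
            self._read_count += 1
            if self._read_count == 1:         # first reader locks out writers
                self._write_lock.acquire()

    def release_read(self):
        with self._count_lock:
            self._read_count -= 1
            if self._read_count == 0:         # last reader lets writers in
                self._write_lock.release()

    def acquire_write(self):
        self._write_lock.acquire()

    def release_write(self):
        self._write_lock.release()
```

Note the starvation property the problem allows: as long as readers keep arriving, `_read_count` never reaches zero and a waiting writer never proceeds.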
20.7
Give writers preference by making incoming readers pass through a gate controlled by the writer side. Once a writer is queued, later readers cannot slip ahead of it. One implementation is to have readers briefly acquire and release the write lock before joining the read side, ensuring a waiting writer stays ahead of subsequently arriving readers.
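One way to sketch the writer-preference gate (lock names are illustrative): readers must pass through a gate lock that a writer grabs first, so readers arriving after a queued writer block at the gate.

```python
import threading

class SecondReadersWriters:
    """Writer preference: a queued writer holds _gate, which stops
    readers that arrive after it from registering."""

    def __init__(self):
        self._read_count = 0
        self._count_lock = threading.Lock()   # protects _read_count
        self._resource = threading.Lock()     # held during writes, or by readers as a group
        self._gate = threading.Lock()         # a queued writer holds this

    def acquire_read(self):
        with self._gate:                      # blocks here if a writer is queued
            with self._count_lock:
                self._read_count += 1
                if self._read_count == 1:
                    self._resource.acquire()

    def release_read(self):
        with self._count_lock:
            self._read_count -= 1
            if self._read_count == 0:
                self._resource.release()

    def acquire_write(self):
        self._gate.acquire()                  # stop new readers from entering
        self._resource.acquire()              # wait for current readers to drain

    def release_write(self):
        self._resource.release()
        self._gate.release()
```

Only readers already registered before the writer queued can delay it, which is the "no longer than absolutely necessary" guarantee; this variant can starve readers instead.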
20.8
Use two coordinated data structures: a min-heap keyed by scheduled run time and a hash table from task name to heap entry. A dispatcher thread sleeps until the earliest task is due, wakes to launch it, and adjusts when insertions or cancellations change the heap minimum. Protect the shared structures with a mutex.
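A sketch of the timer, assuming tasks are plain callables identified by a string name (the Timer class, its method names, and the cancellation-by-removal scheme are illustrative choices):

```python
import heapq
import threading
import time

class Timer:
    """Min-heap of (run_time, name) plus a name -> task map. One
    dispatcher thread sleeps until the earliest entry is due; a
    cancelled name is simply removed from the map, so the dispatcher
    skips it when its heap entry surfaces."""

    def __init__(self):
        self._heap = []                       # (run_time, name) pairs
        self._tasks = {}                      # name -> callable
        self._cond = threading.Condition()
        self._thread = threading.Thread(target=self._dispatch, daemon=True)
        self._thread.start()

    def schedule(self, name, task, run_time):
        with self._cond:
            self._tasks[name] = task
            heapq.heappush(self._heap, (run_time, name))
            self._cond.notify()               # the heap minimum may have changed

    def cancel(self, name):
        with self._cond:
            self._tasks.pop(name, None)       # ignored if already launched

    def _dispatch(self):
        while True:
            with self._cond:
                while not self._heap:
                    self._cond.wait()
                run_time, name = self._heap[0]
                now = time.time()
                if run_time > now:
                    # Sleep until due, or until schedule() changes the minimum.
                    self._cond.wait(run_time - now)
                    continue
                heapq.heappop(self._heap)
                task = self._tasks.pop(name, None)   # None if cancelled
            if task is not None:
                threading.Thread(target=task).start()
```

The heap gives the dispatcher the next deadline in O(log n); the hash map gives cancellation by name in O(1) without searching the heap.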
20.9
Do not assign one integer per thread. The work per number is too small, so per-thread overhead dominates. Instead, partition into moderately sized ranges, place those ranges into a synchronized work queue, and let a bounded set of worker threads repeatedly pull the next range. This balances load while keeping the thread count fixed and the scheduling overhead low.
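The range-queue design can be sketched as follows (function name, batch size, and the early-exit heuristic of stopping once the sequence drops below its start are illustrative; the heuristic relies on smaller values having been verified by earlier ranges):

```python
import queue
import threading

def parallel_collatz(upper, num_threads=4, batch_size=1000):
    """Partition [1, upper] into ranges, queue the ranges, and let a
    fixed pool of workers pull them. Returns True once every integer
    in the interval has been checked."""
    work = queue.Queue()
    for lo in range(1, upper + 1, batch_size):
        work.put((lo, min(lo + batch_size, upper + 1)))

    checked = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                lo, hi = work.get_nowait()
            except queue.Empty:
                return                     # no ranges left: worker exits
            for i in range(lo, hi):
                x = i
                while x >= i and x != 1:   # values below i were checked earlier
                    x = 3 * x + 1 if x % 2 else x // 2
            with lock:
                checked.append(hi - lo)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(checked) == upper
```

Each dequeue hands a worker on the order of `batch_size` numbers, so queue contention and thread-scheduling overhead are amortized over a large chunk of real work, while fast and slow ranges still balance across the pool.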