There are lots of concurrency bugs. A study by Lu et al. (2008) analyzed concurrency bugs in MySQL, Apache, Mozilla, and OpenOffice. Most of the bugs were non-deadlock bugs; the remaining were deadlock bugs.
Non-deadlock bugs can be characterized as atomicity-violation bugs or order-violation bugs.
Atomicity bugs look like this:
Thread 1::
if (thd->proc_info) {
    fputs(thd->proc_info, ...);
}

Thread 2::
thd->proc_info = NULL;
First, thread 1 checks that thd->proc_info (a string) is not NULL, and then passes it to fputs to write it out. Thread 2 sets thd->proc_info to NULL. If thread 1 is interrupted after the if conditional but before the fputs, thread 2 can set the pointer to NULL, and the subsequent fputs will dereference NULL and crash.
This code assumes that the check of thd->proc_info and the call to fputs happen atomically (as if in a transaction). To fix the code, we enforce exactly that with a lock:
pthread_mutex_t proc_info_lock = PTHREAD_MUTEX_INITIALIZER;

Thread 1::
pthread_mutex_lock(&proc_info_lock);
if (thd->proc_info) {
    fputs(thd->proc_info, ...);
}
pthread_mutex_unlock(&proc_info_lock);

Thread 2::
pthread_mutex_lock(&proc_info_lock);
thd->proc_info = NULL;
pthread_mutex_unlock(&proc_info_lock);
Order-violation bugs occur when the code requires A to happen before B, but B can sometimes run before A. In this example, mThread must be initialized (by init) before mMain reads it, but that ordering is not enforced. This can be fixed with a condition variable.
Thread 1::
void init() {
    mThread = PR_CreateThread(mMain, ...);
}

Thread 2::
void mMain(...) {
    mState = mThread->State;
}
pthread_mutex_t mtLock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t mtCond = PTHREAD_COND_INITIALIZER;
int mtInit = 0;

Thread 1::
void init() {
    ...
    mThread = PR_CreateThread(mMain, ...);

    // signal that the thread has been created...
    pthread_mutex_lock(&mtLock);
    mtInit = 1;
    pthread_cond_signal(&mtCond);
    pthread_mutex_unlock(&mtLock);
    ...
}

Thread 2::
void mMain(...) {
    ...
    // wait for the thread to be initialized...
    pthread_mutex_lock(&mtLock);
    while (mtInit == 0)
        pthread_cond_wait(&mtCond, &mtLock);
    pthread_mutex_unlock(&mtLock);

    mState = mThread->State;
    ...
}
In the study, 97% of the non-deadlock bugs were either atomicity or order violations. Using mutexes, semaphores, and condition variables properly prevents most of these bugs from happening.
Deadlock occurs when there is a cycle in the lock-acquisition graph: each thread in the cycle holds a lock that the next thread is waiting for.
Four conditions are all required for deadlock: mutual exclusion (threads claim exclusive control of the resources they need), hold-and-wait (threads hold locks while waiting to acquire more), no preemption (locks cannot be forcibly taken away from the threads holding them), and circular wait (a cycle of threads in which each waits for a lock held by the next).
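As a minimal sketch (hypothetical code, not taken from the study), two threads that acquire the same two locks in opposite orders satisfy all four conditions and can deadlock:

```c
#include <pthread.h>
#include <stddef.h>

pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;

void *thread_1(void *arg) {
    pthread_mutex_lock(&L1);
    pthread_mutex_lock(&L2);   // waits forever if thread 2 already holds L2
    /* ... critical section ... */
    pthread_mutex_unlock(&L2);
    pthread_mutex_unlock(&L1);
    return NULL;
}

void *thread_2(void *arg) {
    pthread_mutex_lock(&L2);
    pthread_mutex_lock(&L1);   // waits forever if thread 1 already holds L1
    /* ... critical section ... */
    pthread_mutex_unlock(&L1);
    pthread_mutex_unlock(&L2);
    return NULL;
}
```

If thread 1 grabs L1 and thread 2 grabs L2 before either grabs its second lock, each waits on the other forever.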
Breaking circular wait is the most common mitigation strategy: if you always acquire locks in a fixed total order, there can be no circular wait and therefore no deadlock.
In more complex systems, a partial ordering is good enough.
One simple way of doing this is to use the addresses of mutexes for locking order.
void do_work(mutex *m1, mutex *m2) {
    if (m1 > m2) {
        m1->lock();
        m2->lock();
    } else {
        m2->lock();
        m1->lock();
    }
}
Breaking hold-and-wait involves acquiring all locks at once, atomically, under a global "prevention" lock that guards lock acquisition:
prevention->lock();
lock_1->lock();
lock_2->lock();
prevention->unlock();
This approach is fairly inefficient and breaks encapsulation, so it is not as popular as applying an ordering to locks.
We could also break the no-preemption condition with trylock: a thread grabs the first lock, tries to grab the second, and if that fails, releases the first and starts over.
top:
    pthread_mutex_lock(L1);
    if (pthread_mutex_trylock(L2) != 0) {
        pthread_mutex_unlock(L1);
        goto top;
    }
However, this can lead to livelock: two threads could repeatedly try to grab both locks and keep failing, so neither deadlocks, but neither makes progress. Adding a random delay before retrying makes this repeated interference unlikely. This approach is also inefficient and less common.
Herlihy noted that one could design data structures without locks at all, making lock-free and wait-free data structures. All this requires is a hardware compare-and-swap (CAS) routine, which we can use to build other routines, like an atomic increment.
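As a sketch of that idea, here is an increment built on CAS. It assumes a CompareAndSwap(address, expected, new) routine that returns 1 on success and 0 on failure (matching the list-insertion code below); here it is emulated with GCC's __sync_bool_compare_and_swap builtin:

```c
// Sketch only: CompareAndSwap atomically compares *address with expected and,
// if equal, stores new_val; it returns 1 on success, 0 on failure.
// Emulated here with GCC's __sync_bool_compare_and_swap builtin.
static int CompareAndSwap(int *address, int expected, int new_val) {
    return __sync_bool_compare_and_swap(address, expected, new_val);
}

void AtomicIncrement(int *value, int amount) {
    int old;
    do {
        old = *value;  // read the current value
    } while (CompareAndSwap(value, old, old + amount) == 0);  // retry if it changed underneath us
}
```

No lock is ever held; if another thread updates the value between the read and the CAS, the CAS fails and the loop simply retries.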
List insertion would look like this, and be lock-free:

void insert(int value) {
    node_t *n = malloc(sizeof(node_t));
    assert(n != NULL);
    n->value = value;
    do {
        n->next = head;
    } while (CompareAndSwap(&head, n->next, n) == 0);
}
For deadlock avoidance via scheduling, assume there are four threads and two locks. T1 and T2 both grab L1 and L2, T3 grabs only L2, and T4 grabs neither:
|    | T1  | T2  | T3  | T4 |
|----|-----|-----|-----|----|
| L1 | yes | yes | no  | no |
| L2 | yes | yes | yes | no |
A scheduler might figure out that as long as T1 and T2 never run at the same time (for example, by scheduling them on the same CPU, one after the other), deadlock cannot occur, and it would schedule them accordingly.
This requires static knowledge of which locks each thread might grab, which some systems have but which is not widely applicable.
Finally, detect and recover: just restart your computer when deadlock occurs.