Hi -
The next installment from pcpfans.git fche/multithreading.
commit 7a5b2f9963e050f9aaa374d06a7f5c8d600bc0fa
Author: Frank Ch. Eigler <fche@xxxxxxxxxx>
Date: Sun Apr 24 15:17:00 2016 -0400
PR1055: handle some multithreaded deadlocks & race conditions
While running the qa/4751 test case at full scale, deadlocks reliably
occur. (In fact, the 4751.out file was initially checked in truncated
due to an alarm() catching the deadlocked run, producing no output.)
The same type of deadlock is also easily demonstrated on stock
previous-version libpcp, so it exculpates the recent pmNewContext
multithreading changes.
The valgrind "helgrind" tool is good at identifying problems of this
nature, and should be routinely used for verifying code that deals
with PM_*LOCK.
The gist of one problem is inconsistent lock ordering. The libpcp
lock is sometimes taken nested within a context c_lock; and sometimes
vice versa. Two threads can easily lock each other out. helgrind
showed multiple different scenarios where the libpcp lock was taken
unnecessarily by lower level code - where a smaller lock was
sufficient. This patchset adds a handful of small, non-recursive
locks for these.
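
To illustrate the lock-ordering point, here is a minimal sketch in plain pthreads. It is not the libpcp code: big_lock, seq_lock, struct ctx, next_seq and run_workers are hypothetical names standing in for the libpcp lock, one of the new small locks, and a context with its c_lock. Path A nests the context lock inside the big lock. Path B holds the context lock and needs shared state, but takes only a small leaf lock, so it never waits for the big lock while holding c_lock.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

/* Hypothetical stand-ins; none of these are libpcp symbols. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* role of the libpcp lock */
static pthread_mutex_t seq_lock = PTHREAD_MUTEX_INITIALIZER; /* small, non-recursive leaf lock */
static long seq;                                             /* shared state guarded by seq_lock */

struct ctx {
    pthread_mutex_t c_lock;   /* role of a context c_lock */
    long last_seq;
};

static struct ctx shared_ctx = { PTHREAD_MUTEX_INITIALIZER, 0 };
static long iterations_per_thread;

/* Low-level helper: takes only the leaf lock and acquires nothing else
 * while holding it, so it is safe to call with any other lock held. */
static long next_seq(void)
{
    long v;

    pthread_mutex_lock(&seq_lock);
    v = ++seq;
    pthread_mutex_unlock(&seq_lock);
    return v;
}

static void *worker(void *arg)
{
    struct ctx *c = arg;

    for (long i = 0; i < iterations_per_thread; i++) {
        /* Path A: big lock first, context lock nested inside it. */
        pthread_mutex_lock(&big_lock);
        pthread_mutex_lock(&c->c_lock);
        pthread_mutex_unlock(&c->c_lock);
        pthread_mutex_unlock(&big_lock);

        /* Path B: context lock held, helper needs shared state.
         * If next_seq() took big_lock here, A and B could deadlock. */
        pthread_mutex_lock(&c->c_lock);
        c->last_seq = next_seq();
        pthread_mutex_unlock(&c->c_lock);
    }
    return NULL;
}

/* Runs nthreads workers against one shared context; returns the final count. */
long run_workers(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    seq = 0;
    iterations_per_thread = iterations;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &shared_ctx) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return seq;
}
```

Running this sketch under valgrind --tool=helgrind should produce no lock-order reports, whereas a variant in which next_seq() takes big_lock would be expected to trigger one.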
This patch also includes a fix to a nastier race condition in
__pmHandleToPtr(), whereby a context-destruction could race against
context-structure lookup. Some work remains in the multi-archive code
and elsewhere to avoid two mildly racy functions (__pmPtrToHandle and
the new __pmHandleToPtr_unlocked).
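
The lookup-versus-destruction race can be sketched as follows. This is one possible way to close such a race, under assumptions, and not necessarily what the patch does: table_lock, struct ctx, ctx_create, handle_to_ptr, ctx_release and ctx_destroy are all hypothetical names. The idea is that the lookup acquires the context's own lock before it drops the table lock, so a concurrent destroy cannot free the structure in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_CTX 8

/* Hypothetical stand-ins; none of these are libpcp symbols. */
struct ctx {
    pthread_mutex_t c_lock;
    int value;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct ctx *table[MAX_CTX];   /* guarded by table_lock */

/* Returns a new handle, or -1 if allocation fails or the table is full. */
int ctx_create(int value)
{
    struct ctx *c = malloc(sizeof *c);

    if (c == NULL)
        return -1;
    pthread_mutex_init(&c->c_lock, NULL);
    c->value = value;
    pthread_mutex_lock(&table_lock);
    for (int h = 0; h < MAX_CTX; h++) {
        if (table[h] == NULL) {
            table[h] = c;
            pthread_mutex_unlock(&table_lock);
            return h;
        }
    }
    pthread_mutex_unlock(&table_lock);
    pthread_mutex_destroy(&c->c_lock);
    free(c);
    return -1;
}

/* Returns the context with its c_lock HELD, or NULL for a bad handle.
 * c_lock is taken before table_lock is released: that closes the window
 * in which a destroyer could free the structure. */
struct ctx *handle_to_ptr(int handle)
{
    struct ctx *c = NULL;

    pthread_mutex_lock(&table_lock);
    if (handle >= 0 && handle < MAX_CTX && table[handle] != NULL) {
        c = table[handle];
        pthread_mutex_lock(&c->c_lock);
    }
    pthread_mutex_unlock(&table_lock);
    return c;
}

/* Drops the c_lock obtained from handle_to_ptr(). */
void ctx_release(struct ctx *c)
{
    assert(c != NULL);
    pthread_mutex_unlock(&c->c_lock);
}

/* Unpublishes the handle, waits for any current user, then frees.
 * Returns 0 on success, -1 for a bad or already-destroyed handle. */
int ctx_destroy(int handle)
{
    struct ctx *c;

    pthread_mutex_lock(&table_lock);
    if (handle < 0 || handle >= MAX_CTX || table[handle] == NULL) {
        pthread_mutex_unlock(&table_lock);
        return -1;
    }
    c = table[handle];
    table[handle] = NULL;              /* no new lookup can find it */
    pthread_mutex_lock(&c->c_lock);    /* wait until the current user releases */
    pthread_mutex_unlock(&table_lock);
    pthread_mutex_unlock(&c->c_lock);
    pthread_mutex_destroy(&c->c_lock);
    free(c);
    return 0;
}
```

The lock order in this sketch is always table_lock first, then c_lock. A caller that already holds a c_lock must therefore release it before calling handle_to_ptr() or ctx_destroy(), otherwise the ordering problem described earlier reappears.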
qa/4751 and all other preexisting thread-group test cases look good
now: no more deadlocks or lock-ordering-error reports there, at least.
(There are likely more hiding in the code: the libpcp lock is way
overused.)
commit 3f4115d95778e4594361dea8cfaa5caff6d81086
Author: Frank Ch. Eigler <fche@xxxxxxxxxx>
Date: Sun Apr 24 14:55:25 2016 -0400
multithreading qa/4751
Tweak the qa/4751 test case so that different unreachable-host type
error codes are mapped to a uniform one. Generate an actual proper
output for the last test (the one with some 156 contexts/threads).