Just so we don't forget, there are a number of race conditions and
similar problems in the multi-threading support in libpcp.
Since we have no known multi-threaded clients other than a few
pcpqa synthetic tests, these are not high-priority problems.
They are listed here just to avoid being forgotten.
http://oss.sgi.com/archives/pcp/2014-03/msg00124.html
- access.c, getClientIds(), a mismatched PM_UNLOCK in the loop around
line 1295. (By the way, the 'myhostid' variable is probably a
symptom of more inappropriate FQDN assumptions that will need to be
fixed.)
- context.c, pmNewContext(), contexts*, old_*context* used unprotected
in FAILED: path.
- context.c, pmDupContext(), quite possibly unsafe if another thread
is manipulating the oldcon at the same time; by the way, is that
condition (same pmUseContext by different threads) detected /
forbidden / permitted? The __pmHandleToPtr() ctxp->c_lock design
sounds fine, but not all users of the context structure go through it.
- context.c, __pmHandleToPtr(): race condition between the libpcp unlock
and the context lock (what if another thread deletes the context
during this time?). (Flipping the locks around could maybe cause
deadlocks OTOH; __pmLogFetchInterp nests the libpcp lock within the
ctxp lock already.)
- interp.c cache_read: race condition between PM_UNLOCK and use of
lfup pointer; another thread may have nuked that cache[] slot in the
meantime; this general pattern recurs in several places in the code
(look for UNLOCK followed by a return FOO->BAR, e.g. logmeta.c)
- util.c: using libpcp lock for all kinds of printing, possibly
nesting within context locks -> possible deadlock
- loop.c: no locking at all
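
The "UNLOCK followed by a return FOO->BAR" pattern noted for interp.c
can be illustrated with a small standalone sketch. This is not libpcp
code: slot_t, cache_put(), cache_name_racy() and cache_name_copy() are
hypothetical names, and a plain pthread mutex stands in for the libpcp
lock. The racy variant hands back a pointer into the shared table after
unlocking; the safer variant copies what the caller needs while the
lock is still held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NSLOT 4

/* Hypothetical cache slot; NOT the real libpcp structure. */
typedef struct {
    int  in_use;
    int  id;
    char name[32];
} slot_t;

static slot_t cache[NSLOT];
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Fill (or overwrite) slot idx while holding the lock. */
void cache_put(int idx, int id, const char *name)
{
    assert(idx >= 0 && idx < NSLOT);
    pthread_mutex_lock(&cache_lock);
    cache[idx].in_use = 1;
    cache[idx].id = id;
    snprintf(cache[idx].name, sizeof(cache[idx].name), "%s", name);
    pthread_mutex_unlock(&cache_lock);
}

/*
 * Racy: the returned pointer refers to shared storage, but the lock
 * is already dropped, so another thread may reuse the slot before
 * (or while) the caller reads it.
 */
const char *cache_name_racy(int id)
{
    const slot_t *sp = NULL;

    pthread_mutex_lock(&cache_lock);
    for (int i = 0; i < NSLOT; i++) {
        if (cache[i].in_use && cache[i].id == id) {
            sp = &cache[i];
            break;
        }
    }
    pthread_mutex_unlock(&cache_lock);
    return sp ? sp->name : NULL;    /* sp may be stale by now */
}

/*
 * Safer: copy the data out under the lock, so the caller only ever
 * touches its own buffer. Returns 1 if found, 0 otherwise.
 */
int cache_name_copy(int id, char *buf, size_t buflen)
{
    int found = 0;

    assert(buf != NULL && buflen > 0);
    pthread_mutex_lock(&cache_lock);
    for (int i = 0; i < NSLOT; i++) {
        if (cache[i].in_use && cache[i].id == id) {
            snprintf(buf, buflen, "%s", cache[i].name);
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&cache_lock);
    return found;
}
```

Copying under the lock is only one option; reference counting the
slot, or keeping the lock held until the caller is done, would also
close the window, at the cost of longer hold times or more bookkeeping.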