Re: [Bug 1104] New: signal delivery may lead to deadlock

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: [Bug 1104] New: signal delivery may lead to deadlock
From: fche@xxxxxxxxxx (Frank Ch. Eigler)
Date: Mon, 16 Feb 2015 09:29:39 -0500
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <54E1116B.90003@xxxxxxxxxxxxxxxx> (Ken McDonell's message of "Mon, 16 Feb 2015 08:36:43 +1100")
References: <bug-1104-835@xxxxxxxxxxxxxxxx/bugzilla/> <y0miof455g0.fsf@xxxxxxxx> <54E1116B.90003@xxxxxxxxxxxxxxxx>
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)
Ken McDonell <kenj@xxxxxxxxxxxxxxxx> writes:

> [...]
>> Yeah, it's a textbook example.  I was thinking at one point that the
>> AF* functionality could be recast as something occurring synchronously
>> rather than asynchronously.  For example, we could redefine the
>> __pmAF* functions so that callbacks occur only as/after libpcp
>> functions are about to return to the application.  (A new __pmAFwait()
>> could serve pmlogger's non-busy-looping.)
>
> Maybe.  But pmlogger is sitting in select waiting for an AF timer
> event _or_ i/o on the control port (via pmlc) ... not sure how a
> __pmAFwait() could be helpful here.

Right; alternatives would be needed for integration into fd-select-ish
event loops.  The thought was that if a relatively-unrestricted
callback functionality is valuable to multiple areas of pcp, then a
beefed-up AF facility would be a reasonable place to provide that.
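One common way to fold a signal-driven timer into an fd-select-ish loop
is the "self-pipe" technique, sketched below.  This is only an
illustration under stated assumptions, not libpcp code: the names
wakeup_init(), wait_for_event(), EV_TIMER and EV_CONTROL are
hypothetical, and control_fd merely stands in for something like
pmlogger's control port.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical event bits returned by wait_for_event(). */
enum { EV_TIMER = 1, EV_CONTROL = 2 };

static int wake_pipe[2] = { -1, -1 };

/* Handler: write() is async-signal-safe; errno is saved and restored. */
static void on_timer_signal(int sig)
{
    int saved = errno;
    char b = 1;
    (void)sig;
    if (write(wake_pipe[1], &b, 1) < 0) {
        /* pipe full: a wakeup is already queued, nothing to do */
    }
    errno = saved;
}

/* Create the non-blocking pipe and install the handler for signo. */
int wakeup_init(int signo)
{
    struct sigaction sa;
    int i, fl;

    if (pipe(wake_pipe) < 0)
        return -1;
    for (i = 0; i < 2; i++) {
        fl = fcntl(wake_pipe[i], F_GETFL);
        if (fl < 0 || fcntl(wake_pipe[i], F_SETFL, fl | O_NONBLOCK) < 0)
            return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_timer_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Block until the timer fired and/or control_fd (if >= 0) is readable.
 * Returns a mask of EV_* bits, or -1 on a select() error. */
int wait_for_event(int control_fd)
{
    fd_set rfds;
    char buf[64];
    int maxfd, n, result = 0;

    for (;;) {
        /* Rebuild the set each time; it is unspecified after an error. */
        FD_ZERO(&rfds);
        FD_SET(wake_pipe[0], &rfds);
        maxfd = wake_pipe[0];
        if (control_fd >= 0) {
            FD_SET(control_fd, &rfds);
            if (control_fd > maxfd)
                maxfd = control_fd;
        }
        n = select(maxfd + 1, &rfds, NULL, NULL, NULL);
        if (n >= 0)
            break;
        if (errno != EINTR)
            return -1;
    }
    if (FD_ISSET(wake_pipe[0], &rfds)) {
        while (read(wake_pipe[0], buf, sizeof(buf)) > 0)
            ;   /* drain queued wakeups */
        result |= EV_TIMER;
    }
    if (control_fd >= 0 && FD_ISSET(control_fd, &rfds))
        result |= EV_CONTROL;
    return result;
}
```

With this shape, the timer callback work runs in the main loop after
wait_for_event() reports EV_TIMER, so it is free to call non-reentrant
functions.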


> Anyway, for the specific case of pmlogger, I've fixed this in the
> proper POSIX.1 way, and made all of the AF callback routines
> restricted to be async-signal-safe (indeed they call _no_ other
> routines).  All of the work is deferred to the main loop once the
> global state change(s) are noticed.

Yes, that roughly works - using the pmAF machinery only to
store/recall a data value (the afid parameter).  It doesn't solve the
posix-signal-unsafe problems -within- the AF code, such as the memory
allocation and fprintf stuff in onalarm().
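
As a rough sketch of the deferral pattern described in the quoted text
(not the actual pmlogger change, and not using the __pmAF* API), the
handler below only sets a flag and the main loop does the real work.
The names alarm_pending, install_alarm_handler() and
run_deferred_work() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag; sig_atomic_t is the type POSIX allows a handler
 * to write safely. */
static volatile sig_atomic_t alarm_pending = 0;

/* Handler: records the event and calls no other routines. */
static void on_alarm(int sig)
{
    (void)sig;
    alarm_pending = 1;
}

/* Install the handler for SIGALRM; returns 0 on success, -1 on error. */
int install_alarm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Called from the main loop, in normal (non-signal) context.
 * Returns 1 if an alarm was pending (and runs work, if given), else 0. */
int run_deferred_work(void (*work)(void))
{
    if (!alarm_pending)
        return 0;
    /* Clear before working so an alarm arriving during work() is kept. */
    alarm_pending = 0;
    if (work != NULL)
        work();     /* may allocate memory, call fprintf, etc. */
    return 1;
}
```

Note that this only makes the application's own callback safe; any
unsafe calls made by the code that dispatches the signal would still
need separate treatment.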


> [...]
> perl/PMDA/local.c - this is an issue, see bug#1069
> pmdas/bash/bash.c - safe
> pmdas/linux_proc/proc_pid.c - would need a libpcp_pmda change to allow
> some safe but async hook from pmdaMain() back into the pmda code, or
> use its own main loop, like some other pmdas do (e.g. logger, trace)
> pmdas/hotproc/src/hotproc.c - dead code
> pmdas/logger/logger.c - safe
> pmdas/papi/papi.c - same as linux_proc
> pmdas/trace/src/trace.c and pmdas/trace/src/comms.c - has own main
> loop, could be fixed easily
> pmlogger/... - all done

Thank you for the analysis.


- FChE
