pcp
[Top] [All Lists]

pmServiceDiscoveryInterrupt() commit a8b87e2 et al.

To: brolley@xxxxxxxxxx
Subject: pmServiceDiscoveryInterrupt() commit a8b87e2 et al.
From: fche@xxxxxxxxxx (Frank Ch. Eigler)
Date: Thu, 19 Jun 2014 15:44:43 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
Hi, Dave -

I have a couple of concerns about this commit and its friends.

By making the multithreaded discovery context global (static), the
discovery function is no longer safe to call from a multithreaded
application.  (Same by the way with the globalOptionsInfo.)  The "no
unsafe side-effects" and "single-threaded scope" check-statics
observations are mistaken, as indeed two concurrent threads calling
pmDiscoverServices() will overwrite and corrupt each other.

I see why this approach seemed right, considering your idea for the
pmDiscoveryInterrupt() function, which has no way to identify the
operation in progress.  When I briefly considered this problem, I
thought perhaps you could give (a new version of) pmDiscoverServices
an sig_atomic_t* parameter, which the various threads would
periodically dereference, and which another thread or signal handler
of the embedding application could set.  Interrupting then would not
need to be a function call, just a change of that pointed-to value
(almost exactly the "interrupted" flag from your new pmfind.c).  If
you adopt this, you could make the discovery-context back into a
stack-resident thread-safe object.  For that matter, all the statics
in src/probe.c should be stack/context-resident for the same reason.

Regarding __pmAvahiServiceDiscoveryInterrupt() being a no-op, that is
probably not necessarily a good idea.  We can/will get much longer
timeouts than 0.5 seconds, because incomplete results have been
observed in the wild already.  If we are to replace the function-based
interruptor with a flag-based one, perhaps it can be made to work by
having avahi.c's avahi_simple_poll_loop() be replaced with a looped
call to avahi_simple_poll_iterate() with a short timeout, aborting if
the interrupt flag was set.

Also we need to keep in mind a general concern about changing the API
of a function (adding the globalOptions parameter) that's already in
the ABI (pmDiscoverServices): it cannot be done as simply as per
commit #1380d1bb.  Old sources won't compile, and worse, old binaries
against the new library will crash.  The function would need a new
name (or else wily tricks with symbol-versioning/aliasing would be
called for, but IMHO let's not go there).  

If this "globalOptions" field is truly necessary, then it'll have to
be passed to the renamed function only.  Have you considered simply
parsing global options out of a single mechanism string?  Or just
replicating the "global" parsing logic in the two mechanisms we
currently have?  Though OTOH if you were to add that interrupt flag I
propose, then there's API/ABI breakage anyway.

Re. the resolution logic and thread stack size connection, could the
probe.c main thread do the resolution in a loop at the end, rather
than invoking the resolver and whatnot from within each thread?  It
would slow things down, sure, but we wouldn't have to rely on hope
that a twice-minimum stack is always going to be enough.  (At least
we're fortunate that getnameinfo(), the heavy lifter behind
__pmGetNameInfo(), is supposed to be thread-safe as per POSIX docs.
We can't assume such things without checking.)


- FChE

<Prev in Thread] Current Thread [Next in Thread>