pcp
[Top] [All Lists]

Re: pmServiceDiscoveryInterrupt() commit a8b87e2 et al.

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: pmServiceDiscoveryInterrupt() commit a8b87e2 et al.
From: Dave Brolley <brolley@xxxxxxxxxx>
Date: Wed, 25 Jun 2014 14:04:23 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <20140619194444.3B03D58015@xxxxxxxx>
References: <20140619194444.3B03D58015@xxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
On 06/19/2014 03:44 PM, Frank Ch. Eigler wrote:
Hi, Dave -

I have a couple of concerns about this commit and its friends.
Thanks for the review! Although I have addressed each item with the relevant commits, it would probably be best to review the code after merging all of them, since some aspects of the API evolved as the work items progressed. Also, man pages will be updated once we have agreement on the API.
By making the multithreaded discovery context global (static), the
discovery function is no longer safe to call from a multithreaded
application.  (Same by the way with the globalOptionsInfo.)  The "no
unsafe side-effects" and "single-threaded scope" check-statics
observations are mistaken, as indeed two concurrent threads calling
pmDiscoverServices() will overwrite and corrupt each other.
I have eliminated the static variables from the discovery code of bot the avahi and probe mechanisms. See these commits:

commit 5b37c891470533dae22c2cf5a379a3d6c499e3cf
Author: Dave Brolley <brolley@xxxxxxxxxx>
Date:   Tue Jun 24 17:06:55 2014 -0400

Eliminate statics from the code for service discovyer via active probing.

    New connectionOptions now on the stack and passed by address to the
    work functions.

commit cae766bb3a39dec02af1b83f916e87c12580c8cc
Author: Dave Brolley <brolley@xxxxxxxxxx>
Date:   Tue Jun 24 16:26:10 2014 -0400

    Eliminate static globalOptionsInfo for service discovery.

    Now on the stack and is passed down (by address) to the
    mechanism-specific APIs. The caller-supplied pointer to the
    interrupted flag is now also kept here.


I see why this approach seemed right, considering your idea for the
pmDiscoveryInterrupt() function, which has no way to identify the
operation in progress.  When I briefly considered this problem, I
thought perhaps you could give (a new version of) pmDiscoverServices
an sig_atomic_t* parameter, which the various threads would
periodically dereference, and which another thread or signal handler
of the embedding application could set.  Interrupting then would not
need to be a function call, just a change of that pointed-to value
(almost exactly the "interrupted" flag from your new pmfind.c).  If
you adopt this, you could make the discovery-context back into a
stack-resident thread-safe object.  For that matter, all the statics
in src/probe.c should be stack/context-resident for the same reason.
Also done. See:

commit 0b14abd3b59e35804efdecf791374f273dfb543d
Author: Dave Brolley <brolley@xxxxxxxxxx>
Date:   Tue Jun 24 15:36:12 2014 -0400

    New pmDiscoverServicesAdvanced(1) API function.

    Adds a global options string and a pointer to an interrupt flag.
    All callers updated.
Regarding __pmAvahiServiceDiscoveryInterrupt() being a no-op, that is
probably not necessarily a good idea.  We can/will get much longer
timeouts than 0.5 seconds, because incomplete results have been
observed in the wild already.  If we are to replace the function-based
interruptor with a flag-based one, perhaps it can be made to work by
having avahi.c's avahi_simple_poll_loop() be replaced with a looped
call to avahi_simple_poll_iterate() with a short timeout, aborting if
the interrupt flag was set.
This worked out exactly as you proposed. See:

commit 48b9b0574f609f4cd167bb40bcf6eb724cff45c8
Author: Dave Brolley <brolley@xxxxxxxxxx>
Date:   Wed Jun 25 11:26:47 2014 -0400

    Implement interruptable avahi service discovery.

    Call avahi_simple_poll_iterate() in a loop and check for
    interruption on each iteration.
Also we need to keep in mind a general concern about changing the API
of a function (adding the globalOptions parameter) that's already in
the ABI (pmDiscoverServices): it cannot be done as simply as per
commit #1380d1bb.  Old sources won't compile, and worse, old binaries
against the new library will crash.  The function would need a new
name (or else wily tricks with symbol-versioning/aliasing would be
called for, but IMHO let's not go there).

If this "globalOptions" field is truly necessary, then it'll have to
be passed to the renamed function only.  Have you considered simply
parsing global options out of a single mechanism string?  Or just
replicating the "global" parsing logic in the two mechanisms we
currently have?  Though OTOH if you were to add that interrupt flag I
propose, then there's API/ABI breakage anyway.
There is now a new expanded API, pmDiscoverServicesAdvanced(1) -- name is not set in stone should anyone be able to think of a better one. See commit 0b14abd3b59e35804efdecf791374f273dfb543d above.

Re. the resolution logic and thread stack size connection, could the
probe.c main thread do the resolution in a loop at the end, rather
than invoking the resolver and whatnot from within each thread?  It
would slow things down, sure, but we wouldn't have to rely on hope
that a twice-minimum stack is always going to be enough.  (At least
we're fortunate that getnameinfo(), the heavy lifter behind
__pmGetNameInfo(), is supposed to be thread-safe as per POSIX docs.
We can't assume such things without checking.)
I looked into this and have left it alone for the moment. I would like to revisit the need for this once I have worked on the balance between the number of threads and the connection timeout (i.e. the performance discussion we had on IRC). No matter what, we will always have to be aware of the amount of stack needed by the worker threads, but if we arrive at a reasonable number in the end, then this may become less of a concern. For now, we use the larger stack size only when resolving the addresses. See:

commit e77cc955bb755bb9f36727f737a701d081784377
Author: Dave Brolley <brolley@xxxxxxxxxx>
Date:   Wed Jun 25 11:49:30 2014 -0400

    Use the minimum required stack for service discovery worker threads.

    Tailor the stack size to the given discovery options.

Dave

<Prev in Thread] Current Thread [Next in Thread>