
Re: PCP Updates: Active Probing for __pmDiscoverServices() / pmfind

To: Dave Brolley <brolley@xxxxxxxxxx>
Subject: Re: PCP Updates: Active Probing for __pmDiscoverServices() / pmfind
From: fche@xxxxxxxxxx (Frank Ch. Eigler)
Date: Mon, 19 May 2014 17:57:02 -0400
Cc: PCP Mailing List <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <5373D0D2.5090902@xxxxxxxxxx> (Dave Brolley's message of "Wed, 14 May 2014 16:23:46 -0400")
References: <5373D0D2.5090902@xxxxxxxxxx>
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)
Hi, Dave -

> The commits below have been pushed to my brolley/dev branch in the
> pcpfans repository. They collectively represent an implementation of
> an active probing mode for __pmDiscoverServices() / pmfind.

Thank you, great start.

  
> [...]  In order that the probing be completed in a reasonable amount
> of time, each connection is attempted on its own thread (on
> platforms that support pthreads). [...]

The way you did this appears to work, but a few things could simplify 
it a lot:

Instead of creating one brand new thread per connection poll, consider
creating a persistent pool.  

One shared struct context for all the threads could look like this:


  struct context {
    struct timeval timeout_expiry;
    __pmSockAddr *last_address;

    __pmSockAddr *next_address;
    pthread_mutex_t next_address_lock;

    char ***urls; /* etc., for collecting outputs */
    pthread_mutex_t urls_lock;
  };
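
A minimal sketch of how the timeout_expiry field could be filled in and
tested by the loop's "timeout" condition.  The helper names
deadline_after_ms() and deadline_passed() are made up for illustration
and are not existing PCP functions; only gettimeofday() and struct
timeval are assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: absolute expiry time = now + timeout_ms. */
static struct timeval deadline_after_ms(long timeout_ms)
{
    struct timeval now, expiry;

    gettimeofday(&now, NULL);
    expiry.tv_sec = now.tv_sec + timeout_ms / 1000;
    expiry.tv_usec = now.tv_usec + (timeout_ms % 1000) * 1000;
    if (expiry.tv_usec >= 1000000) {    /* carry microseconds into seconds */
        expiry.tv_sec += 1;
        expiry.tv_usec -= 1000000;
    }
    return expiry;
}

/* Hypothetical helper: true once the current time has reached the expiry. */
static bool deadline_passed(const struct timeval *expiry)
{
    struct timeval now;

    gettimeofday(&now, NULL);
    if (now.tv_sec != expiry->tv_sec)
        return now.tv_sec > expiry->tv_sec;
    return now.tv_usec >= expiry->tv_usec;
}
```

Storing an absolute expiry once, rather than a per-thread countdown,
means every thread can test the same value without any locking, since
it is never written after initialization.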


Each thread could have a main function consisting of:

  while (! timeout) {
         lock (next_address_lock);
         if (next_address == NULL) { /* search was completed by someone else */
               unlock (next_address_lock);
               break;
         }
         save next_address for self;
         next_address = __pmSockAddrNextSubnetAddr (next_address ...);
                        /* might become NULL */
         unlock (next_address_lock);

         attemptConnection on saved address;
         /* IMHO, don't even bother with a retry loop; let TCP do that. */

         if (successful) {
            lock (urls_lock);
            __pmAddDiscoveredService (urls);
            unlock (urls_lock);
         }
   }


The idea would be that the threads keep contending for next_address
and for the urls list for as long as there's time available and the
search is not yet finished.  This would make the
dispatchConnection() function largely moot.  No <sem.h> stuff either.

The probeForServices() function could instead initialize the single
context struct, create the thread-pool, and just call the thread work
function itself in a loop until that exits (i.e., main thread can race
with the worker threads).  Then it can pthread_join() all the worker
threads for perfection in cleanup, and then return the collected urls
from the context.

This logic would work even if threading is compiled out, or if some of
the threads couldn't be started (so no need to retry them either).
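
A minimal, self-contained sketch of the scheme, under loud assumptions:
plain integers stand in for __pmSockAddr values, DONE stands in for
next_address == NULL, a caller-supplied probe() callback stands in for
attemptConnection, and a fixed int array stands in for the urls list.
The names probe_range(), worker() and demo_probe() are invented for the
example, and the timeout check is left out to keep it short.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FOUND   256
#define MAX_THREADS 16
#define DONE        (-1)    /* plays the role of next_address == NULL */

struct context {
    int last_address;
    int next_address;
    pthread_mutex_t next_address_lock;
    int found[MAX_FOUND];   /* stand-in for the urls list */
    size_t n_found;
    pthread_mutex_t found_lock;
    bool (*probe)(int address);
};

/* Work function shared by the pool threads and the calling thread. */
static void *worker(void *arg)
{
    struct context *ctx = arg;

    for (;;) {
        int mine;

        /* Claim one address and advance the shared cursor. */
        pthread_mutex_lock(&ctx->next_address_lock);
        mine = ctx->next_address;
        if (mine != DONE)
            ctx->next_address = (mine < ctx->last_address) ? mine + 1 : DONE;
        pthread_mutex_unlock(&ctx->next_address_lock);
        if (mine == DONE)
            break;              /* search was completed by someone else */

        if (ctx->probe(mine)) {
            pthread_mutex_lock(&ctx->found_lock);
            if (ctx->n_found < MAX_FOUND)
                ctx->found[ctx->n_found++] = mine;
            pthread_mutex_unlock(&ctx->found_lock);
        }
    }
    return NULL;
}

/* Stand-in probe: pretends every tenth address answers. */
static bool demo_probe(int address)
{
    return address % 10 == 0;
}

/* Probes first..last (non-negative); returns how many hits were copied to out. */
static size_t probe_range(int first, int last, bool (*probe)(int),
                          int n_threads, int *out, size_t out_len)
{
    struct context ctx;
    pthread_t threads[MAX_THREADS];
    int started = 0;
    size_t i, n;

    ctx.last_address = last;
    ctx.next_address = (first <= last) ? first : DONE;
    ctx.n_found = 0;
    ctx.probe = probe;
    pthread_mutex_init(&ctx.next_address_lock, NULL);
    pthread_mutex_init(&ctx.found_lock, NULL);

    if (n_threads > MAX_THREADS)
        n_threads = MAX_THREADS;
    /* A thread that fails to start is skipped; the rest pick up its share. */
    for (int t = 0; t < n_threads; t++)
        if (pthread_create(&threads[started], NULL, worker, &ctx) == 0)
            started++;

    worker(&ctx);               /* calling thread races with the pool */

    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    n = ctx.n_found < out_len ? ctx.n_found : out_len;
    for (i = 0; i < n; i++)
        out[i] = ctx.found[i];
    pthread_mutex_destroy(&ctx.next_address_lock);
    pthread_mutex_destroy(&ctx.found_lock);
    return n;
}
```

Calling probe_range(1, 100, demo_probe, 4, out, MAX_FOUND) collects the
ten multiples of ten in whatever order the threads find them; passing 0
for n_threads gives the same set using only the calling thread, which is
the "threading compiled out" case.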


- FChE
