Re: [pcp] suitability of PCP for event tracing

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>, Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: [pcp] suitability of PCP for event tracing
From: nathans@xxxxxxxxxx
Date: Wed, 15 Sep 2010 09:52:52 +1000 (EST)
Cc: pcp@xxxxxxxxxxx, systemtap@xxxxxxxxxxxxxxxxxx, Max Matveev <makc@xxxxxxxxx>
In-reply-to: <370101302.980391284507268422.JavaMail.root@xxxxxxxxxxxxxxxxxx>
Sender: nscott@xxxxxxxxxx
----- "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx> wrote:

> >   kenj>  Some of the suggestions to date include ...

I'd like to inject another couple of suggestions that are of
particular interest to me.

1. I'd like to see a capability for "end-to-end" tracing with
   a parent/child notion as in several other distributed trace
   frameworks.  This notion calls for a unique identifier to
   be associated with each trace, and each trace to either be
   the root of a graph, and/or have the id of a parent trace
   associated with it, with the option of these ids being able
   to move across hosts (when supporting instrumentation is in
   place).  Not clear whether this (the unique ID part, I mean)
   should be enforced for all traces or be opt-in.  The current
   libpcp_trace could get a head transplant to support this,
   the current code there is basically superseded by libpcp_mmv
   now anyway, so a rethink there would be timely.

   These papers cover the sort of capability that I would like
   to see us tackle...
http://www.cs.berkeley.edu/~rfonseca/pubs/xtr-nsdi07.pdf
http://sns.cs.princeton.edu/docs/xtrace-inm10.pdf
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/papers/dapper-2010-1.pdf
http://msdn.microsoft.com/en-us/magazine/cc163437.aspx
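
To make item 1 a bit more concrete, here is a rough sketch of the
parent/child bookkeeping.  It is Python purely for brevity, and every
name in it (TraceRecord, child, children_of, walk) is made up for
illustration; none of this exists in libpcp_trace or libpcp_mmv.  It
also assumes every trace carries a unique id, which is exactly the
point left open above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class TraceRecord:
    """Hypothetical trace record; not an existing PCP structure."""
    name: str
    host: str
    # Assumption: a random 128-bit id is unique enough across hosts.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # None marks the root of a trace graph.
    parent_id: Optional[str] = None

    def child(self, name: str, host: Optional[str] = None) -> "TraceRecord":
        """Create a trace whose parent is this one, possibly on another host."""
        return TraceRecord(name=name, host=host or self.host,
                           parent_id=self.trace_id)


def children_of(records: List[TraceRecord]) -> Dict[Optional[str], List[TraceRecord]]:
    """Index records by parent id so a graph can be walked from its root."""
    index: Dict[Optional[str], List[TraceRecord]] = {}
    for rec in records:
        index.setdefault(rec.parent_id, []).append(rec)
    return index


def walk(root: TraceRecord, index, depth: int = 0) -> Iterator[Tuple[int, TraceRecord]]:
    """Yield (depth, record) pairs depth-first, starting at root."""
    yield depth, root
    for rec in index.get(root.trace_id, []):
        yield from walk(rec, index, depth + 1)


if __name__ == "__main__":
    root = TraceRecord("http_request", host="web1")
    db = root.child("db_query", host="db1")   # id carried across hosts
    cache = root.child("cache_lookup")        # stays on web1
    index = children_of([root, db, cache])
    for depth, rec in walk(root, index):
        print("  " * depth + f"{rec.name}@{rec.host}")
```

The only thing that has to travel between hosts here is the parent's
id, which matches the "when supporting instrumentation is in place"
caveat: whatever carries the request across would need to carry that
id too.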

2. Ability to store traces in PCP logs, and have the same APIs
   access them as in live mode, and alongside regular (sampled)
   metric values.  Not clear if the current format is well suited
   for that (might need extensions, additional index,... dunno),
   whether we might need a new metric type for traces so we can
   know to trigger callbacks on fetch, etc.

One other random comment - wrt your code snippet, Frank, it'd
probably be more consistent to do the timeout/interval setting
via pmSetMode.  The other async requests that Greg/Max did do
not have an opaque void* (passthru) parameter either ... so,
just need to think about whether we want consistency or not
there (and whether that concept needs to be available in those
other async calls).

cheers.

-- 
Nathan
