----- "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx> wrote:
> > kenj> Some of the suggestions to date include ...
I'd like to inject another couple of suggestions that are of
particular interest to me.
1. I'd like to see a capability for "end-to-end" tracing with
a parent/child notion as in several other distributed trace
frameworks. This notion calls for a unique identifier to
be associated with each trace, and each trace to be the
root of a graph and/or have the id of a parent trace
associated with it, with the option of these ids being able
to move across hosts (when supporting instrumentation is in
place). Not clear whether this should be enforced for all
traces, the unique ID part I mean, or opt-in. The current
libpcp_trace could get a head transplant to support this,
the current code there is basically superseded by libpcp_mmv
now anyway, so a rethink there would be timely.
These papers cover the sort of capability that I would like
to see us tackle...
http://www.cs.berkeley.edu/~rfonseca/pubs/xtr-nsdi07.pdf
http://sns.cs.princeton.edu/docs/xtrace-inm10.pdf
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/papers/dapper-2010-1.pdf
http://msdn.microsoft.com/en-us/magazine/cc163437.aspx
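To make the parent/child notion in item 1 concrete, here is a rough
sketch of the data model only. Every name in it (Trace, child,
build_graph) is hypothetical and is not part of libpcp_trace or
libpcp_mmv. It assumes a random 128-bit id per trace and that the id
travels between hosts as a plain string.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Trace:
    """Hypothetical trace record: a unique id plus an optional parent id."""
    name: str
    host: str
    parent_id: Optional[str] = None  # None marks the root of a graph
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def child(self, name: str, host: Optional[str] = None) -> "Trace":
        # The parent id is just a string, so it can be carried to another
        # host (assuming supporting instrumentation passes it along).
        return Trace(name=name, host=host or self.host,
                     parent_id=self.trace_id)


def build_graph(traces: List[Trace]) -> Tuple[List[Trace], Dict[str, List[Trace]]]:
    """Split traces into roots and a parent-id -> children mapping."""
    roots: List[Trace] = []
    children: Dict[str, List[Trace]] = {}
    for t in traces:
        if t.parent_id is None:
            roots.append(t)
        else:
            children.setdefault(t.parent_id, []).append(t)
    return roots, children


# Usage: one request on hostA fans out to a query on hostB.
root = Trace("request", "hostA")
db = root.child("db-query", host="hostB")
cache = root.child("cache-lookup")
roots, children = build_graph([root, db, cache])
```

Whether the id is mandatory or opt-in is left open here; in this
sketch every trace gets one.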
2. Ability to store traces in PCP logs, and have the same APIs
access them as in live mode, and alongside regular (sampled)
metric values. Not clear if the current format is well suited
for that (might need extensions, additional index,... dunno),
whether we might need a new metric type for traces so we can
know to trigger callbacks on fetch, etc.
One other random comment - wrt your code snippet, Frank, it'd
probably be more consistent to do the timeout/interval setting
via pmSetMode. The other async requests that Greg/Max did do
not have an opaque void* (passthru) parameter either ... so,
just need to think about whether we want consistency or not
there (and whether that concept needs to be available in those
other async calls).
cheers.
--
Nathan