[pcp] braindump on unified-context / live-logging
Frank Ch. Eigler
fche at redhat.com
Tue Jan 7 19:39:56 CST 2014
Hi -
Here are some notes related to our earlier unified-context ideas [1][2].
As a recap, it's desirable to teach pcp clients to locate pcp data for
arbitrary hosts, and then to easily seek between live & historical
metrics for them.
[1] http://oss.sgi.com/pipermail/pcp/2013-September/003963.html
[2] http://oss.sgi.com/pipermail/pcp/2013-November/004090.html
My proposed approach focuses on archive files and a new server for the
archive files. It might make do without a new PM_CONTEXT_* mode, and
at the PMAPI level just generalize PM_CONTEXT_ARCHIVE & _HOST a little.
1) We'd extend archive files to be usable as a source of "live" data,
so that clients can sort-of-"tail -f" the files to get current info,
or can seek along time with pmSetMode().
1.1) pmlogger needs to learn to write its output in what IIRC kenj
has referred to as "semantic units", i.e., with proper sequencing of
write(2) and fdatasync(2), so that interdependent data reaches the
disk in a consistent order.
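To illustrate the ordering idea in 1.1, here is a minimal sketch in
Python. It is not pmlogger code: the two-file layout (a data file plus
an index of offsets), the INDEX_ENTRY format and the append_record()
name are all assumptions made up for the example. The only point is
that the data is flushed before the index entry that refers to it is
written, so a concurrent reader should never follow an index entry to
bytes that are not there yet.

```python
import os
import struct

# Hypothetical index entry: (offset, length) of one record in the data file.
INDEX_ENTRY = struct.Struct("<QI")

# os.fdatasync is not available on every platform; fall back to os.fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def _write_all(fd, buf):
    """write(2) may be partial, so loop until every byte has been written."""
    view = memoryview(buf)
    while view:
        n = os.write(fd, view)
        view = view[n:]


def append_record(data_fd, index_fd, payload):
    """Append one record as a unit: data first, then its index entry."""
    offset = os.lseek(data_fd, 0, os.SEEK_END)
    _write_all(data_fd, payload)
    _sync(data_fd)  # the data is on disk before anything refers to it
    os.lseek(index_fd, 0, os.SEEK_END)
    _write_all(index_fd, INDEX_ENTRY.pack(offset, len(payload)))
    _sync(index_fd)
    return offset
```

A reader that trusts only what the index points at would then see
either the whole record or nothing, which is roughly what "semantic
unit" is taken to mean here.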
1.2) libpcp needs to learn to read archives that are being written-to.
It should not freak out when the end-of-file is reached, and for
PM_MODE_LIVE, just return the then-freshest measurements and/or
trigger PM_ERR_PMDANOTREADY if clients are fetching faster than the
logger is recording. This could be done without a PM_CONTEXT_UNIFIED
extension, just permitting PM_MODE_LIVE for PM_CONTEXT_ARCHIVE, and
giving clients an option (like the -f for tail) to use that flag.
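The reader side of 1.2 could behave roughly like the sketch below. It
assumes a made-up length-prefixed record format (not the real archive
format) and a hypothetical LiveTail class; libpcp has nothing by that
name. Hitting end-of-file or a half-written record is reported as
"nothing newer yet" (None) rather than as an error, which the caller
could map onto something like PM_ERR_PMDANOTREADY.

```python
import os
import struct

HEADER = struct.Struct("<I")  # hypothetical 4-byte little-endian length prefix


class LiveTail:
    """Read complete records from a file that another process is appending to."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.offset = 0  # start of the next unread record

    def next_record(self):
        """Return the next complete record, or None if it is not there yet."""
        head = os.pread(self.fd, HEADER.size, self.offset)
        if len(head) < HEADER.size:
            return None  # end-of-file or partial header: not an error
        (length,) = HEADER.unpack(head)
        body = os.pread(self.fd, length, self.offset + HEADER.size)
        if len(body) < length:
            return None  # record still being written; retry later
        self.offset += HEADER.size + length
        return body

    def freshest(self):
        """Skip ahead to the newest complete record, as a live fetch would want."""
        latest = None
        while (rec := self.next_record()) is not None:
            latest = rec
        return latest

    def close(self):
        os.close(self.fd)
```

A client given a "tail -f"-like option would call freshest() on each
sampling tick; a None result means it is fetching faster than the
logger is recording.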
2) Because these archive files may not be local to the clients, or
because they may not already contain every metric a client might like,
we need a network server to offer them such tasty treats. Elsewhere
and elsewhen, nathans has ably argued that this shouldn't be pmcd, but
a new server or an extended pmlogger/pmproxy/pmmgr.
2.1) A new server needs to be written, which would monitor some local
archive files, and serve an extended pcp wire protocol for it (one
that includes archive-like pmSetMode operations). It would advertise
the pcp hostname (or pmmgr-style hostid) to the network, so clients
can find the right host data (probably one tcp port per archive, or
else multiplexed over a single tcp port and identifying the host
during startup negotiation). The clients could keep using
PM_CONTEXT_HOST, but would be permitted PM_MODE_BACK etc. so as to
forward -S/-T times to the server, i.e. no requirement for a new
PM_CONTEXT_UNIFIED.
2.2) Clients would be extended with enough discovery logic to find the
network server that has data for the pcp hostname / pmmgr-hostid of
interest. Or, just rely on an extended pmfind:
"pmstat -h `pmfind --hostid HOSTID`" would attach to a server that
has data for that HOSTID. (A bonus complication is having multiple
archives/servers for the same HOSTID, such as with different subsets
of metrics, or for different times: perhaps extra pmfind filtering
params.)
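The "extra filtering params" idea could look something like this
sketch. Everything in it is hypothetical: the Advert record, its
fields, and the select() function are invented for illustration and
are not part of pmfind. It just shows narrowing several candidate
servers for one HOSTID by metric subset and by time coverage.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Advert:
    """Hypothetical advertisement from one archive server."""
    hostid: str
    address: str             # e.g. "host:port", to hand to -h
    metrics: FrozenSet[str]  # metric names the archive contains
    start: float             # first timestamp covered (seconds since epoch)
    end: Optional[float]     # last timestamp, or None if still being logged


def select(adverts: Iterable[Advert], hostid: str,
           metrics: Iterable[str] = (),
           at: Optional[float] = None) -> List[str]:
    """Return addresses of servers holding hostid data that match the filters."""
    wanted = frozenset(metrics)
    hits = []
    for ad in adverts:
        if ad.hostid != hostid:
            continue
        if not wanted <= ad.metrics:
            continue  # archive lacks some requested metric
        if at is not None:
            if at < ad.start or (ad.end is not None and at > ad.end):
                continue  # requested time is outside the archive's coverage
        hits.append(ad.address)
    return hits
```

With no filters this returns every server for the HOSTID, leaving the
choice to the client; with filters it may return none, in which case
2.3 would have to kick in.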
2.3) That server needs to be extended: merged into pmlogger, or
interfaced with pmlc, so as to arrange logging of newly requested live
data that wasn't already set up in the pre-configured set of metrics.
It could heuristically control their logging interval and duration for
multiple clients.
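One possible heuristic for 2.3, offered purely as an assumption: log
each metric at the shortest interval any interested client asked for,
and keep logging it until the latest requested expiry. The
plan_logging() name and the request tuple layout are made up for this
sketch.

```python
def plan_logging(requests):
    """Merge per-client requests into one (interval, until) plan per metric.

    requests is an iterable of (client, metric, interval_sec, until_ts).
    """
    plan = {}
    for _client, metric, interval, until in requests:
        if metric in plan:
            cur_interval, cur_until = plan[metric]
            # Fastest requested sampling wins; logging runs until the
            # last interested client's deadline.
            plan[metric] = (min(cur_interval, interval), max(cur_until, until))
        else:
            plan[metric] = (interval, until)
    return plan
```

The resulting plan is what the server would then push to pmlogger,
e.g. through pmlc, for metrics outside the pre-configured set.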
3) PROFIT!
- FChE