Hi Frank,
Thanks for coming back to this - good to have others pondering it too!
----- Original Message -----
> Hi -
>
> Here are some notes related to our earlier unified-context ideas [1][2].
> As a recap, it's desirable to teach pcp clients to locate pcp data for
> arbitrary hosts, and then to easily seek between live & historical
> metrics for it.
>
> [1] http://oss.sgi.com/pipermail/pcp/2013-September/003963.html
> [2] http://oss.sgi.com/pipermail/pcp/2013-November/004090.html
>
> My proposed approach focuses on archive files and a new server for the
> archive files. It might make do without a new PM_CONTEXT_* mode, and
> at the PMAPI level just generalize PM_CONTEXT_ARCHIVE & _HOST a little.
It's not really clear what the reasons are for not liking a new
context type. The description starts out here with "just" & "a little"
... so I'm thinking perhaps it is perceived as overly complex? By
the end, though, we have a new protocol rev, a new server, and a fair
bit of new code in pmlogger, libpcp, and all the clients - quickly
this has become quite complex too. It's a hard problem.
An approach that requires a server isn't ideal either - often
we will be working with only archives, on systems removed from the
recorded production systems, and it would be ideal if client tools
continued to operate in this isolated way too. Not 100% sure if a
server was *required* with this new approach, but it kinda sounded
like it - I'm not a fan if so.
Having said that, there are areas of overlap ...
>
> 1) We'd extend archive files to be usable as a source of "live" data,
> so that clients can sort-of-"tail -f" the files to get current info,
> or can seek along time with pmSetMode().
...like this bit; this is something that both approaches benefit from.
It's also something that could be more deeply investigated right away,
to determine the extent of the problem. Mark mentioned that the old
pmchart (original SGI version) was able to do this - via handling the
PM_ERR_EOL return code from pmFetch and dealing with it (somehow). A
solution with more help from libpcp would be good, if that's possible;
needs further investigation though.
See bullet point #3 from reference [1] above. Same same.
> 1.1) pmlogger needs to learn to write its output with what IIRC kenj
> has referred to as "semantic units", ie., properly sequenced use of
> write(2) / fdatasync(), to put interdependent data on disk correctly.
It's more complicated than this, though, I think. IIUC, one of the big
differences between what you're describing and the grand-unified-context
theory is that there is no transition-to-live-mode here - everything
has to be read from the log (possibly via an intermediary daemon rather
than directly from the file, but still read-from-log). Is that right?
As an aside, I'm not sure fdatasync will help us here (unless we're
also becoming more concerned with on-disk integrity, but that would
need a whole lot of unrelated work - journalling etc., an on-disk
format change). Both pmlogger (writing) and PMAPI clients (reading)
are accessing the data through the page cache. So file locking might
be needed, and some kind of mechanism where the client can *tell*
pmlogger it needs to flush its buffers - but that's different from
fsync. </aside>
> 1.2) libpcp needs to learn to read archives that are being written-to.
As above, it can ... perhaps it needs to learn more. Needs analysis;
we're guessing a bit as to what state it's in now.
> It should not freak out when the end-of-file is reached, and for
"freak out" == PM_ERR_EOL from pmFetch, I think, but maybe it's more
freaky than that - we don't know, though Mark's anecdotal evidence
suggests it is perhaps not as bad as earlier thought.
> PM_MODE_LIVE, just return the then-freshest measurements and/or
> trigger PM_ERR_PMDANOTREADY if clients are fetching faster than the
> logger is recording.
That doesn't sound much different from what it does now? What the
clients do next, and whether that can be done in libpcp rather than
individually in each tool, is an open question I think.
> This could be done without a PM_CONTEXT_UNIFIED
> extension, just permitting PM_MODE_LIVE for PM_CONTEXT_ARCHIVE, and
> giving clients an option (like the -f for tail) to use that flag.
This doesn't seem to consider two aspects that grand-unified-contexts
are attempting to address:
- the historical data will usually span many archives not just one;
- the pmNewContext(PM_CONTEXT_ARCHIVE) API semantics require a file
path to be passed in specifying the (one) archive to use, and not
a host specifier - I guess we would have to use the hostname from
that archive, then do pmfind-style discovery for servers that know
about that host? (how to authenticate though, which will be needed
for sure if we're talking about modifying the logged set too?)
These do seem to be things that must be solved by whatever approach
we take, though - so perhaps expand on these aspects more for us?
> 2) Because these archive files may not be local to the clients, or
> because they may not already contain every metric a client might like,
> we need a network server to offer them such tasty treats. Elsewhere
> and elsewhen, nathans has ably argued that this shouldn't be pmcd, but
> a new server or an extended pmlogger/pmproxy/pmmgr.
Whatever we decide to do, I'd like to avoid new daemons if we can - we
have six (!) init scripts now; it's getting out of hand. So a pmproxy
extension/rewrite gets my vote if we must talk to a daemon for remote
archive data. We already need to add authentication to pmproxy, and
proxying archives doesn't seem a huge stretch beyond the host proxying
that it already does.
> 2.1) A new server needs to be written, which would monitor some local
> archive files, and serve an extended pcp wire protocol for it (one
> that includes archive-like pmSetMode operations). It would advertise
> the pcp hostname (or pmmgr-style hostid) to the network, so clients
> can find the right host data (probably one tcp port per archive, or
> else multiplexed over a single tcp port and identifying the host
> during startup negotiation). The clients could keep using
> PM_CONTEXT_HOST but permit PM_MODE_BACK etc. to forward -S/-T times -
> ie. no requirement for a new PM_CONTEXT_UNIFIED.
We also need to add discovery to pmproxy (already). We may need to
rework pmproxy to deal with slower I/O needs of serving logs (maybe
thread it, and make it follow a request-callback-completion style);
that will benefit the existing multiple-hosts-proxying as well.
> 2.2) Clients would be extended with enough discovery logic to find the
> network server that has data for the interested pcp hostname /
> pmmgr-hostid. Or, just rely on an extended pmfind:
> "pmstat -h `pmfind --hostid HOSTID`" would attach to a server that
> has data for that HOSTID. (A bonus complication is having multiple
> archives/servers for the same HOSTID, such as with different subsets
> of metrics, or for different times: perhaps extra pmfind filtering
> params.)
>
> 2.3) That server needs to be extended: merged into pmlogger, or
> interfaced with pmlc, so as to arrange logging of newly requested live
> data that wasn't already set up in the pre-configured set of metrics.
> It could heuristically control their logging interval and duration for
> multiple clients.
This bit sounds quite complex - AIUI it'll need to manage multiple
users' requests (authenticated), which may ask the system pmlogger(s)
to log arbitrary metrics at arbitrary frequencies. I think the unified
context approach dealt with this more simply, where different users'
archives were kept in their own home directories, managed by separate
pmloggers, with no extra magic needed (no new auth issues, and no
issues around users overloading the system logger / reserved
filesystem space).
Having said that, it's good to have different ideas and more options,
because this is all going to be quite tricky to implement. :) There
are several aspects of both the above and that earlier
grand-unified-context mail that were vague or glossed over. At this
stage it feels like some early, basic, maybe even throw-away kind of
coding might help flesh out some of those unknowns for us & better
inform the designs we come up with.
cheers.
--
Nathan