nathans wrote:
> [...]
> An approach that requires a server is not ideal either - oftentimes
> we will be working with only archives, on systems removed from the
> recorded production systems. It is ideal if client tools continue
> to operate in this isolated way also. Not 100% sure if a server
> was *required* with this new approach, but kinda sounded like it -
> I'm not a fan if so.
The idea was to break the overall problem into two parts: (1)
for a pcp client to be able to use any archive as a live-ish data
source and (2) have a server control and return data from remote
archives. (1) sounds like what you are referring to here. Note that
for (2), I did not mean to suggest that this needs to be a system
server - it could just as easily be unprivileged & per-user.
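
(To make part (1) a bit more concrete, here is a rough Python sketch of a
client following a growing file and handing over only complete records.
Everything in it is an assumption for illustration: the newline-delimited
text file is a stand-in and not the real archive format, and ArchiveTail,
poll and demo are hypothetical names, not PMAPI.)

```python
import os
import tempfile


class ArchiveTail:
    """Follow a growing, newline-delimited file (a stand-in for an archive)."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # bytes already handed to the client

    def poll(self):
        """Return complete records appended since the previous poll."""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        end = chunk.rfind(b"\n") + 1  # 0 when no complete record has arrived
        self.offset += end
        return chunk[:end].decode().splitlines()


def demo():
    """Show that a half-written record is held back until it is complete."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "archive.txt")
        tail = ArchiveTail(path)
        with open(path, "ab") as f:
            f.write(b"t=1 load=0.5\nt=2 lo")  # second record half-written
        first = tail.poll()
        with open(path, "ab") as f:
            f.write(b"ad=0.7\n")              # writer finishes the record
        second = tail.poll()
        return first, second
```

The reader never sees the half-written second record; it shows up whole on
the next poll. No server is involved, which is the point of part (1).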
> [...]
>> 1.1) pmlogger needs to learn to write its output with what IIRC kenj
>> has referred to as "semantic units", i.e., proper use and sequencing of
>> write(2), fdatasync(), to put interdependent data on disk correctly.
>
> It's more complicated than this though, I think. IIUC, one of the big
> differences between what you're describing & the grand-unified-context
> theory is that there is no transition-to-live-mode here - everything
> has to be read from the log (possibly by an intermediary daemon rather
> than directly from the file, but still read-from-log). Is that right?
Yes, but transition to "liveness" is sort of orthogonal:
both -a and -h would support pmSetMode(PM_MODE_LIVE), clients would just
need to ask for it.
> As an aside, not sure fdatasync will help us here [...] So file
> locking might be needed, and some kind of mechanism where the client
> can *tell* pmlogger it needs to flush its buffers, but that's
> different to fsync. </aside>
(Right; mgoodwin's idea to have pmlogger frequently update the
log-label would be another option.)
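
(A rough sketch of how the write ordering from 1.1 and a frequently updated
label could fit together: the writer syncs the data first and only then
advances a label that tells readers how much is valid. The file layout, the
function names and the use of the label as a byte count are all assumptions
made up for this example; this is not the real archive format or pmlogger
code. It uses os.pwrite/os.pread, so it assumes a Unix-like system.)

```python
import os
import struct
import tempfile

# Hypothetical layout, NOT the real PCP archive format:
#   bytes 0..7 : label = number of data bytes known to be durable (u64)
#   bytes 8..  : records, each a 4-byte length prefix plus payload
LABEL = struct.Struct(">Q")
PREFIX = struct.Struct(">I")

# os.fdatasync is not available on every platform; fall back to os.fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def create_log(path):
    """Create an empty log whose label says no data is valid yet."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    os.pwrite(fd, LABEL.pack(0), 0)
    _sync(fd)
    return fd


def append_record(fd, payload):
    """Append one record as a unit: data, sync, then label, sync."""
    valid = LABEL.unpack(os.pread(fd, LABEL.size, 0))[0]
    record = PREFIX.pack(len(payload)) + payload
    os.pwrite(fd, record, LABEL.size + valid)
    _sync(fd)  # the record is on disk before the label points at it
    os.pwrite(fd, LABEL.pack(valid + len(record)), 0)
    _sync(fd)  # only now is the new extent advertised to readers


def read_records(path):
    """Return every payload that the label declares complete."""
    with open(path, "rb") as f:
        valid = LABEL.unpack(f.read(LABEL.size))[0]
        data = f.read(valid)
    out, pos = [], 0
    while pos < len(data):
        (n,) = PREFIX.unpack_from(data, pos)
        pos += PREFIX.size
        out.append(data[pos:pos + n])
        pos += n
    return out


def demo():
    """Two good appends, then a simulated crash before the label update."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.log")
        fd = create_log(path)
        try:
            append_record(fd, b"sample-1")
            append_record(fd, b"sample-2")
            end = LABEL.size + LABEL.unpack(os.pread(fd, LABEL.size, 0))[0]
            os.pwrite(fd, PREFIX.pack(8) + b"torn", end)  # label untouched
            return read_records(path)
        finally:
            os.close(fd)
```

A reader that trusts only the label never sees the torn third record. This
says nothing about locking or about a client asking for a flush; those
would be separate mechanisms.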
> This [step 1, live-read archives] doesn't seem to consider two
> aspects that grand-unified-contexts are attempting to address:
>
> - the historical data will usually span many archives not just one;
> - the pmNewContext(PM_CONTEXT_ARCHIVE) API semantics require a file
> path to be passed in specifying the (one) archive to use [...]
Yes, this is why part (1), the generalized PM_CONTEXT_ARCHIVE, would
only be an intermediate step to the overall plan.
Onto part (2), the generalized PM_CONTEXT_HOST.
>> 2) Because these archive files may not be local to the clients, or
>> because they may not already contain every metric a client might like,
>> we need a network server to offer them such tasty treats. Elsewhere
>> and elsewhen, nathans has ably argued that this shouldn't be pmcd, but
>> a new server or an extended pmlogger/pmproxy/pmmgr.
>
> Whatever we decide to do, I'd like to avoid new daemons if we can - we
> have six (!) init scripts now, it's getting out of hand.
(Daemons don't have to be system-wide, nor have lifetimes/existence
independent from the others.)
> So a pmproxy extension/rewrite gets my vote if we must talk to a
> daemon for remote archive data. [...]
Yes, that's one possible spot, which has the head-start of already
speaking PMAPI-across-the-wire.
> [...] This bit sounds quite complex - AIUI it'll need to manage
> multiple users' requests (authenticated) which may request the system
> pmlogger(s) to log arbitrary metrics at arbitrary frequencies. I
> think the unified context approach dealt with this more simply where
> different users' archives were kept in their own home directories,
> managed by separate pmloggers, with no extra magic needed (no new
> auth issues, and issues around users overloading the system logger
> / reserved filesystem space).
The scenarios can differ only in terms of configuration, not
architecture/mechanism! *Some* pmlogger would be controlled by the
pmproxy (sp?) to gather more or less data, as per PMAPI clients'
requests. It does not have to be *the* system pmlogger; it could be
some random personal one run by some personal pmmgr. The files don't
have to be under system dirs.
The idea is to glue together a few orthogonal facilities:
1) the ability to serve archive data via PMAPI-across-the-wire
2) the ability to advertise/virtually-join multiple archives
3) the ability to supervise an archiver to influence future logging
activity in response to PMAPI-across-the-wire
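
(For facility 2, a minimal sketch of what "virtually joining" could mean:
several time-ordered archives presented to the client as one time-ordered
stream. Sample, joined and fetch are hypothetical names invented for this
example, and in-memory lists stand in for archive files; nothing here is
existing PCP API.)

```python
import heapq
from typing import Iterable, Iterator, List, NamedTuple


class Sample(NamedTuple):
    timestamp: float
    metric: str
    value: float


def joined(archives: Iterable[Iterable[Sample]]) -> Iterator[Sample]:
    """Present several time-ordered archives as one time-ordered stream."""
    # heapq.merge is lazy, so each archive is consumed only as needed.
    return heapq.merge(*archives, key=lambda s: s.timestamp)


def fetch(archives: Iterable[Iterable[Sample]], metric: str,
          start: float, end: float) -> List[Sample]:
    """Return samples of one metric in [start, end) across all archives."""
    return [s for s in joined(archives)
            if s.metric == metric and start <= s.timestamp < end]
```

A client asking for a time window would not need to know which archive
holds which samples; overlap or gaps between archives would need a policy
that this sketch does not attempt.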
> [...] At this stage it feels like some early, basic, maybe even
> throw-away kind of coding might help flesh out some of those
> unknowns for us & better inform the designs we come up with.
Could well be!
- FChE