> [...]
> 3. Amazon Web Services [Chandana]
> PCP model of remote loggers which explicitly know about all of the
> hosts they need to log is mismatched to the needs of monitoring in
> the AWS space. Here, hosts can be spun up and down in relatively
> short time spans, and the remote pmlogger "pull" model is not what
> is wanted - a "push" model where the host starts up and starts to
> broadcast out data (including its hostname) to something listening
> for such traffic is offered by collectd and is better suited.
I wonder why having a central pmlogger pulling from these short-lived
targets is deemed unworkable. One might imagine making pmlogger
better able to respond to quick changes in configuration (the addition
of remote-machine logging configuration fragments), or letting it be
statically pre-configured with a large space (/24 network?) of
potential targets for it to poll.
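To make the pre-configured-space idea concrete, here is a rough Python
sketch, not anything pmlogger does today. The function names and the
timeout are made up for illustration; 44321 is pmcd's default port.
It enumerates every usable address in a network and reports which
ones currently accept a TCP connection.

```python
import ipaddress
import socket

PMCD_PORT = 44321  # pmcd's default TCP port


def candidate_targets(cidr):
    """Return every usable host address in the given network, as strings."""
    return [str(addr) for addr in ipaddress.ip_network(cidr).hosts()]


def is_listening(host, port=PMCD_PORT, timeout=0.2):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def live_targets(cidr, port=PMCD_PORT, timeout=0.2):
    """Hosts in the pre-configured space that currently accept connections."""
    return [h for h in candidate_targets(cidr) if is_listening(h, port, timeout)]
```

A poller could call live_targets() periodically and start or stop
logging for hosts as they appear and disappear. Probing serially like
this is slow for a sparsely populated /24; a real implementation
would want to probe concurrently.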
> Discussion about how to tackle similar functionality ensued - the
> use of a local pmlogger on each dynamic AWS host agreed as a good
> first step and then two approaches discussed. Ken and Chandana
> pondered changes to pmlogger to allow arbitrary pluggable backends
> via a new API to plugin anything (e.g. a streaming-to-remote-AWS-
> host-listener plugin). [...]
As an alternative, I pointed to NFS as a possible transport for
pushing pmlogger data across to a central head node, which would
work with the present pcp code base.
> Frank pointed out this results in loss of the ability to run all
> of the PCP analytic tools on historic data, however, and suggested
> an alternate scheme where we do a better job of allowing the PMAPI
> to access PCP logs as they are being written. This would be more
> compelling in that it offers potential improvements to existing PCP
> tools like pmchart too, which currently do relatively poorly in the
> way they manage live/archive transition.
Right, this is the part where Ken suggested this would not be too
hard, by buffering outgoing archive PDUs more "semantically". It
might be helpful to have an extra pmlogger archive file while it's
being live-updated, as a tiny table-of-contents of sorts, which live
clients might more efficiently monitor than the main time-series data
files.
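As a sketch of what such a table-of-contents file might look like
(purely hypothetical: the record layout and function names below are
my own invention, not an existing PCP format), the writer could
append one small fixed-size record per flushed batch, and a live
client could poll only that file for new records:

```python
import os
import struct

# Hypothetical record layout: timestamp in seconds (double) followed by a
# byte offset into the data volume (unsigned 64-bit), little-endian.
RECORD = struct.Struct("<dQ")


def append_entry(toc_path, timestamp, offset):
    """Writer side: append one fixed-size record after flushing a batch."""
    with open(toc_path, "ab") as f:
        f.write(RECORD.pack(timestamp, offset))


def read_new_entries(toc_path, seen):
    """Reader side: return (entries, new_seen) for records beyond `seen`.

    A trailing partial record (writer caught mid-append) is ignored until
    it is complete.
    """
    try:
        size = os.path.getsize(toc_path)
    except FileNotFoundError:
        return [], seen
    total = size // RECORD.size
    if total <= seen:
        return [], seen
    with open(toc_path, "rb") as f:
        f.seek(seen * RECORD.size)
        data = f.read((total - seen) * RECORD.size)
    usable = len(data) - len(data) % RECORD.size
    entries = [RECORD.unpack_from(data, i) for i in range(0, usable, RECORD.size)]
    return entries, seen + len(entries)
```

The client keeps its own `seen` count between polls, so each poll
costs one stat() plus a read of only the new records, rather than a
rescan of the time-series data.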
The streaming-to-remote-AWS-host-listener tool could be written like
an ordinary PMAPI client (rather than a logger plugin), whether
connected to a pmcd or to a newfangled live-monitored archive.
> Ryan points out the pain of starting pmchart and having to wait for
> live sampled data - he'd prefer the tool to be able to seamlessly
> fetch history for metrics selected for live plotting. Would require
> logging everything all the time [...]
Well, not everything, not all the time, necessarily. There could be
an opportunistic pmlogger instance started up to feed a pmchart, which
accumulates data only for that pmchart. It could heuristically save
more metrics than initially requested -- or just monotonically grow
the set as the pmchart user browses more and more.
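The "monotonically grow" variant is simple enough to sketch. This is
only an illustration of the bookkeeping, with a made-up class name;
how the new metrics actually get handed to a pmlogger is left out.

```python
class GrowingLogSet:
    """Tracks which metrics an on-demand logger should be recording."""

    def __init__(self):
        self._metrics = set()

    def request(self, names):
        """Add metrics the user just selected; return only the new ones."""
        new = sorted(set(names) - self._metrics)
        self._metrics.update(new)
        return new

    @property
    def metrics(self):
        """Everything requested so far; the set never shrinks."""
        return sorted(self._metrics)
```

Each time the user adds a plot, request() returns just the metrics
that are not yet being logged, so the logger only ever needs to be
told about additions.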
- FChE