----- Original Message -----
> ----- Original Message -----
> > Hi -
> > [...]
>
> Thanks for looking into the graphite support, haven't reviewed it yet
> but have fwd'd your post on to several people who have asked for this
> and feedback has been universally positive & grateful! :)
>
Thanks for sending this early for review - here's some
initial feedback...
Usage:
The issue with explanatory text in the usage message is a
missing piece of functionality in the underlying APIs, I
suppose - the attached patch may be one way we could tackle
this (untested at this stage, but I'll happily take that on
if you think this approach might suit the need here?)
Units:
The handling of units conversion feels problematic. We
have potentially large numbers of very different metrics
we'll want to pump through this into graphite, but just
the one units specification for all of them (AIUI) via:

    self.opts.pmSetLongOption("units", 1, 'u', '',
        "rescale all metric units (e.g., \"mbyte/sec\")")

It seems like we'll need a more expressive configuration
mechanism. Instead of having the command line specify a
single unit for many metrics, how about a config file
which allows all metrics to be specified, and also
provides options for converting individual metrics to
specific units? Also, the default could be to convert
to canonical units across the board (seconds, bytes, etc)
- this is what pmie does IIRC.
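
For instance, a rough sketch only (the section and key names
below are made up for illustration, not a proposed final
format):

    try:
        import configparser                   # python3
    except ImportError:
        import ConfigParser as configparser   # python2

    # Example config file contents:
    #
    #   [disk.dev.read_bytes]
    #   units = mbyte/sec
    #
    #   [mem.util.used]
    #   units = gbyte
    #
    #   [kernel.all.load]
    #   # no units line -> rescale to canonical units
    #
    def load_metric_config(path):
        """Map each configured metric to its desired units (or None)."""
        parser = configparser.ConfigParser()
        parser.read(path)
        wanted = {}
        for metric in parser.sections():
            if parser.has_option(metric, 'units'):
                wanted[metric] = parser.get(metric, 'units')
            else:
                wanted[metric] = None   # None: use canonical units
        return wanted

The units strings could then be fed through whatever
pmParseUnitStr support we end up with (see the end of this
mail).
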
Daemon:
Following on from that, I imagine users will often want to
run this as a daemon, feeding data constantly (hence the
auto-socket-reconnect you've added, I guess?). Thus,
a configuration file seems appropriate, but we'll also
need to think about init scripts down the track as well.
Rate Conversion:
The "mbyte/sec" reference in the code line I quoted above
suggests rate conversion of counter metrics can be done,
but I don't see that anywhere. We'll need to retain some
state from the previous fetch - and hello, complexity. :|
At this point, I'm starting to wonder if maybe this needs
to be encapsulated and supported by an external module (see
below), since it's getting complex for what we'd like to be
a simple script. The existing (unused) pmcc module seems
like it'll be a suitable helper for this script, to handle
that side of things (and perhaps lots more, over time) - it
already has the concept of the-previous-sampled-value. It
is still simplistic though - no-one's using it yet AFAICT,
so you'll be breaking new ground there I think.
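
Just to pin down the arithmetic involved - this is not the
pmcc interface, only a sketch of the state the script (or a
helper module) would need to keep per metric/instance pair:

    class RateConverter(object):
        """Remember previous samples and return per-second rates."""

        def __init__(self):
            self.previous = {}  # (metric, instance) -> (value, timestamp)

        def rate(self, metric, instance, value, timestamp):
            """Return value/sec since the last sample, None on first fetch."""
            key = (metric, instance)
            result = None
            if key in self.previous:
                prev_value, prev_stamp = self.previous[key]
                delta_t = timestamp - prev_stamp
                # skip zero intervals and (crudely) counter wraps
                if delta_t > 0 and value >= prev_value:
                    result = float(value - prev_value) / delta_t
            self.previous[key] = (value, timestamp)
            return result

Counter wraps and 32- vs 64-bit counter widths are the extra
wrinkles pmcc would ideally hide for us.
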
Archives:
I think this will just work as it stands now - in the last
round of changes, the time window handling and pmSetMode were
added in as default-to-on, so this:

    # self.opts.pmSetLongOptionArchive()  # -a FILE -- not supported yet; need -S/etc. controls

should just work, and you should be able to simply add in the
start/finish/align options and have them automatically do
the right thing. This will help when it comes to automated
testing - but it'll also be useful for exporting after the
fact when the graphite server was down during an outage
or something (we used to hit that with our data warehouse
exporting all the time, in a previous life).
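
In other words, something like this - the exact helper names
should be double-checked against pcp.pmapi, so treat it as an
untested sketch:

    from pcp import pmapi

    opts = pmapi.pmOptions()
    opts.pmSetLongOptionArchive()    # -a FILE, replay from an archive
    opts.pmSetLongOptionStart()      # -S TIME, start of time window
    opts.pmSetLongOptionFinish()     # -T TIME, end of time window
    opts.pmSetLongOptionAlign()      # -A TIME, sample time alignment
    opts.pmSetLongOptionInterval()   # -t DELTA, sampling interval

With the default-on time window handling, those options should
then drive pmSetMode without any extra work in the script.
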
Generality:
Marko pointed us toward Sensu, which looks like it has a
very similar interface to Graphite for importing new
metrics. This makes me think we should push as much of the
config file handling as we can into the shared pmcc module,
so we can write an exporter for that tool also (and likely
others later), sharing as much code as possible. Callbacks
(or inheritance, or whatever) for setting the behaviour
when converting metric/instance names to local conventions
would be a good way to go.
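
Something shaped roughly like this, perhaps - neither class
exists today, the names and methods are purely illustrative:

    class Exporter(object):
        """Shared base (pmcc side): config parsing, fetching, rates."""

        def metric_name(self, metric, instance):
            # default naming: "metric.name" or "metric.name.instance"
            if instance is None:
                return metric
            return '%s.%s' % (metric, instance)

        def send(self, name, value, timestamp):
            raise NotImplementedError

    class GraphiteExporter(Exporter):
        """pcp2graphite side: naming conventions and wire format."""

        def metric_name(self, metric, instance):
            # graphite paths use dots as separators and dislike spaces
            name = Exporter.metric_name(self, metric, instance)
            return name.replace(' ', '_')

        def send(self, name, value, timestamp):
            # graphite plaintext protocol: "<path> <value> <timestamp>\n"
            return '%s %s %d\n' % (name, value, timestamp)

A Sensu exporter would then only need to override those same
two methods.
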
Instances:
While we're talking about instance names, I noticed the call
to pmNameInDom in the code that cracks open the fetch result.
That's a round trip for each instance name - caching this is
a good idea. Perhaps pmcc could be taught to help us out
there as well (it's caching values already). Maybe it's
something it already does, not sure - I haven't looked
closely enough - but it would be handy.
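
A minimal cache is all it takes - something along these lines
(the helper class is made up, only the idea matters):

    class InstanceNameCache(object):
        """Remember instance-name lookups; one round trip per instance."""

        def __init__(self, resolver):
            # resolver: a callable performing the real pmNameInDom
            # lookup for a given (indom, internal instance id) pair
            self.resolver = resolver
            self.names = {}   # (indom, inst) -> external name

        def lookup(self, indom, inst):
            key = (indom, inst)
            if key not in self.names:
                self.names[key] = self.resolver(indom, inst)
            return self.names[key]

(Priming it in bulk via pmGetInDom would be another option,
of course.)
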
Testing:
It's very early to have added this, I fully understand - I
just want to give suggestions though. As mentioned earlier,
the archive mode could help here - a test which replays from
an archive and runs nc(1) to capture the output would be
nice and deterministic.
OTOH, instead of nc(1) it may be better to write a python
nc-alike script so both pickled and text modes can be QA'd?
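
Something as small as this would probably do - an untested
sketch that just captures whatever bytes arrive, plaintext or
pickled, so the QA script can compare against expected output:

    #!/usr/bin/env python
    """ Tiny nc-alike: accept one connection, dump bytes to a file. """
    import sys
    import socket

    def capture(port, path):
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(('localhost', port))
        listener.listen(1)
        connection, _ = listener.accept()
        with open(path, 'wb') as output:
            while True:
                data = connection.recv(4096)
                if not data:
                    break
                output.write(data)
        connection.close()
        listener.close()

    if __name__ == '__main__':
        capture(int(sys.argv[1]), sys.argv[2])

Pickled output would need unpickling before comparison, of
course, but that's easy enough from the same script.
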
Unrelated, but any config file bits that we can make generic
and add to pmcc.py would be testable by a targeted script,
like Stan did with the pmapi API test. I think there is an
early attempt at a pmcc test, but it's going to need some
love, and it's not run as part of the automated QA yet -
it'd be great to get it going, since we ship that module
already. :|
Packaging:
I was a bit surprised to see this done as a pcp(1) command;
it's more a "native" PCP command than a wrapped system
command. I'd recommend it live in /usr/bin, front-and-centre,
so that people get it on their default path for nice easy
access (rather than hidden away in a non-name-conflicting
place like PCP_BINADM_DIR).
We also have several import scripts (collectl2pcp, mrtg2pcp,
and so on) - this is the first export script, so a name like
/usr/bin/pcp2graphite would be ideal. As you mention in the
blog post, a graphite2pcp might well pop into existence at
some point too. These have tended to be packaged in separate
RPMs - we could either go for one pcp-graphite or separate
graphite2pcp and pcp2graphite packages here. Since an init
script seems likely, and it's only really for graphite folks,
I'd prefer not to have it reside in the core pcp package.
pmParseUnitStr:
Yep, agreed - we should do this in libpcp as you say, and
percolate it up through the python layers.
Oh, finally - I usually forget, then Stan reminds me: pylint.
It'll complain about the length of the __init__ function for
sure, and probably a bunch of other stuff that I'm not
noticing - it's fairly picky.
cheers.
--
Nathan
getopts-descriptions.patch
Description: Text Data