Hi David,
----- Original Message -----
> > > [...]
> > > Incidentally, this may help to resolve some of the other pmdajson worries
> > > still in the back of my mind (which I still owe you some mail on, sorry
> > > 'bout the tardiness there - will follow up soon).
> >
> > As far as I know all the worries that you've mentioned to me in the past
> > have been addressed. If you've got some new ones, send them on and I'll
> > look at them.
>
> Oh, I was referring to my earlier comments about how there were some things
> we should defer to discussing after initial merge of pmdajson code, so that
> they didn't get in our way. [...]
>
OK - here are some background topics worth mulling over re pmdajson.
Originally, the idea was that there would be "no code" required to
add JSON metrics, when primarily systemtap was being considered as
a generic source. Then other sources were implemented and it's now
looking increasingly likely that the majority (?) of sources will
need to use data-exec to ingest JSON data into pmdajson.
So, what's potentially sub-optimal about the current situation?
- Inefficient sampling
If data-exec does indeed become the way the majority of sources will
interact with pmdajson, then we'll have the least efficient method
for sampling (executing a command, parsing results) as the default.
For things like Ceph, a more ideal model would see persistent socket
connections maintained to the daemon, then protocol exchanges to get
fresh data only when needed. Currently, the data-exec model will
cause a full socket setup/teardown & initial exchanges for each and
every refresh.
- Domain isolation
We have inadvertently circumvented the checks-and-balances pmcd has
for keeping different *domains* of performance data at arm's length.
What this means, in practice, is that a blocking refresh from one
domain can (ultimately) cause loss of data from other domains, i.e.
a problem on the Ceph socket might cause all systemtap metrics to
stop refreshing when pmcd terminates the tardy PMDA. Multiply this
out by more and more domains within this one 'json' domain, and it
could become quite a problem. Worse, it's probably not going to be
a trivial debugging exercise to figure out which sources are at the
root of such a problem, and which are the innocent bystanders.
Similarly, the status reporting pmcd provides now for the different
agents is circumvented (e.g. pmcd.agent.status and its ilk), so pmie
or similar automated-failure-recovery actions are now feasible only
at the json-domain level, not the actual domain level (ceph, stap,
... etc) - and would have to be specially implemented for pmdajson.
- Refresh script complexity
The "generate_ceph_metadata" script is approaching the complexity of
other script PMDAs now - in the back of my mind this is a bit of a
worry, as we were aiming to make instrumentation easier to expose
here. Won't the author of a pmdajson metadata/data-exec script need
to know all the details of PCP metric descriptors anyway (one of the
trickier aspects of a PMDA), in order to know how pmdajson is going
to interpret her JSON? I think probably, yes. (Now that I look closely
again, it looks like we have a problem in pmdajson's interpretation of
counter vs instantaneous semantics too. Will discuss separately.)
- Security model
Not clear how much of an issue this will be, but the root/nobody
model is too coarse-grained for some domains other PMDAs serve today
(hence I guess it'll become a problem for json domains soon enough) -
e.g. pmdapostgresql needs to run as the postgres user, or as some
other user that is run-time configurable. The pmda.set_user()
interface has been an appropriate level of interface so far. We could
extend pmdajson to handle this to some extent via config file
extension, of course.
- and a handful of other small items ...
- all metric names will be json.* prefixed (people complain about this
with mmv; they want to have complete namespace control for their own
domains, oftentimes)
- additional ./Install-time customisation needed (e.g. if we consider
a pmdajson+generate_ceph_metadata vs a theoretical
pmdajson+APIs-to-help-parse-json ... it would be simpler to install
the latter, as it would be one step only).
- we're not able to dynamically configure metrics or the target domain
(e.g. no pmStore(3) support or equivalent)
- we're not able to pass/interpret connection attributes (things like
authentication, targeted containers, etc) in the refresh scripts.
... some of these areas we can tackle via continued hacking on pmdajson
and extending its schema, its config file, interfaces to data-exec'ed
scripts and so on. But, I wanted to step back and think about whether
effort in core PMDA libraries might make some sense at this stage, for
some of the above items. (Which? All of the above can be, and
inherently are, handled by separate PMDAs using JSON instead of
data-exec'd scripts, of course; it's just extra effort. Perhaps making
that easier is a better way to solve some of 'em, however.)
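To make the "inefficient sampling" item a bit more concrete, here is a
rough Python sketch of the persistent-connection model a library helper
might offer. Everything in it is hypothetical: the PersistentSource
class, the connect callable, the request()/close() methods and the
"dump" command are made-up names for illustration, not existing PCP,
pmdajson or Ceph interfaces.

```python
import json


class PersistentSource:
    """Hold one connection open across refreshes (hypothetical helper)."""

    def __init__(self, connect):
        # connect: assumed zero-argument callable returning an object with
        # request(bytes) -> bytes and close(); not an existing PCP API.
        self._connect = connect
        self._conn = None
        self.connects = 0  # how many times a connection was set up

    def refresh(self, command=b"dump"):
        # Pay the setup cost only when no live connection exists.
        if self._conn is None:
            self._conn = self._connect()
            self.connects += 1
        try:
            return json.loads(self._conn.request(command))
        except (OSError, ValueError):
            # Drop the broken connection so the next refresh reconnects.
            self._conn.close()
            self._conn = None
            raise
```

With something like this, repeated refreshes reuse the one connection,
whereas the data-exec model pays the setup/teardown cost every time.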
To be clear, I'm not suggesting in any way that we drop pmdajson, stop
developing it, or anything crazy like that; rather that we think about
giving it some lower-level API support (possibly?) and/or improving the
lower-level API support for JSON ingest for other PMDAs too. For the
PMDAs I've written so far that consume JSON, I'm certain functionality
like jsonpointers would have greatly simplified those, so perhaps that's
the first place to start experimenting (not suggesting you do this ...
mainly just trying to solicit ideas at this stage, and share the above
list of potential pmdajson ratholes).
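As a minimal illustration of the jsonpointers idea, assuming RFC 6901
style pointers are what we'd want, a resolver is only a few lines of
Python. The resolve_pointer name and the sample document in the usage
line are invented for this sketch; a real PMDA helper would also need
error handling and a mapping from pointers to metric descriptors.

```python
def resolve_pointer(doc, pointer):
    """Return the value inside doc addressed by an RFC 6901 JSON Pointer."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("JSON Pointer must be empty or start with '/'")
    node = doc
    for token in pointer[1:].split("/"):
        # Unescape "~1" before "~0", in the order RFC 6901 requires.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            # Array step; the special "-" index is not handled here.
            node = node[int(token)]
        else:
            node = node[token]
    return node
```

So a PMDA could pull one value out of a parsed reply with, for example,
resolve_pointer(reply, "/osd/op_latency/sum") rather than hand-written
traversal code for each metric.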
cheers.
--
Nathan