On 07/22/2015 02:16 AM, Nathan Scott wrote:
> Hi David,
>
> ----- Original Message -----
>>>> [...]
>>>> Incidentally, this may help to resolve some of the other pmdajson worries
>>>> still in the back of my mind (which I still owe you some mail on, sorry
>>>> 'bout the tardiness there - will follow up soon).
>>>
>>> As far as I know all the worries that you've mentioned to me in the past
>>> have been addressed. If you've got some new ones, send them on and I'll
>>> look at them.
>>
>> Oh, I was referring to my earlier comments about how there were some things
>> we should defer to discussing after initial merge of pmdajson code, so that
>> they didn't get in our way. [...]
>>
>
> OK - here are some background topics worth mulling over re pmdajson.
>
> Originally, the idea was that there would be "no code" required to
> add JSON metrics, when primarily systemtap was being considered as
> a generic source. Then other sources were implemented and it's now
> looking increasingly likely that the majority (?) of sources will
> need to use data-exec to ingest JSON data into pmdajson.

If your question is "will the majority of sources need to use data-exec
to get refreshed JSON data?", the answer is "it depends on the source".
I'd guess that we'll see a 50/50 split between sources that need to run
a command to refresh their JSON data (data-exec) and sources that will
get their data from an HTTP "get" operation. Obviously it depends on
the set of sources we want to support.

If you think about it, the above makes sense. The data source needs
some way to know when to generate fresh JSON data - in these cases, a
socket read or an HTTP request. Systemtap has the luxury of using
procfs files, so it gets notified whenever the file is read.
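
To make that concrete, here's a rough sketch of the two refresh styles
(the helper names here are made up for illustration, not actual
pmdajson interfaces):

    # Rough sketch only: two ways a source's JSON can be refreshed.
    import json
    import subprocess
    import urllib2

    def refresh_via_exec(command):
        # data-exec style: run a command, parse its stdout as JSON
        output = subprocess.check_output(command, shell=True)
        return json.loads(output)

    def refresh_via_http(url):
        # HTTP style: the source generates fresh JSON on each "get"
        return json.load(urllib2.urlopen(url))
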
> So, what's potentially sub-optimal about the current situation?
>
> - Inefficient sampling
> if data-exec does indeed become the way the majority of sources will
> interact with pmdajson, then we'll have the least efficient method
> for sampling (executing a command, parsing results) as the default.
>
> For things like Ceph, a more ideal model would see persistent socket
> connections maintained to the daemon, then protocol exchanges to get
> fresh data only when needed. Currently, the data-exec model will
> cause a full socket setup/teardown & initial exchanges for each and
> every refresh.

In the case of Ceph, I don't believe the protocol the ceph daemon
speaks over that socket is meant to be accessed directly; you are
supposed to access it through the "ceph" command.

There is a ceph "rest" api that might be better suited to longer-term
connections, although when you get down to it I believe it ends up
calling similar code to get the same values. I'm not sure of the level
of effort required to make that change.
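
For reference, what a data-exec refresh for ceph boils down to today is
roughly the following (the admin socket path is just an example); note
that each call pays the full command startup and socket setup/teardown
cost you mentioned:

    import json
    import subprocess

    def fetch_ceph_perf(asok='/var/run/ceph/ceph-osd.0.asok'):
        # run the ceph command against a daemon's admin socket and
        # parse the JSON it prints; a new connection every time
        out = subprocess.check_output(
            ['ceph', '--admin-daemon', asok, 'perf', 'dump'])
        return json.loads(out)
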
> - Domain isolation
> We have inadvertently circumvented the checks-and-balances pmcd has
> for keeping different *domains* of performance data at arms length.
>
> What this means, in practice, is that a blocking refresh from one
> domain can (ultimately) cause loss of data from other domains, i.e.
> a problem on the Ceph socket might cause all systemtap metrics to
> stop refreshing when pmcd terminates the tardy PMDA. Multiply this
> out by more and more domains within this one 'json' domain, and it
> could become quite a problem. Worse, it's probably not going to be
> a trivial debugging exercise to figure out which sources are at the
> root of such a problem, and which are the innocent bystanders.

The blocking refresh problem sounds like a generic PCP problem that the
json pmda just happens to exercise. Any pmda that runs a command to get
some/all of its metrics has the exact same problem.

As far as fixing this from within the JSON pmda goes, the idea that
pops into my head would be to poll for the data at a user-specified
interval, then, when a request comes in, serve the data from the last
poll.
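
A minimal sketch of that poll-and-cache idea (nothing pmdajson-specific
here, just the shape of it):

    import threading

    class CachedSource(object):
        def __init__(self, refresh_func, interval):
            self.refresh = refresh_func  # e.g. a data-exec or HTTP fetch
            self.interval = interval     # user-specified, in seconds
            self.lock = threading.Lock()
            self.data = None
            self.poll()                  # prime the cache and re-arm

        def poll(self):
            try:
                fresh = self.refresh()
                with self.lock:
                    self.data = fresh
            finally:
                timer = threading.Timer(self.interval, self.poll)
                timer.daemon = True
                timer.start()

        def fetch(self):
            # a fetch never blocks on a slow or hung source
            with self.lock:
                return self.data
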
> Similarly, the status reporting pmcd provides now for the different
> agents is circumvented (eg pmcd.agent.status and its ilk), so pmie
> or similar automated-failure-recovery actions are now feasible only
> at the json-domain level, not the actual domain level (ceph, stap,
> ... etc) - and would have to be specially implemented for pmdajson.
>
> - Refresh script complexity
> "generate_ceph_metadata" script is approaching the complexity of other
> script PMDAs now - in the back of my mind this is a bit of a worry, as
> we were aiming to make instrumentation easier to expose here. Will a
> pmdajson-metadata/data-exec-script author not need to know all the
> details of PCP metric descriptors (one of the trickier aspects of a
> PMDA) anyway, in order to know how pmdajson is going to interpret her
> JSON? I think probably, yes. (Now that I look closely again, it looks
> like we have a problem in pmdajson's interpretation of counter vs
> instantaneous semantics too. Will discuss separately.)

Let's start here by making sure you understand how
"generate_ceph_metadata" works. You run it once, and it uses a ceph
command to dump the JSON metric schema. The script then takes that
schema and turns it into JSON pmda metadata. You never have to run the
"generate_ceph_metadata" script again (until your version of ceph
changes, at least).

Note that "generate_ceph_metadata" is probably an outlier in being a
bit tricky. The JSON schema/metadata produced by ceph is *quite* odd,
especially when it comes to types. The biggest issue is that ceph uses
very non-JSON-like type specifiers.
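
To give a flavour of the oddness: ceph's schema reports a numeric
"type" bitmask rather than anything resembling a JSON type, so the
script has to decode it. Roughly like this (bit values as I recall them
from the ceph source - double-check before relying on them):

    # Decode ceph's perf-schema "type" bitmask.
    CEPH_TIME = 0x1        # value is a time, floating point seconds
    CEPH_U64 = 0x2         # value is an unsigned 64-bit integer
    CEPH_LONGRUNAVG = 0x4  # value is an avgcount/sum pair
    CEPH_COUNTER = 0x8     # value monotonically increases

    def decode_ceph_type(bits):
        data_type = 'double' if bits & CEPH_TIME else 'integer'
        semantics = 'counter' if bits & CEPH_COUNTER else 'instant'
        is_average = bool(bits & CEPH_LONGRUNAVG)
        return data_type, semantics, is_average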

As to your question of "how much does someone wanting to support a new
JSON data source have to know?", the answer is "just enough". This
person would need to understand how to get his data source to produce
JSON, understand the JSON format, and understand JSON pointers. He
really wouldn't need to understand too much about PCP.
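
For instance, given the data below, the source author only has to know
that "/network/eth0/rx_bytes" names a value - no PCP knowledge involved
(this is a deliberately minimal RFC 6901-style lookup that ignores the
~0/~1 escaping rules):

    import json

    data = json.loads('{"network": {"eth0": {"rx_bytes": 12345}}}')

    def resolve_pointer(doc, pointer):
        # walk one reference token at a time
        for token in pointer.lstrip('/').split('/'):
            doc = doc[int(token)] if isinstance(doc, list) else doc[token]
        return doc

    print(resolve_pointer(data, '/network/eth0/rx_bytes'))  # -> 12345
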
> - Security model
> Not clear how much of an issue this will be, but the root/nobody
> model is too coarse grained for some domains other PMDAs serve today
> (hence I guess it'll become a problem for json domains soon enough) -
> e.g. pmdapostgresql needs to run as the postgres user, or some other
> run-time configurable user. The pmda.set_user() interface has been an
> appropriate level of interface so far. We could extend pmdajson to
> handle this to some extent via config file extension, of course.
>
> - and a handful of other small stuff ...
> - all metric names will be json.* prefixed (people complain about this
> with mmv; they want to have complete namespace control for their own
> domains, oftentimes)
> - additional ./Install-time customisation needed (e.g. if we consider
> a pmdajson+generate_ceph_metadata vs a theoretical pmdajson+APIs-to
> -help-parse-json ... it would be simpler to install the latter as it
> would be one step only).
> - we're not able to dynamically configure metrics or the target domain
> (e.g. no pmStore(3) support or equivalent)
> - we're not able to pass/interpret connection attributes (things like
> authentication, targeted containers, etc) in the refresh scripts.
>
> ... some of these areas we can tackle via continued hacking on pmdajson
> and extending its schema, its config file, interfaces to data-exec'ed
> scripts and so on. But, I wanted to step back and think about whether
> effort in core PMDA libraries might make some sense at this stage, for
> some of the above items (which? -- all of the above can be, or are,
> inherently handled by separate PMDAs using JSON instead of data-exec'd
> scripts, of course; it's just extra effort - perhaps making that easier
> is a better way to solve some of 'em, however).
>
> To be clear, I'm not suggesting in any way that we drop pmdajson, stop
> developing it, or anything crazy like that; rather that we think about
> giving it some lower level API support (possibly?) and/or improving the
> lower-level API support for JSON ingest in other PMDAs too. For the
> PMDAs I've written so far that consume JSON, I'm certain functionality
> like JSON pointers would have greatly simplified them, so perhaps that's
> the first place to start experimenting (not suggesting you do this ...
> mainly just trying to solicit ideas at this stage, and share the above
> list of potential pmdajson ratholes).

It sounds like what this boils down to is a problem with one of the
basic features of the JSON pmda - the fact that it uses JSON pointers
to generically identify where to find the JSON data, which is what lets
the JSON pmda support multiple data sources at the same time.

If this is now seen as a problem, one idea would be to "break up" the
JSON pmda a bit and move a good bit of its functionality into a python
library. Then several pmdas could use the python library to export data
for their particular source. This would solve several of your worries,
like domain isolation and wanting different top-level domains. At a
first cut you'd have two new pmdas, a systemtap one and a ceph one,
that were both thin wrappers around the python library. I'm not sure
what the level of effort would be there.
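
A rough sketch of what one of those thin wrappers might look like - the
PMDA class comes from the existing python bindings, but "pcp.pmdajson"
and everything it provides is hypothetical, standing in for the
refactored guts of the JSON pmda (metadata parsing, JSON pointer
resolution), and the domain number is made up:

    from pcp.pmda import PMDA
    from pcp.pmdajson import JSONSource   # hypothetical shared library

    class CephPMDA(PMDA):
        def __init__(self):
            PMDA.__init__(self, 'ceph', 140)  # its own name and domain
            self.source = JSONSource('/var/lib/pcp/pmdas/ceph/metadata.json')
            for name, desc in self.source.metrics():
                self.add_metric(name, desc)
            self.set_fetch_callback(self.ceph_fetch_callback)

        def ceph_fetch_callback(self, cluster, item, inst):
            # the library resolves the JSON pointer for this metric
            return [self.source.value(cluster, item, inst), 1]

    if __name__ == '__main__':
        CephPMDA().run()
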
If later you wanted to rewrite bits of the python library into C to
support C clients (and then the python library would just wrap around
the C layer), that might be doable.
--
David Smith
dsmith@xxxxxxxxxx
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)