
a jsonpointer-based alternative format for json-pmda metadata

To: dsmith@xxxxxxxxxx
Subject: a jsonpointer-based alternative format for json-pmda metadata
From: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Date: Tue, 13 Jan 2015 11:41:48 -0500
Cc: pcp developers <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mutt/1.4.2.2i
Hi -

With the recent python-pmda pmns-flexibility improvements, we can
revisit the old thread [1] about the metadata/schema format for the
json-pmda.  When we left off, the prototype was based on a slightly
enlarged json-schema [2] format.  It's kind of attractive because it
defines a simple mapping from json payload to metadata, by mirroring its
nesting structure to some extent.

Since then, I've come across a widget called jsonpointers [3].  This is a
standardized notation (RFC 6901) for identifying parts of a json payload
via a string syntax; python libraries for it are easily available.  The nice
thing about this is that it would allow our pcp-json metadata to be
focused on just what's needed to extract pcp metrics from an arbitrary
json document: no fluff.
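
To make the notation concrete, here is a rough sketch of jsonpointer
dereferencing in plain python.  It is a hand-rolled stand-in for the
library in [3], not that library's API; the function name and the sample
payload values are made up for illustration.

```python
def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against a decoded JSON document."""
    if pointer == "":
        return doc                      # the empty pointer names the whole document
    if not pointer.startswith("/"):
        raise ValueError("pointer must be empty or start with '/'")
    node = doc
    for token in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" first, then "~0".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            if not token.isdigit():
                raise ValueError("array index expected, got %r" % token)
            node = node[int(token)]     # array elements are addressed by index
        else:
            node = node[token]          # object members are addressed by key
    return node


# Hypothetical payload, loosely shaped like the prototype's.
payload = {
    "xstring": "testing, 1, 2, 3",
    "read_count": 42,
    "dummy_array": [{"__id": 0, "dummy1": 7}, {"__id": 1, "dummy1": 9}],
}
print(resolve_pointer(payload, "/read_count"))            # 42
print(resolve_pointer(payload, "/dummy_array/1/dummy1"))  # 9
```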

Here's a hypothetical rewriting of the metadata [4] of dsmith's original
prototype.  (The payload [5] could be unmodified, but let's imagine
that the "generation":1 and "data": { } wrappers are removed, and
write the metadata for that variant.)  It should give the same pminfo
output (except for the extra units included here).

% cat stap_json.json
{ "pcp-metrics":[
        {"pmns":        "json.xstring",     # metric name
         "pointer":     "/xstring",         # jsonpointer into json payload
         "type":        "string"},          # pmDesc; default units/semantics

        {"pmns":        "json.read_count",
         "pointer":     "/read_count",
         "type":        "int64",
         "units":       "bytes/sec"},       # (extra: feed to pmParseUnitsStr)

        {"pmns":        "json.dummy2",
         "pointer":     "/dummy2",
         "type":        "string"},

        {"pmns":        "json.dummy_array.dummy2",
         "indom-str":   "/dummy_array/-/__id",    # identify indom field
         "pointer":     "/dummy_array/-/dummy2",  # use - as array-index jsonpointer
         "type":        "string"},

        {"pmns":        "json.dummy_array.dummy1",
         "indom-str":   "/dummy_array/-/__id",
         "pointer":     "/dummy_array/-/dummy1",
         "type":        "int64",
         "semantics":   "counter"},

        {"pmns":        "json.net_xmit_data.xmit_latency",
         "indom-str":   "/net_xmit_data/-/__id",
         "pointer":     "/net_xmit_data/-/xmit_latency",
         "description": "sum of latency for xmit device",
         "units":       "ms",               # (extra)
         "type":        "int64"},

        {"pmns":        "json.net_xmit_data.xmit_count",
         "indom-str":   "/net_xmit_data/-/__id",
         "pointer":     "/net_xmit_data/-/xmit_count",
         "description": "number of packets for xmit device",
         "type":        "int64",
         "semantics":   "counter"}
],
"pmns-prefix":"stap_json"    # in absence, default to the metadata file basename
}


This JSON-formatted metadata can be easily consumed by a python script.
It would construct the PMNS from the obvious fields.  (The metadata file
may nominate a prefix, to make it possible for stap-generated metadata
files to hide their clumsy stap_XXXXX names.)
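
A minimal sketch of that first step, assuming the metadata file is
strict JSON (i.e. the explanatory # comments above are stripped, since
the json module rejects them).  The function name is hypothetical, and
how the prefix gets combined with the per-metric "pmns" names is left
open here; the sketch only resolves the prefix and indexes the metrics.

```python
import json
import os
import tempfile


def load_metadata(path):
    """Return (pmns_prefix, {pmns_name: metric_entry}) for one metadata file."""
    with open(path) as f:
        metadata = json.load(f)
    # In the absence of "pmns-prefix", default to the metadata file basename.
    default_prefix = os.path.splitext(os.path.basename(path))[0]
    prefix = metadata.get("pmns-prefix", default_prefix)
    metrics = {entry["pmns"]: entry for entry in metadata["pcp-metrics"]}
    return prefix, metrics


# Demo with a one-metric metadata file that nominates no prefix.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "stap_json.json")
    with open(path, "w") as f:
        json.dump({"pcp-metrics": [
            {"pmns": "json.xstring", "pointer": "/xstring", "type": "string"},
        ]}, f)
    prefix, metrics = load_metadata(path)

print(prefix)                               # stap_json
print(metrics["json.xstring"]["pointer"])   # /xstring
```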

Each metric value would be found by jsonpointer-dereferencing the
"pointer" field against the json payload file.  The only tricky aspect
is the indoms/arrays.  The above proposal uses the "-" jsonpointer
syntax to identify the (sole) indexing dimension that is to be turned
into a pcp instance-domain; the python script would iterate 0, 1, 2, ... along
that array index to enumerate the actual indom & values.  (Note that
in this model, the __id parameter is not hard-coded in python, and
pure-numeric indoms fit easily.)
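
The iteration could look roughly like the sketch below.  Note that in
standard jsonpointer, "-" refers to the (nonexistent) element past the
end of an array, so off-the-shelf libraries won't expand it this way;
the proposal repurposes the token, and the script has to handle it
itself.  The function name and sample payload are hypothetical, and the
sketch skips the ~0/~1 escapes for brevity.

```python
def expand_instances(payload, indom_pointer, value_pointer):
    """Return [(instance_name, value), ...] for pointers containing one "-"."""
    def split_at_dash(pointer):
        tokens = pointer.split("/")[1:]
        i = tokens.index("-")           # raises ValueError if there is no "-"
        return tokens[:i], tokens[i + 1:]

    def walk(node, tokens):
        for token in tokens:
            node = node[int(token)] if isinstance(node, list) else node[token]
        return node

    id_head, id_tail = split_at_dash(indom_pointer)
    val_head, val_tail = split_at_dash(value_pointer)
    if id_head != val_head:
        raise ValueError("indom-str and pointer must index the same array")

    # Iterate 0, 1, 2, ... along the array that "-" stands for.
    array = walk(payload, id_head)
    return [(str(walk(elem, id_tail)), walk(elem, val_tail)) for elem in array]


# Hypothetical payload; the instance names come from the "__id" field only
# because the metadata says so, not because python hard-codes it.
payload = {"net_xmit_data": [
    {"__id": "eth0", "xmit_count": 120, "xmit_latency": 35},
    {"__id": "lo", "xmit_count": 8, "xmit_latency": 1},
]}
print(expand_instances(payload,
                       "/net_xmit_data/-/__id",
                       "/net_xmit_data/-/xmit_count"))
# [('eth0', 120), ('lo', 8)]
```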

As a further step, it could simplify the json-pmda configuration if
the metadata file contained within it instructions as to where to
fetch the json payload:

          "payload-file":"foo.json"
or even   "payload-exec":"ceph perf metric"
or even   "payload-url":"http://localhost:1235/foo-metrics";

then the json-pmda would need to be configured only with a list of
metadata files/directories, and it can find the payload files by
itself.
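
A rough sketch of how the pmda might act on those keys.  Everything
beyond the three key names is an assumption: the function name, the
order in which the keys are tried, running the command without a shell,
resolving "payload-file" relative to the current directory, and the
10-second url timeout.

```python
import json
import shlex
import subprocess
import urllib.request


def fetch_payload(metadata):
    """Fetch and decode the json payload named by a metadata dict."""
    if "payload-file" in metadata:
        with open(metadata["payload-file"]) as f:
            return json.load(f)
    if "payload-exec" in metadata:
        # Split into argv and run without a shell; stdout is the payload.
        argv = shlex.split(metadata["payload-exec"])
        result = subprocess.run(argv, check=True, capture_output=True, text=True)
        return json.loads(result.stdout)
    if "payload-url" in metadata:
        with urllib.request.urlopen(metadata["payload-url"], timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    raise KeyError("metadata names no payload source")
```

With something like this, a call such as
fetch_payload({"payload-file": "foo.json"}) returns the decoded payload,
ready for the pointer lookups.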


[1] https://sourceware.org/ml/systemtap/2014-q3/msg00302.html
[2] http://json-schema.org/
[3] https://python-json-pointer.readthedocs.org/en/latest/mod-jsonpointer.html
[4] https://sourceware.org/ml/systemtap/2014-q3/txtRzjotldCEn.txt
[5] https://sourceware.org/ml/systemtap/2014-q3/txtfRBQXZpuUt.txt
