| To: | pcp@xxxxxxxxxxx |
|---|---|
| Subject: | Proposal for handling dynamic metric names (and hence dynamic metrics) |
| From: | Ken McDonell <kenj@xxxxxxxxxxxxxxxx> |
| Date: | Wed, 08 Jul 2009 18:31:51 +1000 |
| Reply-to: | kenj@xxxxxxxxxxxxxxxx |
I've been threatening to get this out for some time now. There is no code to back any of this up (yet); it really is a proposal, so please let me know if you think this is a good or bad idea. Holes picked in the issues covered would be most welcome, as would better ideas.

The motivation here is to get pmcd out of the way for cases where it is the PMDA that knows what metrics are available, not the PMNS loaded by pmcd, e.g. an mmv-like PMDA where the available metrics are discovered from mmap'd files when the PMDA starts.

Proposal for Supporting Dynamic PCP Performance Metric Namespaces

Ken McDonell
kenj@xxxxxxxxxxxxxxxx

Initially in PCP, the Performance Metrics Namespace (PMNS) was local to each machine where PCP was being used. This made it difficult to co-ordinate the PMNS versions on multiple monitoring machines with the PMDAs installed on the collector machines. It was quickly identified as a weakness and replaced by the "Distributed PMNS" we have today, where the PMNS is maintained on the PCP collector machine or within the PCP archive. Monitoring applications ship their namespace requests to the relevant source of metrics, namely a pmcd or a PCP archive.

Aside from the rare use of a local PMNS (with the -n option) by PCP monitoring applications, the principal use of the PMNS is to be loaded (or reloaded) by pmcd and then used by pmcd to respond directly to remote requests from PCP monitoring applications using pmLookupName() (or the asynchronous equivalent pair pmRequestNames() and pmReceiveNames()), pmNameID() (or the asynchronous equivalent pair pmRequestNameID() and pmReceiveNameID()), pmNameAll() (or the asynchronous equivalent pair pmRequestNameAll() and pmReceiveNameAll(), although the former is defined and documented but not implemented!), pmGetChildren(), pmGetChildrenStatus() (or the asynchronous equivalent pair pmRequestNamesOfChildren() and pmReceiveNamesOfChildren()) and pmTraversePMNS() (or the asynchronous equivalent pair pmRequestTraversePMNS() and pmReceiveTraversePMNS()).

The PMNS on a collector machine is maintained as a single file, with entries added and deleted as part of the installation and removal of a PMDA. While this regime has served PCP well for most PMDAs, there have been a small number of cases where the static nature of the PMNS has not been appropriate, e.g.
Existing methods for handling a dynamic aspect of the PMNS are all ugly and error-prone, e.g. make a new PMNS, update the global PMNS and send pmcd a SIGHUP signal.

Proposal Overview

The existing PMNS will be extended (no backwards compatibility issues) to introduce a new "non-terminal" node that will be used to indicate that the PMNS below this point is dynamic and defined by the associated PMDA. As an example to be used throughout this proposal, the foo PMDA (domain 44) supports dynamic names below the foo.count node in the PMNS. The relevant fragment of the ASCII PMNS would be as follows:

root {
...
foo
...
}
...
foo {
version 44:0:1
count 44:*:*
memory 44:0:2
}
The foo PMDA is willing to export metadata and metric values for
the following additional (dynamic) metrics:
foo.count.ops (PMID 44:1:0)
foo.count.errs (PMID 44:1:1)
foo.count.numcount (PMID 44:0:27)

Changes to pmcd and new interactions with the foo PMDA would mean that attempts to look up the PMIDs for metrics with names beginning foo.count. would be passed from pmcd to the foo PMDA, and similarly requests to find the names of metrics given their PMID would also be passed from pmcd to the foo PMDA if they are not resolved in the PMNS loaded into pmcd.

Detailed Changes Required
Changes to the ASCII PMNS Format

As foreshadowed, the syntax :*:* after a domain number would flag a PMNS node as the root of a subtree of names to be resolved in the associated PMDA. The only place where the ASCII PMNS format is known at this level of detail is in the internal routine loadascii() of libpcp, which is called from pmLoadNameSpace(), pmLoadASCIINameSpace() and pmGetPMNSLocation(), so extending the parser here is simple.

Changes to the Binary PMNS Format

The binary format of the PMNS is what is loaded into the address space after the ASCII PMNS has been parsed. It is also the format generated by pmnscomp and read by the libpcp routines, but this is just a performance shortcut; pmnscomp will need almost no change as it simply writes out the binary PMNS after it has been loaded. The relevant data structure is __pmnsNode (defined in <pcp/impl.h>). This structure is sufficiently public that we cannot change it in any way that would break binary compatibility, and the only field available to encode both the PMDA's domain number and the dynamic nature of the node in the PMNS is the pmid field. Internally a pmid is structured thus (ignoring the endian alternative form):

typedef struct {
int pad : 2;
unsigned int domain : 8;
unsigned int cluster : 12;
unsigned int item : 10;
} __pmID_int;
So the domain field must be used to encode the domain of the PMDA providing the dynamic names, but unfortunately there are no values for cluster and/or item that could be used to mark the node as the root of a subtree of dynamic names. By good fortune we have spare bits hiding in the pad field, so the proposal is to extend the __pmID_int struct to allocate one of the bits from pad to the new field dynamic, as follows:

typedef struct {
unsigned int dynamic : 1;
int pad : 1;
unsigned int domain : 8;
unsigned int cluster : 12;
unsigned int item : 10;
} __pmID_int;
A value of 1 for dynamic encodes the fact that this PMNS node is the root of a dynamic subtree. Leaving pad between domain and dynamic would allow the domain field of a PMID to expand to 9 bits if that becomes necessary at some point in the future.
This change does make a dynamic PMID negative when treated as a 32-bit integer
but this should not be a problem as code of the form:
if (pmid < 0) ...
is just plain wrong, and should probably be
if (pmid == PMID_NULL) ...
Internally, the PMID for a node at the base of a dynamic subtree would be encoded as 1:

Changes for pmlogger and PCP Archives

No changes are needed here as pmlogger tolerates missing metrics and only adds PMNS and metadata information into the PCP archives for those metrics that can be found, so the PMID for the root of a subtree of dynamic names will never appear in an archive, although the descendant nodes (with their associated names and PMIDs) may appear in an archive.

Changes for libpcp_pmda

To support the additional interactions between pmcd and the PMDAs, the pmdaInterface structure needs to be extended. This will be PMDA_INTERFACE_4, and involves adding struct { } three; to the union, with all of the fields from struct { } two; plus the following:

int (*pmns_pmid)(char *, pmID *);
int (*pmns_name)(pmID, char **);
int (*pmns_children)(char *, char ***, int **);
The standard implementation of these routines should suffice for the majority of cases, but they are exposed in the interface to allow an over-riding implementation should that be necessary (this also makes them consistent with all other PDU handling routines in the PMDA library).

Changes for libpcp

The table below describes the changes that are needed in various libpcp routines that are used once a PMNS is loaded (for simplicity we've omitted the asynchronous versions of these synchronous routines, but the same semantics would apply to the asynchronous versions).
A small change is needed in pmIDStr() to detect a dynamic PMID (checking if the dynamic field is 1), and then output an asterisk in place of the numeric cluster and item fields.

Some Limitations

The "dynamic" nature of the PMNS only applies to the PMNS at the time it is explored. For most monitoring tools this is at start up (typically after a configuration file has been read), so any changes to the PMNS after that point in time will not be noticed. Specifically this means:
This behaviour is no different to other existing interactions, e.g. when a PMDA is installed or removed, so is not a new issue.