In cleaning up the event record stuff and pushing some generic
functionality into libpcp and libpcp_pmda I've stumbled across something
that is broken in various ways in lots of places.
The issue is with PM_TYPE_AGGREGATE (but both of the related types
PM_TYPE_AGGREGATE_STATIC and PM_TYPE_EVENT are OK) and the way
__pmStuffValue is called from pmdaFetch in libpcp_pmda.
The aggr_len argument to __pmStuffValue is only ever used in the
PM_TYPE_AGGREGATE case, but in pmdaFetch the call to __pmStuffValue
unconditionally uses the value 0 for aggr_len and there is no way for
the pmdaFetchCallback method to return a value for aggr_len.
This has not been a problem to date because
a. PM_TYPE_AGGREGATE is rarely used, and
b. in the places where PM_TYPE_AGGREGATE is used it is either not in a
PMDA, or in a PMDA that over-rides the pmdaFetch method.
Here are my notes on possible fixes and implications thereof ... I'm
voting for Option 3 and would appreciate feedback from others.
Option 1.
Make pmdaFetchCallBack variadic
typedefaggrlen
int (*pmdaFetchCallBack)(pmdaMetric *, unsigned int, pmAtomValue *, ...);
so that aggr_len could be returned as needed for PM_TYPE_AGGREGATE
metrics.
Causes compilation warnings with current code that uses the old typedef
and is pretty obfuscated.
Option 2.
Change the pmdaFetchCallBack definition to
typedef
int (*pmdaFetchCallBack)(pmdaMetric *, unsigned int, pmAtomValue *, int *);
with the extra argument being a pointer to aggr_len, so that the
callback can set aggr_len when the metric type is PM_TYPE_AGGREGATE,
then aggr_len is used in the call to __pmStuffValue in pmdaFetch()
Breaks no more code than Option 1, but is much clearer. But still 100%
ugly.
Option 3.
Change __pmStuffValue to drop the aggr_len parameter and change the
format for PM_TYPE_AGGREGATE accessed from a pmAtomValue so that
atom->vp points to a pmValueBlock which is dynamically allocated and the
vlen field there gives the length.
We need some simple arithmetic to calculate the length of the aggregate
and the address for the start of the aggregate (both skipping over the
pmValueBlock header).
This would make the format for a PM_TYPE_AGGREGATE value passed to
__pmStuffValue the same as for PM_TYPE_AGGREGATE_STATIC and
PM_TYPE_EVENT.
We could even change the definition of vp in pmAtomValue from void * to
pmValueBlock *, and rename it to vbp (because there is no other use for
the void * pointer).
This would break any caller of __pmStuffValue and anyone using the
current PM_TYPE_AGGREGATE pmAtomValue encoding regime.
Specifically this means ...
__pmStuffValue() in libpcp
this is a place we're aiming to fix
pmdaFetch() in libpcp_pmda
this is a place we're aiming to fix
rewrite() in pmlogreduce
aggr_len is 0 in the call to __pmStuffValue(), so no problem
[although the code seems broken in that the PM_TYPE_AGGREGATE
cases will produce bad output and the PM_TYPE_EVENT case will
cause pmlogreduce to abort when pmExtractValue() fails ahead
of the __pmStuffValue() call]
probably best to ban PM_TYPE_AGGREGATE* and PM_TYPE_EVENT for
pmlogreduce
do_preamble() in pmlogger
aggr_len is 0 in the call to __pmStuffValue() and the types
are integers or strings, so no problem
pmstore
aggr_len is calculated for a string variable that is used as a
temporary for the "(none)" old value when no values available
for the previous value ... aggr_len is ignored in __pmStuffValue()
for string types (strlen() is called), so no problem dropping the
length calculation here
second use is when building new pmResult, and the calculation of
the value length is not needed for most types, and is a total crock
for the PM_TYPE_AGGREGATE case.
probably best to ban PM_TYPE_AGGREGATE* and PM_TYPE_EVENT for
pmstore(1)
fillResult() in dbpmda
store is not supported for PM_TYPE_AGGREGATE* and PM_TYPE_EVENT,
so no issues here ... although nbyte is needlessly calculated before
calling __pmStuffValue() for all the other types
weblog PMDA
aggr_len is always 0, so no problem
trace PMDA
aggr_len is always 0, so no problem
sample PMDA
will need to rework PM_TYPE_AGGREGATE and PM_TYPE_EVENT metrics
hotproc PMDA
?
pmcd PMDA
aggr_len is always 0, so no problem
shping PMDA
aggr_len is always 0, so no problem
linux, darwin and lustrecomm PMDAs
all need some dead code removal for PM_TYPE_AGGREGATE*
|