On 17/08/16 06:49, Frank Ch. Eigler wrote:
> ...
G'day Frank,
I agree with the comments about running strace on the PMDA, not pmcd.
...
> Come to think of it, there are few PMDAs that have NOT been hit by
> this issue at some point. I wonder if it's time that a more systemic
> solution be invented (not just restarting timed-out pmdas).
But I think this assertion is not correct ... there are in fact very few
PMDAs that have hit this issue, specifically there are 81 PMDAs in the
current source tree and very few of these have triggered PDU timeout
issues for pmcd. The most notable and long-standing cases are the DBMS
PMDAs where SQL queries are used.
And the "solution" is a standard one ...
If the source of the metrics cannot answer the "gimme the values"
request from pmcd in less than 5 seconds then that source cannot pretend
to be able to deliver real-time data (which is the basic assumption in
the way pmcd interacts with clients and PMDAs).
If this is the case, then the PMDA developer must adopt a multi-threaded
caching approach where one thread is timer driven and periodically
updates the cache of metric values while another thread is PDU driven
and services requests from pmcd using the most recently cached values.
This is a standard template that does not touch any of the PCP APIs.
This approach reduces the quality of the data (in terms of timeliness)
and adds overhead (the refreshing thread runs even if no client of pmcd
is requesting the data). And for these reasons this is not the
preferred PMDA architecture if it can be avoided.
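
For anyone who has not seen the template, here is a minimal sketch of the
two-thread caching pattern described above. It is written in Python for
brevity and, like the template itself, touches no PCP APIs. The names
(CachedMetricSource, fetch, latest) are made up for illustration, and the
fetch callable stands in for whatever slow operation (e.g. an SQL query)
the PMDA has to perform.

```python
import threading
import time


class CachedMetricSource:
    """Serve the most recently cached values; refresh them on a timer."""

    def __init__(self, fetch, interval):
        self._fetch = fetch          # slow call, e.g. an SQL query (hypothetical)
        self._interval = interval    # seconds to wait between refreshes
        self._lock = threading.Lock()
        self._values = None          # None until the first refresh completes
        self._stamp = None           # time of the last successful refresh
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._refresh_loop, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _refresh_loop(self):
        # Timer-driven thread: runs whether or not anyone asks for the data.
        while not self._stop.is_set():
            values = self._fetch()   # may be slow; the lock is not held here
            with self._lock:
                self._values = values
                self._stamp = time.time()
            self._stop.wait(self._interval)

    def latest(self):
        # Request-driven thread: returns cached values, never waits on fetch.
        with self._lock:
            return self._values, self._stamp
```

The thread servicing requests only ever calls latest(), so its response
time is independent of how long fetch takes. Returning the timestamp with
the values lets the caller judge how stale the data is, which is the
timeliness cost mentioned above.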