Hello PCP community,
Customers have asked us to make the Cluster PMDA more dynamic:
1) Right now we use a static configuration file and the compute nodes
always push all metrics that are in that file.
2) The compute nodes push the metrics to the cluster head node every two
seconds even if no client is requesting information.
After looking at the code and some brainstorming here at SGI, I wrote
the following proposal, which I would like to discuss with you. I value
your feedback highly, so I would appreciate it if you could review the
proposal and share your thoughts.
Thank you,
Cornel
-----------------------------------------------------
Overview
The Cluster PMDA has two components: pmdacluster that runs on the
cluster head node (the PMDA) and pmclusterd that runs on compute nodes
(provides the data). The Cluster PMDA configuration file contains the
list of all metrics that pmclusterd is required to retrieve every
two seconds.
PROPOSAL
* The PMDA will load the list of supported metrics, including an update
interval in ms for each metric. 2 seconds can be used as the default
value when no specific refresh interval is defined. Some metrics never
change their values (e.g. memory size and other hardware-related
metrics). Others may change very often (e.g. network traffic
counters).
* In general, the protocol between pmdacluster and pmclusterd will
remain the same, except that metrics will be pushed only when requested,
based on their refresh interval (not every 2 seconds as they are
now).
* The first time a pm_fetch() call is made (from pmcd), pmdacluster will
retrieve the values from pmclusterd and cache them. The time stamp of
the request will be saved in the cache together with the values for each
metric.
* When the next pm_fetch() comes, pmdacluster will check the cache and
provide the data from there if the data has not expired (data expires
once saved time stamp + refresh interval < current time stamp). If the
data has expired for one or more metrics, it will retrieve new data for
those metrics and save the new time stamp.
* pmdacluster will start periodically requesting new data for the
metrics for which the time between two pm_fetch() calls is less than or
equal to twice the refresh interval, so that it always has data
available when the next pm_fetch() call is made.
* If no pm_fetch() comes for a metric after three refresh intervals
have expired, the metric will no longer be requested.
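
The caching rules above can be sketched roughly as follows. This is only
an illustration in Python, not PMDA code: MetricCache, its method names,
and the pull callback (a stand-in for asking pmclusterd for a value) are
all hypothetical, and the metric names are just example keys. Time
stamps are passed in explicitly, in ms, so the behaviour is easy to
follow.

```python
from dataclasses import dataclass

DEFAULT_REFRESH_MS = 2000  # the proposal's 2-second default


@dataclass
class CacheEntry:
    value: object
    fetched_at_ms: int  # time stamp saved together with the value


class MetricCache:
    """Hypothetical per-metric cache; none of these names are PCP API."""

    def __init__(self, intervals_ms, pull):
        self.intervals_ms = intervals_ms  # metric name -> refresh interval
        self.pull = pull                  # stand-in for a pmclusterd request
        self.entries = {}
        self.last_request_ms = {}
        self.prev_request_ms = {}

    def refresh_ms(self, metric):
        return self.intervals_ms.get(metric, DEFAULT_REFRESH_MS)

    def is_expired(self, metric, now_ms):
        # Expired once: saved time stamp + refresh interval < current time.
        entry = self.entries.get(metric)
        if entry is None:
            return True
        return entry.fetched_at_ms + self.refresh_ms(metric) < now_ms

    def fetch(self, metric, now_ms):
        # Remember the two most recent request times for this metric.
        if metric in self.last_request_ms:
            self.prev_request_ms[metric] = self.last_request_ms[metric]
        self.last_request_ms[metric] = now_ms
        # Serve from the cache unless the value is missing or expired.
        if self.is_expired(metric, now_ms):
            self.entries[metric] = CacheEntry(self.pull(metric), now_ms)
        return self.entries[metric].value

    def wants_periodic_refresh(self, metric):
        # True when the last two fetches were at most 2 intervals apart.
        prev = self.prev_request_ms.get(metric)
        if prev is None:
            return False
        gap = self.last_request_ms[metric] - prev
        return gap <= 2 * self.refresh_ms(metric)

    def should_stop(self, metric, now_ms):
        # True once three refresh intervals passed without a fetch.
        last = self.last_request_ms.get(metric)
        if last is None:
            return True
        return now_ms - last >= 3 * self.refresh_ms(metric)


# Example: a counter-like pull function that returns 1, 2, 3, ...
calls = []


def fake_pull(metric):
    calls.append(metric)
    return len(calls)


cache = MetricCache({"mem.physmem": 60000}, fake_pull)
first = cache.fetch("network.interface.in.bytes", 0)      # pulled
second = cache.fetch("network.interface.in.bytes", 1500)  # from cache
third = cache.fetch("network.interface.in.bytes", 2500)   # expired, pulled
```

Whether a value that is exactly one refresh interval old counts as fresh,
and whether "three refresh intervals" is inclusive, are choices made for
this sketch only.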
The following are some assumptions, ideas, and questions/answers that I
copied from emails that I exchanged with some of my colleagues at
SGI.
1) The 2 second default will be tunable.
2) Pushes for different metrics will be grouped.
3) We will define the refresh interval for each metric, meaning that we
will specify how long a value can be returned as is (how long we
think the value stays fresh). If the second request arrives after the
value has gone stale, we will start asking for values so that we have a
fresh value when the third request comes. We will stop collecting data
if more than twice the time between the first and second requests has
passed and a third request has not yet come. We can make this
configurable.
4) I will monitor all requests (possibly from multiple clients) and
refresh the data to accommodate the slowest pulling client.
5) To avoid disrupting PMCD, pmdacluster will return a "no value
available" error the first time it is queried if it takes too long to
get the metrics from pmclusterd (e.g. longer than 5 sec).
6) The API, and some tools, allow querying individual instances. We
still need to get instance domain info from each client when it
connects. This volume of initial setup traffic could be an argument for
limiting intentional client disconnects, which are currently used as a
method to change the list of metrics that should be pushed.
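
As a rough illustration of notes 3) and 4) together, the sketch below
computes how long to keep collecting a metric when several clients are
asking for it. It is one possible reading only: it assumes pmdacluster
can tell clients apart, that the "twice the time between requests"
window is measured from each client's latest request, and that request
times are recorded in ascending order in ms. The function names and the
client labels are hypothetical.

```python
STOP_FACTOR = 2  # note 3: "twice the time between requests"; configurable


def observed_gap_ms(request_times_ms):
    """Gap between the two most recent requests, or None if fewer than two."""
    if len(request_times_ms) < 2:
        return None
    return request_times_ms[-1] - request_times_ms[-2]


def collection_deadline_ms(requests_by_client, stop_factor=STOP_FACTOR):
    """Latest time (ms) to keep collecting, driven by the slowest client.

    Returns None when no client has made two requests yet.
    """
    deadlines = []
    for times in requests_by_client.values():
        gap = observed_gap_ms(times)
        if gap is not None:
            # Keep collecting until stop_factor gaps after the last request.
            deadlines.append(times[-1] + stop_factor * gap)
    return max(deadlines) if deadlines else None


# Example: one client asks every second, another every ten seconds.
requests = {"client-a": [0, 1000], "client-b": [0, 10000]}
deadline = collection_deadline_ms(requests)
```

With these inputs the slower client dominates, so collection would
continue until 30000 ms unless a further request arrives and moves the
deadline.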