
To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] Dynamic Proc PMDA metrics
From: Martins Innus <minnus@xxxxxxxxxxx>
Date: Thu, 13 Nov 2014 15:01:11 -0500
Cc: pcp <pcp@xxxxxxxxxxx>
Nathan,

On 11/13/14 1:02 AM, Nathan Scott wrote:
>> Still no progress on this, except that with a dso pmda, the domain
>> number is correct at this point. Illustrated below in the interrupts case.
> Looks like a bug - in the cgroups case, the domain# is explicitly
> stamped into the PMID for the new metric table entry each time, so
> I suspect we've never seen this before as a result.

OK.  I haven't dug too far to see where this should happen.
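
For my own reference, the stamping you describe would look roughly like the
sketch below, right?  (Just a sketch - the helper name is made up, and I'm
going from memory on the pmid_build()/pmid_cluster() macros in impl.h.)

#include <pcp/pmapi.h>
#include <pcp/impl.h>
#include <pcp/pmda.h>

/*
 * Rough sketch, not real PMDA code: copy a template entry and rebuild
 * its PMID so the running PMDA's domain number is carried into the new
 * metric table entry.  "dom" would come from the pmdaInterface domain.
 */
static void
stamp_domain(pmdaMetric *dest, const pmdaMetric *source, int dom, int item)
{
    *dest = *source;    /* type, indom, semantics, units from the template */
    dest->m_desc.pmid = pmid_build(dom, pmid_cluster(source->m_desc.pmid), item);
}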

>> While researching this, I've run into a few issues that stem from
>> calling pmdaDynamicPMNS multiple times:
>>
>> 1. In the "size_metrictable" callback there is no context as to which
>> dynamic tree is being queried if we use the same callback for all trees.
> (*nod* - use different callbacks?)  That's the old cgroups model anyway.
> The PMID cluster# was the point of division used there - each cluster is
> managed separately IIRC.

OK.

>> So this is going to way overestimate the storage space needed, since we
>> need to return the size of the largest sub-tree if we use a common
>> callback.  The alternative is one callback per sub-tree as far as I can see.
> Yep.  And that's OK - there aren't many proc.* sub-trees (4/5?).

Sounds good. I can do that.
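
Roughly what I have in mind is one counting callback per sub-tree, along the
lines of the sketch below (untested, all names invented, and assuming the
same (int *total, int *trees) shape as the existing size callbacks - I may
well have the meaning of total vs. trees backwards):

/* Invented example counters, filled in by the refresh code in practice. */
static int lines_count;
static int other_count;

/*
 * Sketch: one counting callback per dynamic sub-tree, so a shared
 * callback no longer has to report the size of the largest tree.
 */
static void
interrupt_lines_size(int *total, int *trees)
{
    *total = 1;             /* template metrics in this sub-tree (my reading) */
    *trees = lines_count;   /* copies needed, one per interrupt line */
}

static void
interrupt_other_size(int *total, int *trees)
{
    *total = 1;             /* template metrics in this sub-tree (my reading) */
    *trees = other_count;   /* copies needed, one per "other" counter */
}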

>> 2. This is a larger problem. In dynamic.c, for the function:
>>
>>     static pmdaMetric *
>>     dynamic_metric_table(int index, pmdaMetric *offset)
>>
>> When running through the initial metric table, the only check that
>> is done is to see if the cluster matches when doing "mtabupdate", and no
>> check is done against the item or name.  This will cause multiple calls
>> for the same metric if we have clusters that span multiple trees (since
> If we take away "clusters spanning trees" this all gets simpler, I think?
> Am I oversimplifying things there?  (apologies if so - I'll look at it
> more closely as soon as I get this cgroup stuff finished).

That's exactly correct. The existing dynamic code assumes clusters don't
span trees, so this problem goes away if they are kept distinct.

This assumption is also made in the following methods:

pmdaDynamicLookupPMID
pmdaDynamicLookupText

These will use the first tree that has the matching cluster.

Actually, thinking this through, something like the attached patch could fix this issue once we can get the domain # in the source metric. I can't really test it until the domain # is there, since the search fails currently. I'm also not sure whether all those searches will cause any performance issues.
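
To be explicit about the lookup behaviour I mean, it boils down to a
first-match-by-cluster scan, roughly like this sketch (not the real
dynamic.c code - structure and names invented):

/*
 * Simplified sketch of the behaviour described above.  Each registered
 * dynamic tree owns a set of clusters; a lookup by PMID returns the
 * first tree whose set contains the PMID's cluster, which is only safe
 * while no cluster is shared between trees.
 */
struct dynamic_tree {
    const char  *prefix;    /* e.g. a proc.* or cgroup.* sub-tree */
    int         *clusters;  /* clusters owned by this tree */
    int         nclusters;
};

static struct dynamic_tree  *dynamic;
static int                  ndynamic;

static struct dynamic_tree *
lookup_by_cluster(int cluster)
{
    int i, j;

    for (i = 0; i < ndynamic; i++)
        for (j = 0; j < dynamic[i].nclusters; j++)
            if (dynamic[i].clusters[j] == cluster)
                return &dynamic[i];     /* first match wins */
    return NULL;
}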

While testing this, I also found that cgroup.c doesn't call pmdaTreeRebuildHash. I put a call into the "namespace" function, but it likely doesn't mesh with your recent changes - I just added something to get it running for now.


>> 3. In general, it's not clear to me what the proper way is to generate the
>> dynamic metrics from the initial "templates". For example, take the
> Yeah, the code here is pretty freaky and non-obvious as I recall.

>>     /* kernel.percpu.interrupts.line[<N>] */
>>     { NULL, { PMDA_PMID(CLUSTER_INTERRUPT_LINES, 0), PM_TYPE_U32,
>>       CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) }, },
>>     /* kernel.percpu.interrupts.[<other>] */
>>     { NULL, { PMDA_PMID(CLUSTER_INTERRUPT_OTHER, 0), PM_TYPE_U32,
>>       CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) }, },

>> These are used as templates to create the dynamic metrics. But
>> interrupts.c reports the total number of dynamic metrics in its
>> "size_metrictable" callback, so we end up with at least 2 extra
>> metrics which, as far as I can tell, are not used.
> That was not the original intention, and sounds like a bug - it was
> intended this would be allocating the correct amount of space, but
> it looks like there are a few extras.  It'd be good to dig more deeply
> into that and understand why.

I looked into this a bit more. It's two different issues. The first is just an off-by-one error in the PMDA: it should allocate N-1 new slots for each group, since the original "template" entry is used for metric 0. That can be fixed in the linux PMDA pretty easily. The second is another instance of the allocation problem in "size_metrictable": since the trees are unbalanced, it allocates the maximum of the two, and we end up with one extra slot in the smaller tree.
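
To spell out the arithmetic (reading the orphaned .10/.11 items as 11 "line"
counters and 10 "other" counters - correct me if I have that wrong):

/*
 * Worked example with my assumed counts: 11 "line" and 10 "other".
 *
 * Off-by-one: the template already occupies item 0, so each group only
 * needs N-1 new slots:
 *     lines:  11 counters -> 10 new entries (items 1..10)
 *     other:  10 counters ->  9 new entries (items 1..9)
 * Allocating N instead leaves an unused item 11 in both groups.
 *
 * Shared size callback: a common size_metrictable() can only report
 * max(11, 10) = 11 for both, so the smaller "other" group gets one
 * further unused slot (its item 10).
 */
static int
extra_entries_needed(int counters)
{
    /* the template is entry 0, so one fewer new entry than counters found */
    return counters > 0 ? counters - 1 : 0;
}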

>> So both of the .11 and one of the .10 metrics are orphaned?  Does that
>> cause any issue?  Or, since the pmns doesn't reference them, is there no
>> harm beyond the memory used?
> Yeah, I think that's the case.

>> If that's the case, for my issue #2 above I can just set dummy cluster
>> or item values for the metrics that are sent through twice (or more),
>> but that's pretty wasteful and I'm sure it will have other unintended
>> consequences.
> Not that I've seen to date, but yep, we should certainly fix that up.

OK.  This should be sorted out if we fix the above.

Martins

Attachment: dynamic_c.patch
Description: Text Data
