pmlogextract optimization

To: pcp@xxxxxxxxxxx
Subject: pmlogextract optimization
From: Martins Innus <minnus@xxxxxxxxxxx>
Date: Wed, 23 Sep 2015 16:16:07 -0400
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
Hi,
For a while we have been seeing very slow performance from pmlogextract when processing archives with lots of "proc" information, that is, archives with a large number of changing instances. Profiling showed that more than 90% of the time was spent in pmGetInDomArchive.

From my analysis, this code path was hit when:

- the config listed metrics, but not instances to filter
- pmGetInDomArchive is called by gram.y -> dometric
- a list of all instances is generated
- metriclist.c -> searchmlist compares each instance against this list to decide whether it should be passed through


Since the list is generated from the same archive whose instances are then compared against it, the test will always pass. I assume this was done so that the same code could be used regardless of whether or not instance filtering was desired.

Here is a proposed optimization to short-circuit this step:

https://github.com/ubccr/pcp/tree/pmlogextract
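
As a rough illustration of the idea only (this is not the code in the branch above), the short-circuit can be sketched in C as follows. The names metric_filter, all_instances and want_instance are hypothetical and do not correspond to the actual pmlogextract data structures; the sketch assumes a per-metric flag that records "the config named no instances" so the instance list never has to be built or searched.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-metric filter entry; NOT the real pmlogextract type. */
typedef struct {
    bool       all_instances; /* config listed the metric but no instances */
    const int *insts;         /* explicit instance ids, used only when filtering */
    size_t     ninsts;        /* number of entries in insts */
} metric_filter;

/* Return true if instance `inst` should be passed through to the output. */
static bool
want_instance(const metric_filter *f, int inst)
{
    size_t i;

    /* Short-circuit: every instance is wanted, so skip the list search. */
    if (f->all_instances)
        return true;

    /* Otherwise fall back to a linear search of the configured instances. */
    for (i = 0; i < f->ninsts; i++) {
        if (f->insts[i] == inst)
            return true;
    }
    return false;
}
```

With a flag like this, the "all instances" case costs a single check per instance, and only configs that actually name instances pay for a list lookup.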

We noticed a speedup of 10x-100x depending on the archives processed. Processing time went from minutes to seconds for many archives.

All QA in the pmlogextract group passes with this change.

Martins





commit 7ddf5dbbd8f1ea9eb682a49d991a3b274bad2d95
Author: Martins Innus <minnus@xxxxxxxxxxx>
Date:   Wed Sep 23 19:30:33 2015 +0000

    pmlogextract optimization

    If we want all instances, don't build a list of all instances to
    compare against.  Just pass through all instances.

 src/pmlogextract/gram.y       | 23 ++++-------------------
 src/pmlogextract/metriclist.c | 21 +++++++++++++--------
 2 files changed, 17 insertions(+), 27 deletions(-)
