pmlogextract Indom Corruption

To: pcp@xxxxxxxxxxx
Subject: pmlogextract Indom Corruption
From: Tom Yearke <tyearke@xxxxxxxxxxx>
Date: Mon, 02 Dec 2013 14:05:40 -0500
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.0.1
Hello,

We are currently seeing pmlogextract produce archives with corrupted instance domain definitions. We compared the output of pmdumplog -i on some original logs with its output on portions of those logs created with pmlogextract. Over periods where the instance domain definitions in the original logs do not change, the partial logs can contain many changes. These extra definitions can both omit instances that should be present and add instances from portions of the original log that were not included in the extraction. This makes it difficult to summarize and analyze these partial logs automatically and accurately for metrics with frequently changing instance domains, such as the proc.* metrics.

I've attached a small example that demonstrates these problems. node_archive is the original log after being run through pmlogrewrite to make the issue easier to see (the corruption occurs in the same way when pmlogextract is run on the unfiltered original logs). extract_archive is the result of running this command:

TZ="UTC" /usr/libexec/pcp/bin/pmlogextract -S "@ Nov 27 15:52:30" -T "@ Nov 27 15:54:30" node_archive extract_archive

At 10:53:44 log time, there is no change to the instance domains in the node archive. However, in the extracted archive at that time, process 1533 has been removed, despite having a value at that time, and process 1503 has been added, even though 1503 was removed before the time window given to pmlogextract. In this example, these two problems occur at the same time, but we have seen other extracted logs where one of the problems affects instance domains at some timestamps and the other problem affects instance domains at other timestamps.
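
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two instance domain dumps could be compared. It assumes pmdumplog is on the PATH and that the attached archives have been unpacked into the current directory; the helper names dump_indoms and compare_indoms and the output file names are made up for illustration. Because node_archive covers a longer period than extract_archive, the diff will also show entries outside the extraction window, so only the entries inside the window are relevant.

```shell
#!/bin/sh
# Sketch only: dump_indoms and compare_indoms are hypothetical helper names.

# Write the instance domain definitions of an archive to a text file.
# $1 = archive base name, $2 = output file
dump_indoms() {
    pmdumplog -i "$1" > "$2"
}

# Print the lines that differ between two dumps.
# Returns 0 when the dumps are identical, non-zero otherwise.
compare_indoms() {
    diff "$1" "$2"
}

# Run against the attached archives only when pmdumplog and the
# archive metadata file are actually present (assumed layout).
if command -v pmdumplog >/dev/null 2>&1 && [ -e node_archive.meta ]; then
    dump_indoms node_archive node.indom
    dump_indoms extract_archive extract.indom
    compare_indoms node.indom extract.indom
fi
```

Lines that appear only in extract.indom within the extraction window would correspond to the extra definitions described above.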

If someone could look into what might be causing these issues and offer any advice, we would be very appreciative.

Thank you!

Tom Yearke

Attachment: pcp_archives.zip
Description: Zip archive