Changes committed to git://git.pcp.io/kenj/pcp master
Ken McDonell (10):
qa/709: notrun for any PCP_PLATFORM other than Linux (pmcollectl)
qa/666 & qa/common.check: handle broken Debian valgrind
qa/admin/pcp-daily: re-enable valgrind group on Debian stretch hosts
qa/578: increase tolerance for expected openfd values
qa/914: notrun if there are no real hardware counters here
qa/870: (new) test integrity of pmlogger control files
qa/381: additional diagnostics for debugging
qa/956: additional diagnostics for debugging
src/include/pcp.env: Mac OS X change
src/pmlogger/src/ports.c: fix broken logic for primary control file
qa/381                   |  14 ++-
qa/578                   |  21 +++--
qa/578.out               |  12 +--
qa/666                   |   3
qa/709                   |  10 ++
qa/870                   | 173 +++++++++++++++++++++++++++++++++++++++++++++++
qa/870.out               |   7 +
qa/914                   |   8 +-------
qa/956                   |   4 -
qa/admin/pcp-daily       |   5 -
qa/common.check          |  12 ++-
qa/group                 |   1
src/include/pcp.env      |   9 +-
src/pmlogger/src/ports.c | 119 ++++++++++++++++++++++++++------
14 files changed, 350 insertions(+), 48 deletions(-)
Details ...
commit e607bbc64a18e7ad8c50503341dd3119231804e7
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri Jul 22 06:48:38 2016 +1000
src/pmlogger/src/ports.c: fix broken logic for primary control file
This was the root cause of the qa/1108 failures.
The logic that checked for and stopped more than one primary pmlogger
from running was broken. Specifically, using stat() instead of
lstat() to check for a symbolic link will always fail, because stat()
follows the link and reports on its target. That drove us down the
"old-style hardlink" path and unconditionally removed
$PCP_TMP_DIR/pmlogger/primary before the existence check that was
intended to stop multiple primary loggers from running.
This error seems to have been introduced in commit 7148bf11 (almost
12 months ago) ... sigh.
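
To illustrate the stat() versus lstat() distinction described above,
here is a minimal sketch. It is not the code from ports.c; the function
names is_symlink() and is_symlink_via_stat() are hypothetical and exist
only for this example.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>
#include <unistd.h>

/*
 * Correct check: lstat() reports on the path itself, so a symbolic
 * link is seen as a link.
 * Returns 1 if path is a symlink, 0 if it is something else,
 * -1 if lstat() fails (e.g. the path does not exist).
 */
int is_symlink(const char *path)
{
    struct stat sbuf;

    if (lstat(path, &sbuf) < 0)
        return -1;
    return S_ISLNK(sbuf.st_mode) ? 1 : 0;
}

/*
 * Broken check (illustration only): stat() follows the link and
 * fills sbuf with the attributes of the target, so S_ISLNK() can
 * never be true here. This function never returns 1.
 */
int is_symlink_via_stat(const char *path)
{
    struct stat sbuf;

    if (stat(path, &sbuf) < 0)
        return -1;
    return S_ISLNK(sbuf.st_mode) ? 1 : 0;
}
```

With a symlink that points at an existing regular file, is_symlink()
returns 1 while is_symlink_via_stat() returns 0, which is the kind of
false negative that sends the caller down the wrong code path.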
And to compound the problem, a primary pmlogger was conditionally
removing $PCP_TMP_DIR/pmlogger/primary at exit, meaning that if we
ever got 2 (or more!) primary pmloggers running and any one of them
exited, the primary control file would be removed, and pmlogger_check
would stumble along later and start another primary pmlogger running.
So now we are checking the pid from the symlink and only removing
the primary control file if this instance of pmlogger created it.
Also cleaned up some misleading diagnostics.
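
The ownership check described above (read the pid from the symlink,
remove the link only if it is ours) could look roughly like the sketch
below. This is not the code from ports.c. It assumes the primary
symlink points at a control file whose last path component is the
decimal pid of the owning pmlogger, and the names pid_from_symlink()
and remove_primary_if_ours() are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Assumption for this sketch: linkpath is a symlink whose target ends
 * in "/<pid>". Returns that pid, or -1 if linkpath is not a readable
 * symlink or the last component is not a positive decimal number.
 */
long pid_from_symlink(const char *linkpath)
{
    char target[4096];
    const char *base;
    char *end;
    long pid;
    ssize_t len;

    /* readlink() does not NUL-terminate, so leave room and do it here */
    len = readlink(linkpath, target, sizeof(target) - 1);
    if (len < 0)
        return -1;
    target[len] = '\0';

    base = strrchr(target, '/');
    base = (base != NULL) ? base + 1 : target;

    pid = strtol(base, &end, 10);
    if (end == base || *end != '\0' || pid <= 0)
        return -1;
    return pid;
}

/*
 * Remove the symlink only when it names mypid.
 * Returns 1 if removed, 0 if it belongs to another pid (left alone),
 * -1 if the link could not be read, parsed, or removed.
 */
int remove_primary_if_ours(const char *linkpath, long mypid)
{
    long owner = pid_from_symlink(linkpath);

    if (owner < 0)
        return -1;
    if (owner != mypid)
        return 0;
    return (unlink(linkpath) == 0) ? 1 : -1;
}
```

An exiting pmlogger would call remove_primary_if_ours() with its own
pid, so a second primary pmlogger that lost the race cannot delete the
link that belongs to the first.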
commit 7ca4c81e25425aa592a0b853e1bebb55843031e2
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri Jul 22 06:46:15 2016 +1000
src/include/pcp.env: Mac OS X change
In _get_pids_by_name() we need to also accommodate ps(1) output that
has the executable name enclosed in () ... this was causing QA failures
for qa/956 on Mac OS X.
commit d4858c9de1ff9dc86601cbc42f5633e94ed17f58
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri Jul 22 06:45:03 2016 +1000
qa/956: additional diagnostics for debugging
commit dc6dfd1ff23b5102f147e8a87f09502ffe4f6150
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri Jul 22 06:44:30 2016 +1000
qa/381: additional diagnostics for debugging
commit 4a9298eab7b86504f3287c2386483efde17fa663
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri Jul 22 06:32:40 2016 +1000
qa/870: (new) test integrity of pmlogger control files
These are the ones in $PCP_TMP_DIR/pmlogger. And getting this test
to pass will address the root cause of the non-deterministic qa/1108
failures.
This test can be run with a --check argument, which runs the
integrity check without any of the test cases and is silent if all
is well. In this form it could be used with check.callback to run
the check after every test, to help identify any test that leaves
the control files in a bad state.
commit 00ae066eedfaa1ef971a15266ffb00733e997b9b
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Wed Jul 20 11:12:33 2016 +1000
qa/914: notrun if there are no real hardware counters here
The PAPI PMDA may have been built, but the platform may be lame
hardware or a crippled VM with no support for hardware counters.
commit 6c58b9e89dbf04d67d991831a1f61e4ed24281fd
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Wed Jul 20 09:58:28 2016 +1000
qa/578: increase tolerance for expected openfd values
Based on a suggestion from Nathan that the failures in this test
may be related to non-determinism coming from the recently added
parallelism in the socket connection code, change the filtering to
accept +/-1 from the (previously) expected value.
commit fe6f79f6af659b63e105413ed8d8e472b5c54ebe
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Wed Jul 20 09:39:10 2016 +1000
qa/admin/pcp-daily: re-enable valgrind group on Debian stretch hosts
commit 3156256a4b85eeefde4b515f3ed1b38c85c4b098
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Wed Jul 20 09:37:49 2016 +1000
qa/666 & qa/common.check: handle broken Debian valgrind
Filter out bogus lines from the current Debian stretch version
of valgrind.
commit f22f7a9d60a381ce8e647f798d2ed139b5437a97
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Tue Jul 19 20:12:27 2016 +1000
qa/709: notrun for any PCP_PLATFORM other than Linux (pmcollectl)