Hey,
Nathan Scott <nathans@xxxxxxxxxx> writes:
[...]
> Fabulous - awesome effort esp. on the QA front. I ran out of time to
> review it all today (maybe someone else will?), but did sneak a quick
> background QA run in on RHEL 6 today. I'm seeing a few new failures
> there - see attached .bad files - any ideas on possible root causes?
Thanks for looking things over. I've pushed fixes for 3/4 of the
failures (diffstat and change updates below). For 967 and 813 the issue
was a difference in pmids for the TOT_INS metric, caused by the
dynamic pmns. Functionally, the tests were identical (and running
properly); to fix this I added a regex that matches the
papi.system.* pmids and outputs them as 126.0.NUMBER.
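As a rough illustration of that kind of filter (a Python sketch only, not
the actual filter used in qa/813 and qa/967; the function name and the
sample line are made up for the example, and only the 126.0.NUMBER form
comes from the description above):

```python
import re

# Hypothetical stand-in for the QA filter. The "126.0." prefix comes from
# the 126.0.NUMBER form described above; the rest is assumed.
PAPI_PMID = re.compile(r"\b126\.0\.\d+\b")

def filter_papi_pmids(text: str) -> str:
    """Replace the item part of any 126.0.<item> pmid with the literal NUMBER."""
    return PAPI_PMID.sub("126.0.NUMBER", text)

# Made-up sample line; the real item number depends on the hardware.
sample = "papi.system.TOT_INS PMID: 126.0.53"
print(filter_papi_pmids(sample))
```

With a filter like this the expected .out file no longer depends on which
item number the dynamic pmns happens to assign on a given machine.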
Testcase 903 was failing partly due to the dynamic pmns as well.
Apparently the number of metrics on that box was much lower, so it
didn't trigger the regex (which would have swapped the number for an
'X'). After testing it on a vm with no papi metrics available, I
lowered the regex threshold to match counts of 7 or greater. That covers
the 5 papi.control metrics, 1 papi.available metric, and at least one
actual papi.system.* metric.
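The arithmetic behind that threshold, sketched in Python (the real check
in qa/903 is an awk statement; the names below are hypothetical and only
illustrate the idea of masking any count at or above the minimum):

```python
# 5 papi.control metrics + 1 papi.available metric + at least 1
# papi.system.* metric, per the breakdown above.
MIN_PAPI_METRICS = 5 + 1 + 1

def mask_metric_count(count: int, minimum: int = MIN_PAPI_METRICS) -> str:
    """Return 'X' for any count at or above the minimum, else the count itself."""
    return "X" if count >= minimum else str(count)

# Made-up counts: a box with few counters and one with many both yield 'X'.
for count in (7, 120):
    print(count, "->", mask_metric_count(count))
```

Anything below the minimum is left as a plain number, so a box where the
papi metrics failed to appear at all would still show up as a diff.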
Testcase 799 failed for the same reason I mentioned in my original
email, and I'd be open to advice on how to fix it. The metrics I used
to force an ECNFLCT (if multiplexing is disabled) on my machine may not
exist on other machines. I'm not yet sure how to programmatically find
a combination of metrics that would cause such an error on the host qa
machine.
> Only other general piece of advice I can offer would be "release early,
> release often" - the first commit here is >1 month old, and it probably
> coulda been merged right away? *shrug* ... either way is fine, but I'd
> go for quicker, smaller merges every day.
Understood, I'll try to do so more often. The diffstat and commit
updates relevant to the above are posted below.
Cheers,
Lukas
--------------------------------------------------------------------------
qa/813 | 1 +
qa/813.out | 6 +++---
qa/903 | 2 +-
qa/967 | 1 +
qa/967.out | 26 +++++++++++++-------------
5 files changed, 19 insertions(+), 17 deletions(-)
Author: Lukas Berk <lberk@xxxxxxxxxx>
Date: Thu Nov 13 16:12:37 2014 -0500
Alter qa/903 awk statement to account for lower possible metric counts
The number of available papi metrics varies based on the system being
run on. Previously there would be a pmid for each possible metric, so
we could set the awk regex much higher. At this point, limit it to 7
or greater (one for each papi.control and one papi.available).
commit b44c3c0decfcfbff9b4ca315bea2f9cf354fdfae
Author: Lukas Berk <lberk@xxxxxxxxxx>
Date: Thu Nov 13 16:10:42 2014 -0500
Update qa testcases to account for dynamic papi pmns
The papi.system.TOT_INS metric is used in both qa/813 and qa/967
testcases. With the dynamic pmid's used, this metric may change
based on the hardware it's run on. Due to this, add a new
regex that matches 126.0.NUMBER, instead of a specific pmid
commit 629bc4ccaf3328c50d3d8b87cb176a60e3dcccb6
Author: Lukas Berk <lberk@xxxxxxxxxx>
Date: Wed Nov 12 18:55:21 2014 -0500
Add additional qa test that papi.control overrides auto_enable timeout
qa/967 tests that we can disable the auto_enable metric and use pmdapapi
as previously expected. We now add that despite having a timeout (for
the testcase's purposes we use a small one), papi.control.{enable,disable}
takes higher priority and will allow counters to remain active even after
the auto_enable timeout has been hit.