Hey Frank,
Thanks for taking a hack at this.
While taking a look at your patch, imo the papi.c portion is a good
step of robustifying the fetch mechanism. However, I'd like to go a
step further and eliminate the warnings completely from the installation
of the pmda.
How about the following small patch in addition to yours? It removes
any warnings on startup, as we don't enable all the metrics we probe
initially, and then returns the default to 120 seconds. The qa would
need updating again, which I can provide. I would also like to remove
the phrasing that warnings are reasonable in the testsuite, imo in this
case, they aren't.
On a side note, I'm not sure what the changes in pmdapapi.1 are supposed
to mean? Are they mistakes? "#true -- ..." ?
Cheers,
Lukas
diff --git a/src/pmdas/papi/Install b/src/pmdas/papi/Install
index cdad731..e8cf05e 100644
--- a/src/pmdas/papi/Install
+++ b/src/pmdas/papi/Install
@@ -28,6 +28,8 @@ pmdaSetup
# be activated at the same time, even with multiplexing: the excess will
# come back with PM_ERR_VALUE.
pmdaInstall
+# Now allow for auto_enable functionality after iterating through metrics
+pmstore papi.control.auto_enable 120 >/dev/null
# Shut down any counters auto-enabled during the pmdaInstall step
pmstore papi.control.reset "" >/dev/null
diff --git a/src/pmdas/papi/papi.c b/src/pmdas/papi/papi.c
index 16474ca..a37f60f 100644
--- a/src/pmdas/papi/papi.c
+++ b/src/pmdas/papi/papi.c
@@ -57,7 +57,7 @@ typedef struct {
} papi_m_user_tuple;
#define METRIC_ENABLED_FOREVER ((time_t)-1)
-static __uint32_t auto_enable_time = 120; /* seconds; 0:disabled */
+static __uint32_t auto_enable_time = 0; /* seconds; 0:disabled */
static int auto_enable_afid = -1; /* pmaf(3) identifier for periodic callback
*/
static int enable_multiplexing = 1; /* on by default */
fche@xxxxxxxxxx (Frank Ch. Eigler) writes:
> Please see pcpfans.git fche/papi for a possible solution to the first
> problem Nathan's coverity note.
>
> commit d87918b71cd9d0b0a327325c80bf54105c7027e9
> Author: Frank Ch. Eigler <fche@xxxxxxxxxx>
> Date: Tue Feb 24 19:49:21 2015 -0500
>
> papi pmda: improve papi auto-enable & batching, related error handling
>
> A coverity report [1] indicated problems in the way errors from a
> sequence of operations were being (not) propagated out properly.
> Defining what's proper is not easy, because when reading N counter
> metrics, especially in auto-enable mode, may involve partial
> failures. The recently added refresh-batching logic made this aspect
> of the situation more confusing.
>
> In the partial failure scenario, some counters give values, others
> lack them (never having been activated), yet others fail (an
> activation having been rejected). Now, per-counter fetch status is
> separated into those categories at papi_fetchCallBack time: success,
> vs. PMDA_FETCH_NOVALUES vs. PM_ERR_VALUE.
>
> This is accomplished by always batching a single refresh_counter() at
> the beginning of a papi_fetch(), so that by the time the individual
> counters are fetched, we know whether they were requested or not, and
> whether those requests succeeded. This also allows first-time fetches
> of auto-enabled counters to carry proper initial values (so no more
> "no values ..." initial warnings from a pmval over an auto-enabled
> papi counter). The papi.control.batch metric is no longer used and so
> eliminated.
>
> [1] http://oss.sgi.com/archives/pcp/2015-02/msg00081.html
|