
RE: pmlogger: fetch.c: changed & PMCD_ADD_AGENT:

To: "'Dave Brolley'" <brolley@xxxxxxxxxx>
Subject: RE: pmlogger: fetch.c: changed & PMCD_ADD_AGENT:
From: "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx>
Date: Tue, 21 Jun 2016 17:19:49 +1000
Cc: "'PCP Mailing List'" <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <574C98EE.9010504@xxxxxxxxxx>
References: <574C98EE.9010504@xxxxxxxxxx>
Thread-index: AQJDvApKWauIx3825aCWW6aSt6bJcZ8Pnnog

Apologies Dave for not responding earlier ...

#include <stdexcuses.h>

> -----Original Message-----
> From: Dave Brolley [mailto:brolley@xxxxxxxxxx]
> Sent: Tuesday, 31 May 2016 5:48 AM
> To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
> Cc: PCP Mailing List <pcp@xxxxxxxxxxx>
> Subject: pmlogger: fetch.c: changed & PMCD_ADD_AGENT:
> 
> Hi Ken,
> 
> I'm working on http://oss.sgi.com/bugzilla/show_bug.cgi?id=1100 and
> have a couple of questions for you.
> 
> 
> 1.    I'm looking at the code in src/pmlogger/src/fetch.c where 'if
> (changed & PMCD_ADD_AGENT)' is handled. It seems to me that this test
> which adds a mark record in the case a pmda (re)starts (outside the
> loop which handles the received pdus) is too late, since pdus
> representing several potentially changed/reset metrics may have been
> erroneously logged before then. Should the test not be made, and the
> mark record not be generated inside the loop at the point at which the
> change is first noticed?

I think the code is wrong.

The protocol is ...
-> client sends request
<- pmcd sends error pdu with error code > 0 AND
<- pmcd sends response pdu

So the __pmGetPDU loop in fetch.c gets and decodes these in the correct order,
but the test and putmark() call should be inside the loop, before the pmResult
is output.
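
A rough sketch of the ordering I mean follows. This is not the real fetch.c code: sim_pdu, SIM_PDU_ERROR, SIM_PDU_RESULT, SIM_ADD_AGENT, emit() and sim_fetch() are hypothetical stand-ins invented for illustration, and clearing the flag after the mark is written is my assumption. The only point is that the "changed" test sits inside the loop, so the mark record is written before the first result that follows the state change.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the real PCP definitions. */
enum sim_pdu_type { SIM_PDU_ERROR, SIM_PDU_RESULT };
#define SIM_ADD_AGENT 0x1       /* plays the role of PMCD_ADD_AGENT */

struct sim_pdu {
    enum sim_pdu_type type;
    int code;   /* SIM_PDU_ERROR only: > 0 carries pmcd state-change bits */
};

/* Append one character per record "logged": 'M' = mark, 'R' = result. */
static void emit(char *log, size_t cap, char rec)
{
    size_t n = strlen(log);
    if (n + 1 < cap) {
        log[n] = rec;
        log[n + 1] = '\0';
    }
}

/*
 * Walk a stream of PDUs in arrival order.  An error PDU with code > 0 only
 * records the state change; the mark is written inside the loop, just before
 * the next result is output.  Returns 0 on success, or the negative code of
 * a genuine error PDU.
 */
int sim_fetch(const struct sim_pdu *pdus, size_t n, char *log, size_t cap)
{
    int changed = 0;
    size_t i;

    assert(cap > 0);
    log[0] = '\0';

    for (i = 0; i < n; i++) {
        if (pdus[i].type == SIM_PDU_ERROR) {
            if (pdus[i].code > 0)
                changed |= pdus[i].code;    /* state change notification */
            else if (pdus[i].code < 0)
                return pdus[i].code;        /* genuine error */
            continue;
        }
        /* result PDU: test for the change BEFORE the result is output */
        if (changed & SIM_ADD_AGENT) {
            emit(log, cap, 'M');
            changed &= ~SIM_ADD_AGENT;      /* assumption: one mark per change */
        }
        emit(log, cap, 'R');
    }
    return 0;
}
```

Feeding it result, error(SIM_ADD_AGENT), result, result should log the mark ahead of the second result, not after the last one.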

> 2.    I'm considering two possibilities for checking the consistency of
> the PMNS+metadata:
> 
>       1.      Check the consistency of all metrics in all task list items
> at this point
> 
>               *       pro: It gets the check out of the way with no
> additional infrastructure at the point of the change
>               *       con: It might be too much work to do in one shot at this
> point in the fetch/log cycle?
> 
> 
> 
>       2.      Check the consistency of metrics as they are fetched later
> 
>               *       pro: less work now and allows the current log to
> continue until an inconsistent metric is actually to be fetched
>               *       pro: pmlogger may potentially continue indefinitely,
> since inactive metrics may never be flagged
>               *       con: error may be harder to relate to the actual
> event, since it may be detected much later
>               *       con: some additional infrastructure needed to flag
> those metrics which still need to be checked

I'd vote for Plan 1. (which I understand you've already decided on).

Just one thing: what sort of "consistency" checks are you planning here?  And I
assume these are hiding behind the putmark() guard, correct?
