
Re: [pcp] Handling Oracle PMDA Latencies

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] Handling Oracle PMDA Latencies
From: Marko Myllynen <myllynen@xxxxxxxxxx>
Date: Thu, 24 Mar 2016 06:34:17 +0200
Cc: pcp developers <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <874656864.33839965.1458781570723.JavaMail.zimbra@xxxxxxxxxx>
Organization: Red Hat
References: <56F25541.9020602@xxxxxxxxxx> <874656864.33839965.1458781570723.JavaMail.zimbra@xxxxxxxxxx>
Reply-to: Marko Myllynen <myllynen@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0
Hi,

On 2016-03-24 03:06, Nathan Scott wrote:
> 
> There are many possible root causes for this domain instability.  We need to
> do root cause analysis and understand the issues properly to know how best to
> proceed in each case.
> 
> It's not helpful to paper over this kind of problem with long timeouts or "use
> more threads" or add code that returns PM_ERR_SORRY_I_CANT_HELP_YOU_RIGHT_NOW
> for the duration of the problem.  What people need is actual metric values and
> especially so at those difficult times.
> 
> For example, the Intel folk found a v$filestat query that could block for many
> *minutes*, with certain erm extreme database configurations.  This turned out
> to be an issue in Oracle itself, and not anything to do with machine load.

Hmm, ok, so if such latencies are found on further testing you're
basically saying that the answer is "fix Oracle"?

FWIW, I've seen some non-PCP-related cases of performance metric fetching
where a third-party vendor (not Oracle) was made aware of a hiccup in
their application under certain conditions, yet it took them months or
even longer to fix the issue. So while fixing the root cause is of
course the ideal approach, it's not always feasible or even possible.

Thanks,

-- 
Marko Myllynen
