
Re: [pcp] RFC - pmie "ruleset" extension

To: pcp@xxxxxxxxxxx
Subject: Re: [pcp] RFC - pmie "ruleset" extension
From: Mark Goodwin <mgoodwin@xxxxxxxxxx>
Date: Tue, 24 Jun 2014 11:33:44 +1000
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <53A8D17C.8060808@xxxxxxxxxxxxxxxxxxx>
References: <53A8AA17.5070205@xxxxxxxxxxxxxxxx> <53A8AB4E.9090003@xxxxxxxxxxxxxxxx> <y0mtx7b5f7z.fsf@xxxxxxxx> <53A8D17C.8060808@xxxxxxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
On 06/24/2014 11:16 AM, Keith Owens wrote:
We have complicated tests with multiple errors, multiple warnings and a final
OK if everything is sane. Nagios expects a single packet every minute or so
giving a return code and a line of text. With the emphasis on 'single'.
Currently I have to resort to repeated test negation to ensure that only one
rule will trigger; it's messy.

Not just messy, but rather inefficient and probably error-prone too:
despite the repeated negation, the postfix.queues.* metrics could
change between the multiple rule evaluations, so more than one rule
could trigger. Actually, whether that can happen depends on whether
pmie does one fetch per rule or one fetch per update cycle (I'm not
sure which).
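
To make the concern concrete, here is a small Python sketch (not pmie
code) of the race. It assumes, purely for illustration, that each rule
is evaluated against its own fetch; whether pmie really behaves that way
is exactly the open question above. The thresholds and queue lengths are
made up.

```python
# Hypothetical thresholds, for illustration only.
HELD_CRITICAL = 1
DEFERRED_CRITICAL = 100

def rule_held(sample):
    # Rule 1: any held email is critical.
    return sample["hold"] >= HELD_CRITICAL

def rule_deferred(sample):
    # Rule 2: negates rule 1, then checks the deferred queue.
    return not rule_held(sample) and sample["deferred"] >= DEFERRED_CRITICAL

def fired(samples):
    """Evaluate each rule against its own sample; return the rules that fire."""
    rules = [("held", rule_held), ("deferred", rule_deferred)]
    return [name for (name, rule), s in zip(rules, samples) if rule(s)]

# The hold queue drains between the two evaluations.
before = {"hold": 1, "deferred": 150}
after = {"hold": 0, "deferred": 150}

per_rule_fetch = fired([before, after])   # each rule sees a different fetch
single_fetch = fired([before, before])    # both rules see the same fetch
print(per_rule_fetch, single_fetch)
```

With a different sample per rule, both rules fire and Nagios would get
two packets; with one shared sample the negation does its job and only
the first rule fires.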

In any case, Ken's syntax extension is a winner :)

-- Mark



delta = 1 min;

held_critical = <%= held_critical %>;
deferred_critical = <%= deferred_critical %>;
incoming_critical = <%= incoming_critical %>;
active_critical = <%= active_critical %>;
active_warning = <%= active_warning %>;

// Any held email is critical

(postfix.queues.hold #total >= $held_critical)
         -> shell "$send_nagios -S POSTFIX -s 2 %v held emails\n";

// Check for deferred emails, critical only

!(postfix.queues.hold #total >= $held_critical) &&
(postfix.queues.deferred #total >= $deferred_critical)
         -> shell "$send_nagios -S POSTFIX -s 2 %v deferred emails\n";

// Check for incoming/maildrop emails, critical only

!(postfix.queues.hold #total >= $held_critical) &&
!(postfix.queues.deferred #total >= $deferred_critical) &&
(postfix.queues.incoming #total + postfix.queues.maildrop #total >=
$incoming_critical)
         -> shell "$send_nagios -S POSTFIX -s 2 %v incoming/maildrop emails\n";

// Check for active emails, critical and warning

!(postfix.queues.hold #total >= $held_critical) &&
!(postfix.queues.deferred #total >= $deferred_critical) &&
!(postfix.queues.incoming #total + postfix.queues.maildrop #total >=
$incoming_critical) &&
(postfix.queues.active #total >= $active_critical)
         -> shell "$send_nagios -S POSTFIX -s 2 %v active emails\n";

!(postfix.queues.hold #total >= $held_critical) &&
!(postfix.queues.deferred #total >= $deferred_critical) &&
!(postfix.queues.incoming #total + postfix.queues.maildrop #total >=
$incoming_critical) &&
!(postfix.queues.active #total >= $active_critical) &&
(postfix.queues.active #total >= $active_warning)
         -> shell "$send_nagios -S POSTFIX -s 1 %v active emails\n";

// Otherwise OK

!(postfix.queues.hold #total >= $held_critical) &&
!(postfix.queues.deferred #total >= $deferred_critical) &&
!(postfix.queues.incoming #total + postfix.queues.maildrop #total >=
$incoming_critical) &&
!(postfix.queues.active #total >= $active_critical) &&
!(postfix.queues.active #total >= $active_warning)
         -> shell "$send_nagios -S POSTFIX -s 0 %v active emails\n";
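
For comparison, the intent of the six rules above is a first-match-wins
cascade over one snapshot of the queue lengths. The Python sketch below
is not pmie syntax; it only illustrates that intent. The function name
postfix_status, the dictionary keys and the threshold values in the usage
lines are assumptions made up for the example, and the text for the
incoming/maildrop case reports the sum of the two queues, which may not
be what %v expands to in the rules above.

```python
def postfix_status(q, t):
    """Return one (nagios_state, text) pair for a single snapshot.

    q: queue lengths, t: thresholds (both hypothetical dictionaries).
    Checks are ordered by severity; the first match wins, so no
    negation of earlier checks is needed.
    """
    incoming = q["incoming"] + q["maildrop"]
    checks = [
        (q["hold"] >= t["held_critical"], 2, f"{q['hold']} held emails"),
        (q["deferred"] >= t["deferred_critical"], 2,
         f"{q['deferred']} deferred emails"),
        (incoming >= t["incoming_critical"], 2,
         f"{incoming} incoming/maildrop emails"),
        (q["active"] >= t["active_critical"], 2, f"{q['active']} active emails"),
        (q["active"] >= t["active_warning"], 1, f"{q['active']} active emails"),
    ]
    for matched, state, text in checks:
        if matched:
            return state, text
    # Otherwise OK.
    return 0, f"{q['active']} active emails"

# Made-up thresholds and samples.
thresholds = {"held_critical": 1, "deferred_critical": 100,
              "incoming_critical": 50, "active_critical": 200,
              "active_warning": 100}
quiet = {"hold": 0, "deferred": 3, "incoming": 1, "maildrop": 0, "active": 5}
busy = dict(quiet, active=150)
stuck = dict(quiet, hold=2, deferred=500)

print(postfix_status(quiet, thresholds))
print(postfix_status(busy, thresholds))
print(postfix_status(stuck, thresholds))
```

Each call returns exactly one state and one line of text, which is the
'single packet' behaviour Nagios expects.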


On 24/06/14 10:22, Frank Ch. Eigler wrote:
kenj wrote:

[...]
In this use case, pmie needs to be able to

(a) emit a message to indicate an alert, else
(b) emit "OK" (or the moral equivalent)
[...]
Is the idea to have pmie normally report OK every polling interval (10
seconds or whatever)?  Can you say more about the interconnection of
pmie and nagios?

- FChE

_______________________________________________
pcp mailing list
pcp@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/pcp
