
Re: [pcp] Apache agent issue

To: chandana@xxxxxxxxxxxxx
Subject: Re: [pcp] Apache agent issue
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Tue, 20 Mar 2012 09:04:37 +1100 (EST)
Cc: pcp@xxxxxxxxxxx
In-reply-to: <4F678FF4.8090207@xxxxxxxxxxxxx>
G'day Chandana,


> Hello All,
>
> I am running the apache pmda on a host, and find that the pmda stops
> working if apache is restarted. The only workaround that I have found is
> to restart the agent.
>
> Is there some known way of getting the apache agent to survive across
> apache restarts?
>
> I am having the same problem with mysql.

Yeah, this is not the correct behaviour for either PMDA - they should return an
error (PM_ERR_AGAIN, PM_ERR_NOTCONN or the more generic code
PM_ERR_VALUES) for each requested value while not connected, and should
be attempting to reconnect on each fetch request too.

For mysql, it looks like it's possible (see the DBD::mysql docs on cpan.org) to set
a DBI->connect attribute called "mysql_auto_reconnect" ... that'd be worth a
try (if you could, that'd be great).  If that doesn't work, we'd need to detect failure
via one of the DBI API calls (whichever fails), and "undef $dbh;".  That will
cause a reconnect attempt in all the right places, I think.
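
To illustrate the "drop the handle on failure, reconnect on the next fetch" idea
in isolation, here is a rough sketch in Python rather than the PMDA's Perl.  None
of these names come from PCP or DBI: FlakyBackend, make_connect, Agent and the
NOT_CONNECTED marker are all hypothetical stand-ins, and a real agent would
return a PMAPI error code where this returns the marker.

```python
class FlakyBackend:
    """Hypothetical stand-in for a server handle (not a PCP or DBI API)."""

    def __init__(self, state):
        self.state = state

    def query(self):
        # Fail the same way a dropped database connection would.
        if not self.state["ok"]:
            raise ConnectionError("server went away")
        return 42


def make_connect(state):
    """Build a connect() callable that fails while the server is 'down'."""
    def connect():
        if not state["ok"]:
            raise ConnectionError("connection refused")
        return FlakyBackend(state)
    return connect


class Agent:
    # Placeholder for whatever error value the agent reports per metric.
    NOT_CONNECTED = "not-connected"

    def __init__(self, connect):
        self._connect = connect
        self._handle = None

    def fetch(self):
        # Attempt a (re)connect on every fetch while there is no handle.
        if self._handle is None:
            try:
                self._handle = self._connect()
            except ConnectionError:
                return self.NOT_CONNECTED
        try:
            return self._handle.query()
        except ConnectionError:
            # Same idea as "undef $dbh;": forget the dead handle so the
            # next fetch tries to connect again.
            self._handle = None
            return self.NOT_CONNECTED


if __name__ == "__main__":
    state = {"ok": True}
    agent = Agent(make_connect(state))
    print(agent.fetch())      # server up
    state["ok"] = False
    print(agent.fetch())      # server restarted: error value, handle dropped
    state["ok"] = True
    print(agent.fetch())      # server back: data again without restarting the agent
```

The point is only that the agent process keeps running across the outage; each
fetch either returns data or an error value, and nothing needs restarting.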

Apache will be a different story.  It's a C agent, and uses libpcp_http.so, and
the interface there is at the URL level.  It's not clear why that call doesn't just fail
while apache is down, and give data again later.  I would guess (but you need to
look in pmcd.log and apache.log to be sure) that pmcd is closing the connection
to pmdaapache due to a timeout.  We'll need to figure out where that timeout is
coming from (which syscall in libpcp_http/src/http_fetcher.c::http_fetch blocks).
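
For comparison, this is roughly what "fail quickly instead of blocking" looks
like for a URL-level fetch.  It is a Python sketch using the standard library,
not the libpcp_http code; the function name fetch_status, the example URL and
the two-second default are assumptions made for illustration only.

```python
import http.client
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the response body as bytes, or None if the fetch fails.

    The timeout bounds the connect and each blocking read, so a server that
    is down or hung produces a prompt failure rather than a stalled caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (OSError, http.client.HTTPException):
        # OSError covers refused connections, timeouts and urllib's URLError.
        return None


if __name__ == "__main__":
    # Hypothetical status URL; adjust for the server being monitored.
    body = fetch_status("http://localhost/server-status?auto")
    print("no data this fetch" if body is None else body.decode(errors="replace"))
```

With a bound like this the agent can answer each fetch request within a known
time, report "no values" while apache is down, and simply try again next time.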

As a workaround, it's also possible to use pmie to notice when the agent exits
(pmcd.agent.status) and send a SIGHUP to pmcd (pmstore into pmcd.control.sighup).
But that's a sledgehammer - it'd be good to understand and fix the above issues.
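
If a pmie rule is not convenient, the same sledgehammer can be sketched as a
small polling script.  This is a swapped-in approach, not a pmie rule, and it
rests on several assumptions: that "pminfo -f pmcd.agent.status" prints one
'inst [N or "name"] value V' line per agent, that a non-zero value means the
agent is not running normally, and that storing 1 into pmcd.control.sighup is
enough to trigger the signal.  The helper names are hypothetical.

```python
import re
import subprocess

# Assumed shape of one instance line in "pminfo -f" output.
_INST_RE = re.compile(r'inst \[\d+ or "([^"]+)"\] value (-?\d+)')


def parse_agent_status(text):
    """Map agent name -> status value from pminfo -f output."""
    return {name: int(value) for name, value in _INST_RE.findall(text)}


def agents_needing_restart(status, watched):
    """Watched agents whose status is non-zero (assumed to mean unhealthy)."""
    return sorted(name for name in watched if status.get(name, 0) != 0)


def check_and_signal(watched=("apache", "mysql"), run=subprocess.run):
    """Poll once; ask pmcd to restart agents if any watched agent is down."""
    out = run(["pminfo", "-f", "pmcd.agent.status"],
              capture_output=True, text=True, check=True).stdout
    down = agents_needing_restart(parse_agent_status(out), watched)
    if down:
        # The value 1 is an assumption; the store itself is what matters.
        run(["pmstore", "pmcd.control.sighup", "1"], check=True)
    return down


if __name__ == "__main__":
    print("agents down:", check_and_signal())
```

Run from cron or a loop, this has the same drawback as the pmie approach: it
restarts agents from the outside rather than fixing the reconnect logic.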

cheers.

--
Nathan