Re: PCP Zabbix Agent PMDA

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: PCP Zabbix Agent PMDA
From: Marko Myllynen <myllynen@xxxxxxxxxx>
Date: Tue, 3 Nov 2015 11:38:39 +0200
Cc: pcp developers <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <y0mh9l4rp52.fsf@xxxxxxxx>
Organization: Red Hat
References: <56309994.8020404@xxxxxxxxxx> <y0mh9l4rp52.fsf@xxxxxxxx>
Reply-to: myllynen@xxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
Hi,

On 2015-11-03 00:51, Frank Ch. Eigler wrote:
> myllynen wrote:
> 
>> [...]  The PMDA was easy enough to implement so I did it anyway and
>> thought to share it here.  [...]
> 
> As a data source, perhaps zabbix is a bit too thin (low
> quality/quantity) to bother pull from, but if we did:
> 
>> [...]
>> +our $host  = '127.0.0.1';
>> +our $port  = '10050';
>> +our $srcip = '';
>> +our $pmda  = PCP::PMDA->new('zabbixagent', 480);
>> +
>> +# Example instance configuration
>> +our $net_indom = 0;
>> +our $vfs_indom = 1;
>> +our @net_insts = sort(split('\n', `ls -1 /sys/class/net`));
>> +our @vfs_insts = sort(split('\n', `awk '/^\\/|tmpfs/ {print \$2}' /proc/mounts`));
> 
> If the idea is that we'd poll remote zabbix servers too (as $host is
> configurable), then polling the local /sys or /proc files is going to
> give the wrong information.

A remote Zabbix agent (not a server) could be polled (but only one Zabbix
agent per PMDA), and yes, using local instances would then be wrong. But
as the code above and the man page state, this PMDA in its current form
should probably be considered more of an example or a starting point only.

>> [...]
>> +# Fetch command (could be replaced with direct socket communications)
>> +# https://www.zabbix.com/documentation/3.0/manual/appendix/items/activepassive
>> +if ($srcip ne '') {
>> +    $getcmd = "zabbix_get -s $host -p $port -I $srcip -k ";
>> +} else {
>> +    $getcmd = "zabbix_get -s $host -p $port -k ";
>> +}
> 
> How fast is this operation generally?  If it's a large fraction of a
> second or more, it'd bog down pmcd and other clients, so we'd have to
> use background threads or some other elaboration.

In the case of localhost it's practically instant (time(1) reports
0.002-0.006s real time for a few zabbix_get(1) test runs). For a remote
host it would of course take longer but I think it'd be more natural to
run the PMDA on the same host as the Zabbix agent and then query PMCD on
that host.
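
For reference, a rough sketch of what the "direct socket communications"
mentioned in the code comment could look like, which would also avoid
forking zabbix_get(1) for every fetch. It assumes the passive check
protocol as described in the 3.0 documentation linked above: the request
is the item key followed by a newline, and the reply is "ZBXD\x01", an
8-byte little-endian length and then the value. The function names are
made up for illustration and this is untested against a real agent.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Decode a passive check reply: "ZBXD\x01", 8-byte little-endian length,
# then the payload. Returns undef (empty list) if the reply looks malformed.
sub zbx_decode_reply {
    my ($buf) = @_;
    return if length($buf) < 13 || substr($buf, 0, 5) ne "ZBXD\x01";
    # Read the length as two 32-bit little-endian halves; item values are
    # small, so a non-zero high half is treated as an error here.
    my ($lo, $hi) = unpack('V V', substr($buf, 5, 8));
    return if $hi != 0 || length($buf) < 13 + $lo;
    return substr($buf, 13, $lo);
}

# Hypothetical stand-in for "zabbix_get -s $host -p $port -k $key".
# Note that Timeout only covers the connect, not the read.
sub zbx_get {
    my ($host, $port, $key, $timeout) = @_;
    my $sock = IO::Socket::INET->new(PeerAddr => $host, PeerPort => $port,
                                     Proto => 'tcp', Timeout => $timeout)
        or return;
    $sock->syswrite("$key\n");
    my $buf = '';
    # The agent closes the connection after replying, so read until EOF.
    while (sysread($sock, my $chunk, 4096)) {
        $buf .= $chunk;
    }
    close($sock);
    return zbx_decode_reply($buf);
}
```

With this, zbx_get('127.0.0.1', 10050, 'agent.ping', 2) would take the
place of the backtick call, returning undef when the agent cannot be
reached or the reply cannot be decoded.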

>> +sub zabbix_agent_connection_test {
>> +    $pmda->log("pinging $host");
>> +    my $res = `$getcmd agent.ping`;
>> [...]
>> +
>> +sub zabbix_agent_fetch_callback {
>> +    if (!defined($conn_ok)) {
>> +            zabbix_agent_connection_test();
>> +    }
>> +    return (PM_ERR_NOTCONN, 0) unless $conn_ok;
> 
> In a real deployment, you wouldn't want to give up for a single error
> like that.  How about just attempting the $getcmd all the time, and
> handling the timeout or whatnot error indication with PM_ERR_*?

Right, that was the intention, but the "$conn_ok = 0;" in
zabbix_agent_connection_test was extraneous. With that line removed,
$getcmd is tried whenever we are not already connected. And after adding
a check in zabbix_agent_fetch_callback that the result is not empty,
$getcmd is tried again if the Zabbix agent goes away.
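
Roughly, the fetch path could then look like the sketch below. This is
not the actual patch: fetch_value is a made-up name, the query is passed
in as a code ref so the logic can be tried outside pmcd, and the
PM_ERR_NOTCONN value is only a placeholder for the constant that
PCP::PMDA provides.

```perl
use strict;
use warnings;

# Placeholder so the sketch runs standalone; in the PMDA this constant
# comes from PCP::PMDA and has a different value.
use constant PM_ERR_NOTCONN => -1;

our $conn_ok;    # undef = unknown, 0 = last attempt failed, 1 = last attempt worked

# $query is a code ref that runs the actual check (the $getcmd backtick
# call in the PMDA) and returns its output, or '' / undef on failure.
sub fetch_value {
    my ($query, $key) = @_;
    my $res = $query->($key);
    chomp($res) if defined($res);
    if (!defined($res) || $res eq '') {
        # Remember the failure, but do not give up: the next fetch
        # simply tries the query again.
        $conn_ok = 0;
        return (PM_ERR_NOTCONN(), 0);
    }
    $conn_ok = 1;
    return ($res, 1);
}
```

So every fetch attempts the query, and an empty result is reported as an
error for that fetch only.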

>> [...]
>> +    my ($name, $mode) = $q =~ /(.*)\.(.*)/;
>> +    # Reformat the queried item key as needed
>> +    if (exists($insts{$name})) {
>> +            # vfs.fs.size.mode -> vfs.fs.size[mnt,mode]
>> +            $q = $name . "[$insts{$name}[$inst],$mode]";
>> +    [...]
> 
> This may be fine, but I'd be a bit concerned using plain regexps to
> parse data that may not always be the vanilla form we like.  What if
> there are special characters or more dots or quotes or something in
> the name components?

In addition to the indom definitions, this was the other part I was
referring to with my earlier comment about the ugliness of the code.
There are several Zabbix agent checks [1] where this wouldn't work
at all, but since the checks are all a bit different, handling them
elegantly isn't necessarily straightforward.

[1] https://www.zabbix.com/documentation/3.0/manual/config/items/itemtypes/zabbix_agent
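
For the simple name[instance,mode] case, a somewhat more defensive
variant could split on the last dot explicitly and quote parameters that
contain characters special to the item key syntax. The sketch below
assumes my reading of the key format (a parameter containing a comma or
a closing bracket must be double-quoted, with embedded double quotes
backslash-escaped); both function names are hypothetical, and it does
nothing for the checks that take a different parameter layout.

```perl
use strict;
use warnings;

# Quote one item key parameter if it holds characters that are special
# inside [...]: a comma, ], a double quote, a leading [, or leading or
# trailing whitespace. Quoting more than strictly needed is harmless.
sub zbx_quote_param {
    my ($p) = @_;
    return $p unless $p =~ /[,\]"]|^\[|^\s|\s$/;
    (my $q = $p) =~ s/"/\\"/g;
    return "\"$q\"";
}

# Turn a PCP-side name such as vfs.fs.size.pfree plus an instance name
# into an item key such as vfs.fs.size[/home,pfree]. Returns undef
# (empty list) if the name has no usable "name.mode" split.
sub zbx_item_key {
    my ($metric, $inst) = @_;
    my $dot = rindex($metric, '.');
    return if $dot <= 0 || $dot == length($metric) - 1;
    my $name = substr($metric, 0, $dot);
    my $mode = substr($metric, $dot + 1);
    return sprintf('%s[%s,%s]', $name,
                   zbx_quote_param($inst), zbx_quote_param($mode));
}
```

A mount point like /mnt/a,b would then become
vfs.fs.size["/mnt/a,b",free] instead of a key with three parameters. As
long as the key is passed to zabbix_get(1) through the shell, it would
need shell quoting on top of this.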

>> [...]
>> +$pmda->add_metric(pmda_pmid(0,0), PM_TYPE_STRING, PM_INDOM_NULL,
>> +            PM_SEM_INSTANT, pmda_units(0,0,0,0,0,0),
>> +            'zabbixagent.agent.hostname', '', '');
>> +$pmda->add_metric(pmda_pmid(0,1), PM_TYPE_U32, PM_INDOM_NULL,
>> +            PM_SEM_DISCRETE, pmda_units(0,0,0,0,0,0),
>> +            'zabbixagent.agent.ping', '', '');
>> [... * dozens ...]
> 
> Could this list of metadata be generated by querying zabbix 
> instead of being hard-coded here?

Probably (with zabbix_agentd -p), but currently the list of metrics
serves as "documentation" of what the code supports. Without the code
handling the checks properly, it wouldn't help much to list all of them
here.
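
If we went that way, parsing the output could look something like the
sketch below. It assumes each line of zabbix_agentd -p output has the
form "key  [t|value]", where t is a one-letter type (s and t for text,
u for unsigned, d for double, m for an error message such as
ZBX_NOTSUPPORTED). The mapping to PCP types is my guess, and the type
names are kept as plain strings so the sketch runs without PCP::PMDA.

```perl
use strict;
use warnings;

# Assumed mapping from the type letter to a PCP type name.
my %pcp_type = (s => 'PM_TYPE_STRING', t => 'PM_TYPE_STRING',
                u => 'PM_TYPE_U64',    d => 'PM_TYPE_DOUBLE');

# Parse lines of the assumed form "key   [t|value]" and return a list of
# [key, PCP type name] pairs. Lines that do not match, and items with an
# unmapped type letter (such as m for unsupported), are skipped.
sub parse_agent_items {
    my (@lines) = @_;
    my @items;
    for my $line (@lines) {
        next unless $line =~ /^(\S.*?)\s+\[([a-z])\|(.*)\]\s*$/;
        my ($key, $type) = ($1, $2);
        next unless exists $pcp_type{$type};
        push @items, [$key, $pcp_type{$type}];
    }
    return @items;
}
```

Each pair would still need a PMID and a valid PCP metric name before it
could be handed to add_metric, since keys with bracketed parameters
cannot be used as metric names directly.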

Thanks,

-- 
Marko Myllynen
