Re: [pcp] PCP / Zabbix Agent Loadable Module

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] PCP / Zabbix Agent Loadable Module
From: Marko Myllynen <myllynen@xxxxxxxxxx>
Date: Thu, 19 Nov 2015 14:19:13 +0200
Cc: pcp developers <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <851861832.14793166.1447809893203.JavaMail.zimbra@xxxxxxxxxx>
Organization: Red Hat
References: <563099A2.8040901@xxxxxxxxxx> <852045589.5144136.1446765609785.JavaMail.zimbra@xxxxxxxxxx> <5640F0C6.1080801@xxxxxxxxxx> <851861832.14793166.1447809893203.JavaMail.zimbra@xxxxxxxxxx>
Reply-to: myllynen@xxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
Hi,

On 2015-11-18 03:24, Nathan Scott wrote:
> ----- Original Message -----
>> On 2015-11-06 01:20, Nathan Scott wrote:
>>>>
>>>> - pmNewContext(PM_CONTEXT_LOCAL, "") of course works as expected from
>>>> standalone clients but seems to fail from the DSO. This means that the
>>>
>>> Can you paste the failure message somewhere?  (in case I can't get it
>>> to work, below)  Off the top of my head I don't know why it would fail
>>> in local context mode.
>>
>> this might be something completely Zabbix related after all, I tried
>> with the two attached tests which seem to work just fine.
> 
> I think it's the same problem as you would see with "pminfo -L" (compared
> to, say, "pminfo -L hinv" and "pminfo -n /var/lib/pcp/pmns/root_linux") -
> where PM_ERR_NOAGENT is ultimately returned from pmTraversePMNS(3) - at
> least, that's what was happening to me in the end.

right, not sure how I managed to overlook this.

> A zbxpcp.so configuration file with explicit metrics (rather than trying
> to expand all metrics possible via a local context) would resolve this.
> But, using pmcd avoids the above and is the best way to get all available
> metrics from all installed PMDAs into Zabbix, so perhaps we should use it
> unconditionally.

Yeah, let's just drop local context support and always use pmcd, makes
things simpler.

> I assumed there would be headers and a shared library which the zbxpcp DSO is
> including+linking with, which talks to zabbix_agent.  But, that's not entirely
> the case.  There's no shared library to link, it's all macros and (very simple)
> data structures in the headers.  Pretty much one header includes everything,
> and the structures are so simple that we could even consider a local copy (in
> PCP I mean) since they aren't shipped in any rpm & are license compatible.
> 
> Maybe.  It would have the endearing property of working out-of-the-box after
> a PCP install (provided the Zabbix API is stable, which it seems to be), as
> well as being very simple in the PCP build - no configure-scripting and no
> conditional-enablement needed.

I think this might be an acceptable approach: with inttypes.h + module.h
we'll have all we need. If the API ever changes then module.h will be
out of date, but to use a possible newer API we'd need more substantial
code changes anyway.

> Yep - it's all set up now and I see PCP values.  There seems to be a bit of
> a problem with instances - e.g. pcp.filesys.* - only the one instance is
> ever reported (instance ID zero) - is that the same for you?

It seems to work ok here:

[root@rhel-6-server ~]# pminfo -f filesys.mountdir

filesys.mountdir
    inst [0 or "/dev/dm-0"] value "/"
    inst [1 or "/dev/vda1"] value "/boot"
    inst [2 or "/dev/vdb"] value "/mnt/vdb"
[root@rhel-6-server ~]# zabbix_get -s 127.0.0.1 -p 10050 -k pcp.filesys.mountdir[/dev/dm-0]
/
[root@rhel-6-server ~]# zabbix_get -s 127.0.0.1 -p 10050 -k pcp.filesys.mountdir[/dev/vda1]
/boot
[root@rhel-6-server ~]#
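
To make the key-to-instance mapping in the transcript above concrete, here
is a minimal standalone sketch of how a key such as
pcp.filesys.mountdir[/dev/dm-0] breaks down into a PCP metric name and an
instance name. The function name split_pcp_key() and the buffer handling
are hypothetical and are not taken from zbxpcp; as far as I know the Zabbix
agent normally parses key parameters itself before handing them to a
module, so this only illustrates the mapping and is not how the module
has to do it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of zbxpcp or the Zabbix API.
 * Splits "pcp.<metric>[<instance>]" into its metric and instance parts.
 * The "[<instance>]" suffix is optional; inst is set to "" without it.
 * Returns 0 on success, -1 on malformed input or too-small buffers.
 */
int split_pcp_key(const char *key, char *metric, size_t mlen,
                  char *inst, size_t ilen)
{
    static const char prefix[] = "pcp.";
    const char *name, *lb, *rb;
    size_t n;

    assert(key != NULL && metric != NULL && inst != NULL);

    if (mlen == 0 || ilen == 0)
        return -1;
    /* The key must start with the "pcp." prefix used in the transcript. */
    if (strncmp(key, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    name = key + sizeof(prefix) - 1;

    /* Everything up to the first '[' (or end of string) is the metric. */
    lb = strchr(name, '[');
    n = lb ? (size_t)(lb - name) : strlen(name);
    if (n == 0 || n >= mlen)
        return -1;
    memcpy(metric, name, n);
    metric[n] = '\0';

    inst[0] = '\0';
    if (lb == NULL)
        return 0;

    /* The instance is whatever sits between '[' and a final ']'. */
    rb = strrchr(lb, ']');
    if (rb == NULL || rb[1] != '\0')
        return -1;
    n = (size_t)(rb - lb - 1);
    if (n >= ilen)
        return -1;
    memcpy(inst, lb + 1, n);
    inst[n] = '\0';
    return 0;
}
```

With the first key from the transcript this yields the metric
filesys.mountdir and the instance /dev/dm-0, which are the two strings one
would then hand to the PMAPI name and instance lookups.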

> There's some things we can improve in the way we're communicating with pmcd
> here, and a bunch of smaller details we'd need to figure out (tutorial, QA,
> man page, where to install a Zabbix module to, packaging, the timeout value,
> need to know a bit more about the Zabbix security model, diagnostics, and so
> on ...).

Improving pmcd communication sounds interesting. What did you have in
mind: client-side caching or something else?

I assumed a timeout is not needed, as pmcd kills unresponsive PMDAs?

What aspects of the Zabbix security model are relevant here? Anything the
user gives as an instance will go to pmLookupName(), which should handle
unsanitized input already, and if the agent is properly configured then
it can be queried from the trusted server only.
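
If an extra belt-and-braces check were ever wanted before the lookup, a
sketch could look like the following. It assumes (as I understand the PMNS
naming rules) that a metric name is a dot-separated list of components,
each starting with a letter and continuing with letters, digits or
underscores. The function name looks_like_metric_name() is hypothetical
and is not part of libpcp or zbxpcp, and the check applies to metric names
only, since instance names such as /dev/dm-0 can contain other characters.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical pre-check, not part of libpcp or zbxpcp.
 * Returns 1 if name looks like a dot-separated metric name whose
 * components start with a letter and continue with letters, digits
 * or '_'; returns 0 otherwise (including empty or dangling components).
 */
int looks_like_metric_name(const char *name)
{
    int at_start = 1;   /* next character must begin a component */

    if (name == NULL || *name == '\0')
        return 0;

    for (; *name != '\0'; name++) {
        unsigned char c = (unsigned char)*name;

        if (at_start) {
            if (!isalpha(c))
                return 0;       /* component must start with a letter */
            at_start = 0;
        } else if (c == '.') {
            at_start = 1;       /* a new component follows */
        } else if (!isalnum(c) && c != '_') {
            return 0;           /* character not allowed in a component */
        }
    }
    return !at_start;           /* reject a trailing '.' */
}
```

This would only reject obviously malformed names early; the real answer on
whether a metric exists still comes from the lookup against pmcd.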

Diagnostics are a bit tricky as there seems to be no method to log anything
properly from the module; messages to stdout from the initialization
functions will end up in the agent log. So far in use the current
diagnostics seem to have been working fairly well, though.

> But I think it's all doable - with a bit of work, this seems to me like it is
> definitely something we could include in PCP, if you are still keen to see
> this one merged Marko?

Sure, let's try to get it done (with the caveat that I'll try to prioritize
the pmrep work in the coming days).

Thanks,

-- 
Marko Myllynen
