Re: [pcp] Forw: matahari: comparing Sigar and PCP for data gathering

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: [pcp] Forw: matahari: comparing Sigar and PCP for data gathering
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Wed, 6 Apr 2011 20:40:46 +1000 (EST)
Cc: pcp@xxxxxxxxxxx, matahari@xxxxxxxxxxxxxxxxxxxxxx
In-reply-to: <y0m8vvolock.fsf@xxxxxxxx>
----- Original Message -----
> Hi -
> 
> RH folks asked us to analyze how PCP could fit in with matahari
> (https://fedorahosted.org/matahari/), seeing that matahari already had
> a sort-of-similar tool underneath it, Sigar. I undertook to compare
> Sigar (1.6.5, rawhide, http://support.hyperic.com/display/SIGAR/Home)
> to PCP (3.5.0, f15+, http://oss.sgi.com/projects/pcp), in context of
> contemplating switching in matahari from the former to the latter.

There may be a few levels of "how PCP could fit" for matahari.  Another
is a shared toolchain for Windows, which pcp-glider is tasked with
providing for PCP installation on Windows.  Setting up a working
environment on Windows for tools that need access to kernel internals
(so, non-Cygwin) and Win32 API calls takes a bit of effort, so there
may be some work that could be reused there as well.

> Sigar provides a variety of functions, each of which returns the local
> system's performance/system-information data in structs. PCP provides
> a broader API, where similar data may individually be extracted by
> supplied "metric name" from local or remote hosts.

(or archives, if that matters).

> sigar_proc_stat_t
> 
> typedef struct {
> sigar_uint64_t total; proc.runq.unknown + proc.runq.kernel + others
> sigar_uint64_t sleeping; proc.runq.sleeping
> sigar_uint64_t running; proc.runq.runnable
> sigar_uint64_t zombie; proc.runq.defunct
> sigar_uint64_t stopped; proc.runq.stopped
> sigar_uint64_t idle; proc.runq.blocked
> sigar_uint64_t threads; ? (not used by matahari)
> } sigar_proc_stat_t;
> 
> This is used in matahari src/lib/host.c:host_get_processes() to send
> qpid process_statistics. The fields generally correspond to the PCP
> "proc.runq.*" metrics as listed above. No apparent Windows support on
> Sigar/PCP.
> 
> Source Sigar:src/sigar.c PCP:src/pmdas/linux/proc_runq.c
> 
> ------------------------------------------------------------------------
> sigar_loadavg_t / sigar_loadavg_get
> 
> typedef struct {
> double loadavg[3]; kernel.all.load [1,5,15]
> } sigar_loadavg_t;
> 
> Basic stuff. Not available on windows on Sigar/PCP.
> 

Not available on Windows at all - the kernel doesn't support
the 1/5/15 minute load average concept.
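
For reference, the three values in question can be read on Linux and
the BSDs with getloadavg(3).  This is only a minimal sketch; the helper
name read_loadavg is made up for illustration and is not taken from
Sigar or PCP, both of which have their own collection code.

```c
#define _DEFAULT_SOURCE 1   /* expose getloadavg() on glibc in strict modes */
#include <assert.h>
#include <stdlib.h>         /* getloadavg() (glibc/BSD extension) */

/* Hypothetical helper, not part of Sigar or PCP.
 * Fills out[0..2] with the 1-, 5- and 15-minute load averages.
 * Returns 0 on success, -1 if the values could not be obtained. */
int read_loadavg(double out[3])
{
    assert(out != NULL);
    /* getloadavg() returns the number of samples written, or -1. */
    return getloadavg(out, 3) == 3 ? 0 : -1;
}
```

A caller would declare "double la[3];", call read_loadavg(la), and map
la[0], la[1] and la[2] onto the loadavg[3] field shown above.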

> Source Sigar:src/os/linux/linux_sigar.c
> PCP:src/pmdas/linux/proc_loadavg.c
> 
> ------------------------------------------------------------------------
> sigar_sys_info_t
> 
> typedef struct {
> char name[SIGAR_SYS_INFO_LEN];
> char version[SIGAR_SYS_INFO_LEN];
> char arch[SIGAR_SYS_INFO_LEN]; kernel.uname.machine
> char machine[SIGAR_SYS_INFO_LEN]; kernel.uname.machine
> char description[SIGAR_SYS_INFO_LEN];
> char patch_level[SIGAR_SYS_INFO_LEN];
> char vendor[SIGAR_SYS_INFO_LEN];
> char vendor_version[SIGAR_SYS_INFO_LEN];
> char vendor_name[SIGAR_SYS_INFO_LEN]; kernel.uname.sysname
> char vendor_code_name[SIGAR_SYS_INFO_LEN]; kernel.uname.release
> } sigar_sys_info_t;
> 
> This is used in matahari src/lib/host.c:host_get_architecture() and
> host_get_operating_system(), to fetch only a few fields. PCP's
> kernel.uname.distro ("Fedora release 13 (Goddard)") and pmda.uname
> ("Linux very.elastic.org 2.6.34.8-68.fc13.x86_64 #1 SMP Thu Feb 17
> 15:03:58 UTC 2011 x86_64") may be useful too.
> 
> Sigar: src/sigar.c, src/os/linux/linux_sigar.c, src/os/win32/win32_sigar.c
> PCP: src/pmdas/linux/pmda.c, src/pmdas/windows/pmda.c
> 
> ------------------------------------------------------------------------
> sigar_mem_t / sigar_mem_get
> 
> typedef struct {
> sigar_uint64_t
> ram,
> total, mem.physmem, hinv.physmem
> used,
> free, mem.freemem
> actual_used,
> actual_free;
> double used_percent;
> double free_percent;
> } sigar_mem_t;
> 
> Used by matahari for host_get_memory/host_get_mem_free to fetch only a
> few fields. PCP also exposes a bunch of mem.util.* values from
> /proc/meminfo on linux.
> 
> Sigar: src/os/linux/linux_sigar.c
> PCP: src/pmdas/linux/pmda.c, src/pmdas/windows/pmda.c
> ------------------------------------------------------------------------
> sigar_swap_t / sigar_swap_get
> 
> typedef struct {
> sigar_uint64_t
> total, mem.util.swapTotal, swap.length
> used, swap.used
> free, mem.util.swapFree, swap.free
> page_in, swap.in
> page_out; swap.out
> } sigar_swap_t;
> 
> Used by matahari for host_get_swap/host_get_swap_free to fetch only
> a few fields. PCP also exposes a bunch of swap.*, swapdev.* values on
> linux.
> 
> Sigar: src/os/linux/linux_sigar.c
> PCP: src/pmdas/linux/pmda.c, src/pmdas/windows/pmda.c
> ------------------------------------------------------------------------
> sigar_cpu_info[_list]_t / sigar_cpu_info_list_get/destroy
> 
> typedef struct {
> char vendor[128]; hinv.cpu.vendor
> char model[128]; hinv.cpu.model
> int mhz;
> int mhz_max;
> int mhz_min;
> sigar_uint64_t cache_size;
> int total_sockets;
> int total_cores; n/a
> int cores_per_socket;
> } sigar_cpu_info_t;
> 
> typedef struct {
> unsigned long number; hinv.ncpu
> unsigned long size;
> sigar_cpu_info_t *data;
> } sigar_cpu_info_list_t;
> 
> Used by matahari in src/lib/host.c:host_get_cpu_details to grab a few
> fields. The number-of-cores query could be hard-coded into matahari
> for now; PCP should be extended to provide the same info. OTOH the
> Sigar measure is heuristic (just runs on a single cpu, not on each of
> them), so it's already not reliable. See also PCP hinv.cpu.*.
> 
> Sigar: src/os/linux/linux_sigar.c, src/sigar_util.c
> PCP: src/pmdas/linux/pmda.c
> 
> ------------------------------------------------------------------------
> sigar_net_info_t / sigar_net_info_get
> 
> typedef struct {
> char default_gateway[SIGAR_INET6_ADDRSTRLEN];
> char default_gateway_interface[16];
> char host_name[SIGAR_MAXHOSTNAMELEN]; n/a
> char domain_name[SIGAR_MAXDOMAINNAMELEN];
> char primary_dns[SIGAR_INET6_ADDRSTRLEN];
> char secondary_dns[SIGAR_INET6_ADDRSTRLEN];
> } sigar_net_info_t;
> 
> Used in matahari src/lib/utilities.c:matahari_hostname() to fetch just
> the host_name field. PCP does not appear to pass back such an
> uncooked gethostname() value, but matahari could call gethostname()
> directly.

A pmcd.hostname metric would be easy to add.  The matahari folk might
be avoiding making winsock calls at all?  (Understandable, as it's a
quirky API.)
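
On the POSIX side, the direct gethostname() call suggested above is
only a few lines.  The sketch below is an illustration rather than
matahari's code: the helper name raw_hostname is hypothetical, and the
Windows/winsock variant is deliberately left out.

```c
#define _DEFAULT_SOURCE 1   /* expose gethostname() on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>         /* gethostname() */

/* Hypothetical helper, not part of matahari, Sigar or PCP.
 * Copies the "uncooked" host name into buf and guarantees NUL
 * termination.  Returns 0 on success, -1 on failure. */
int raw_hostname(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (gethostname(buf, len) != 0)
        return -1;
    /* POSIX does not promise termination if the name was truncated. */
    buf[len - 1] = '\0';
    return 0;
}
```

A caller would pass a buffer of its own, for example one sized like the
host_name field in sigar_net_info_t.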

> Sigar:src/sigar.c(sigar_net_info_get)
> ------------------------------------------------------------------------
> sigar_net_interface_list_t / sigar_net_interface_list_get/destroy
> sigar_net_interface_config_t / sigar_net_interface_config_get/destroy
> 
> typedef struct {
> char name[16]; instance names from network.interface.*
> char type[64];
> char description[256];
> sigar_net_address_t hwaddr; n/a; should be network.interface.hwaddr
> sigar_net_address_t address; network.interface.inet_addr
> sigar_net_address_t destination;
> sigar_net_address_t broadcast;
> sigar_net_address_t netmask;
> sigar_net_address_t address6;
> int prefix6_length;
> int scope6;
> sigar_uint64_t
> flags,
> mtu, network.interface.mtu
> metric;
> int tx_queue_len;
> } sigar_net_interface_config_t;
> 
> typedef struct {
> unsigned long number;
> unsigned long size;
> char **data;
> } sigar_net_interface_list_t;
> 
> 
> Used by matahari in src/lib/network.c to gather all the interfaces and
> used in several places. PCP does not appear to provide as much
> per-interface ioctl(SIOCG*) data currently, nor IPv6. This represents
> a missing PCP feature that may take a few weeks to bring up to par.
> 
> Sigar: src/os/linux/linux_sigar.c, src/sigar.c
> PCP: src/pmdas/linux/proc_net_dev.c, src/pmdas/windows/pmda.c
> 
> ------------------------------------------------------------------------
> sigar_net_address_to_string
> 
> This function is approximately supplanted by the PCP
> network.interface.inet_addr metric, which gives strings back:
> 
> % pminfo -L -d -F network.interface.inet_addr
> network.interface.inet_addr
> Data Type: string InDom: 60.17 0xf000011
> Semantics: instant Units: none
> inst [0 or "lo"] value "127.0.0.1"
> inst [1 or "eth0"] value "192.168.1.1"
> 
> ------------------------------------------------------------------------
> sigar_proc_kill
> 
> This oddball kill(2) abstracter function could be pulled into
> matahari. The windows port consists of a dozen lines of code in
> Sigar:src/sigar_signal.c
> 
> ------------------------------------------------------------------------
> 
> I believe this represents a mapping between the Sigar stuff used by
> matahari, and what's available from equivalent local PCP sources. The
> PCP API, being a little more generic, is a little more wordy than
> Sigar, but not too much so. The only nontrivial amount of work here
> appears to be the network-interface metadata, which is rather richer
> in Sigar than in PCP.
> 
> The design of PCP makes it straightforward to extend it with data
> sources like this via out-of-tree PMDAs, so no change of the PCP core
> code is required to add the missing data. OTOH the PCP people are
> friendly to contributions, so merging such things into the master tree
> would probably not be a big deal.

*nod*, I'd expect no problem at all.

> Thanks for reading through all this. Any questions?
> 

I read only the brief intro to matahari on the site you pointed to,
and was surprised that one metric it mentions ("... capabilities ranging
from monitoring system uptime") is not in that list.  Uptime is also
a metric in PCP, on both Windows and Linux (kernel.all.uptime), so I'd
expected it to be in your Sigar list too.
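
To show how little is involved on Linux, here is a rough sketch that
reads the uptime in seconds via sysinfo(2).  It is an assumption-laden
illustration only: the helper name system_uptime_seconds is made up,
the call is Linux-specific, and it is not meant to describe how PCP or
Sigar collect the value.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/sysinfo.h>    /* sysinfo(2), Linux-specific */

/* Hypothetical helper, not part of matahari, Sigar or PCP.
 * Stores the seconds since boot in *out.
 * Returns 0 on success, -1 on failure. */
int system_uptime_seconds(long *out)
{
    struct sysinfo si;

    assert(out != NULL);
    if (sysinfo(&si) != 0)
        return -1;
    *out = si.uptime;       /* seconds since boot */
    return 0;
}
```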

cheers.

-- 
Nathan
