
Re: introductory pcp questions

To: "Mark D. Anderson" <mda@xxxxxxxxxxxxxx>
Subject: Re: introductory pcp questions
From: Ken McDonell <kenmcd@xxxxxxxxxxxxxxxxx>
Date: Fri, 2 Feb 2001 06:01:41 +1100
Cc: <pcp@xxxxxxxxxxx>
In-reply-to: <00e001c085dc$73603c80$0201a8c0@xxxxxxx>
Reply-to: <kenmcd@xxxxxxx>
Sender: owner-pcp@xxxxxxxxxxx
On Wed, 24 Jan 2001, Mark D. Anderson wrote:

> i just ran into pcp while looking at another project on the oss.sgi page....
>
> i've looked through the web site and through the downloads for some kind
> of overview, and couldn't find any. i looked through the powerpoint slides
> by ken mcdonnel, and i've read "man PCPIntro" and "man PMAPI".
> somewhere between the trees and the earth should be a forest....

I'd consider http://oss.sgi.com/projects/pcp/ to be an overview of the
features in the open source release of PCP.  And the link to the IRIX
product via http://www.sgi.com/software/co-pilot/ is helpful and
mentioned on the oss page ... if I take your questions and my answers
and add these to the FAQ page, would that have helped?

> the man pages mention a "Programmer's Guide" and a "Tutorial", but
> perhaps those are only available as part of the commercial product?

The books are available on-line:

Performance Co-Pilot User's and Administrator's Guide
http://techpubs.sgi.com/library/tpl/cgi-bin/browse.cgi?coll=0650&db=bks&cmd=toc&pth=/SGI_Admin/PCP_UAG

Performance Co-Pilot Programmer's Guide
http://techpubs.sgi.com/library/tpl/cgi-bin/browse.cgi?coll=0650&db=bks&cmd=toc&pth=/SGI_Developer/PCP_PG

These should be referenced on the oss web page, and I'll fix that.

The tutorial is shipped in the IRIX product ... it needs some work to
Linux-ise it and check it for correctness, and to cull the parts that
are not in the open source release ... I don't have the resources to do
this, but if someone wanted to volunteer I could turn the HTML over to
them.

> so far as i can make out, the basic architecture consists of:
> - console and GUI monitor clients that can subscribe to a real-time feed
>   or an archive feed
> - a per-host pmcd daemon that brokers between the clients and agents
> - pmda agent daemons that are per-host and per-namespace
> - a per-host pmlogger daemons that archive data locally, from pmcd to disk
> but i'm not even sure of that much.

Mostly correct.  The only open source GUI piece is pcpmon from Michal
Kara.  There are other GUI tools that SGI owns, but these remain
proprietary for both IRIX and Linux.

PMDAs are not per-namespace ... each PMDA supports some part of a
unified namespace, and the namespace is per-host, defined by the
set of PMDAs installed on that host.

You can have as many pmloggers as you like (each is collecting from a
single host), e.g. one system as a logger farm collecting logs for
multiple managed systems, or multiple loggers collecting logs from
the same machine, but for different purposes (tactical operations mgmt
vs. capacity planning, for instance).  The pmloggers can be on the
same system as PMCD or remote.

> here are some of the basic parameters i would have hoped to have found
> answers to and didn't, and what i've been able to divine so far:
>
> - license.
> http://oss.sgi.com/projects/pcp/license.html indicates this is (mostly)
> LGPL, while the downloads indicate it is GPL (even the libraries). This is
> not a trivial distinction.

The libraries are LGPL, everything else is GPL.  If you find anything
contrary to this, please let me know the details.

> - language support.
> it appears that agents and clients have to be written in C.

Pretty much.  Some clients are C++.  There has been some experimentation
with Perl, but C is the common choice.

> - security
> i see no provision for client or server authentication, provision for 
> encryption, integrity checks, etc.

Nope.  The simple model is IP-based allow/disallow of client connections
for PMCD and pmlogger.  The data is not that interesting ... 8^)>

> - clocks
> i see no provision for clock synchronization.

Nope.  This is system-level performance monitoring, with a bias for big
systems, so sampling of the order of a few seconds up to tens of
minutes ... clock skew in practice is not an issue.  We are not trying
to do event traces that require microsecond accuracy in the
timestamps.

> - query versus notify
> i can find no protocol definition, so i can't tell whether monitors must
> query for particular data, or whether they can subscribe to asynchronous
> notifications.

See the Programmer's Guide for the protocol definitions and the
operational model.

The model is "synchronous pull": the clients explicitly ask for data
when they want it.  There is no push, broadcast, callback or other
asynchronous notification of data values, although pmie(1) can be used to
perform periodic sampling and raise asynchronous alarms (of any flavour)
when something interesting happens.

> - sampled vs. events
> said another way, can a monitor ask for qualitative events (threshold 
> passing), instead of regularly sampled snapshots?

Use the PMAPI directly for sampling (most of the monitoring tools are
like this).  Use pmie(1) for filtering and events.

> - connection vs. connectionless
> i can find no protocol definition, so i can't tell whether it is stateful
> or not, let alone what provisions it has for resumption after a connection
> loss.

It is mostly connection oriented (TCP/IP between the client and PMCD).
When a connection is lost, the client library will automatically
attempt reconnection with a controlled maximal rate of trying (uses a
variant of exponential back-off).  The error-handling regime for the
clients already supports "no data currently available" for lots of
reasons (a PMDA is not installed, PMCD was restarted, or the
connection to PMCD was lost), so there is typically very little that the
client developer needs to do to handle this gracefully.
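
As an illustration of a capped exponential back-off, here is a minimal
sketch in C.  The initial delay, the cap and the function name are all
hypothetical; the reply only says the client library uses "a variant of
exponential back-off", so this is not a description of the actual
PCP code.

```c
/* Hypothetical limits, not taken from the PCP client library. */
#define BACKOFF_INITIAL_SEC 5
#define BACKOFF_MAX_SEC     80

/*
 * Return the delay in seconds to wait before reconnect attempt number
 * `attempt` (0 for the first retry).  The delay doubles after each
 * failure and is clamped at BACKOFF_MAX_SEC, which bounds the maximum
 * rate of retrying.
 */
int backoff_delay(int attempt)
{
    int delay = BACKOFF_INITIAL_SEC;
    int i;

    if (attempt < 0)
        attempt = 0;
    for (i = 0; i < attempt && delay < BACKOFF_MAX_SEC; i++)
        delay *= 2;
    if (delay > BACKOFF_MAX_SEC)
        delay = BACKOFF_MAX_SEC;
    return delay;
}
```

A caller would sleep for backoff_delay(n) seconds after the n-th
consecutive failure and reset n to zero once a connection succeeds.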

> nor can i tell whether it is message-per-row, message-per-request, or what.
> nor whether the protocol allows pipelining, or multiple asynchronous 
> requests, etc.
> nor whether it is the same protocol between monitor, pmcd, pmlogger, and pmda.

For monitors, once the initial meta data exchanges are complete, there
is typically one message to PMCD and one message back from PMCD for
each sample, independent of the number of metrics requested and the
number of instances (values) to be returned.

pmlogger is a monitor, so the same applies.

Each message from a monitor client is sent by PMCD to one or more PMDAs;
PMCD then collates the replies from each PMDA that was asked to
help and returns a single message to the client.  It is an important
part of the design that:
    - clients are ignorant of the de-multiplexing and multiplexing
      by PMCD
    - PMDAs are ignorant of each other
    - PMCD knows nothing, except how to act as a message switcher
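
The message-switching role can be sketched roughly as follows.  This
is a toy model in C, not PMCD's implementation: the struct layouts, the
function names and the per-metric callback are all hypothetical, and a
real switch would batch the metrics for each agent into one message
rather than calling the agent once per metric.

```c
#include <stddef.h>

/* Hypothetical request item: `domain` identifies the owning agent. */
struct metric_req {
    int domain;
    int item;     /* metric number within that agent */
};

/* Hypothetical agent callback: returns the value for one item. */
typedef long (*agent_fetch_fn)(int item);

struct agent {
    int domain;
    agent_fetch_fn fetch;
};

/*
 * Answer one client request: route each metric to the agent that owns
 * its domain and store the answer in the matching slot of `values`.
 * ok[i] is 0 when no agent serves that domain ("no data available").
 * Returns the number of metrics answered.
 */
int switch_fetch(const struct agent *agents, size_t nagents,
                 const struct metric_req *reqs, size_t nreqs,
                 long *values, int *ok)
{
    int answered = 0;
    size_t i, a;

    for (i = 0; i < nreqs; i++) {
        ok[i] = 0;
        values[i] = 0;
        for (a = 0; a < nagents; a++) {
            if (agents[a].domain == reqs[i].domain) {
                values[i] = agents[a].fetch(reqs[i].item);
                ok[i] = 1;
                answered++;
                break;
            }
        }
    }
    return answered;
}

/* Two toy agents that know nothing about each other. */
long toy_cpu_agent(int item)  { return 100 + item; }
long toy_disk_agent(int item) { return 200 + item; }
```

The caller sees one request and one collated reply; which agent
supplied which value is invisible to it.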

> - agent-side computation
> obviously a monitor can compute anything it likes.
> but can a monitor request that a agent do some server-side computation
> before sending the resulting data back, either across measurements (say,
> changing units or adding together), or across time (running average, etc.).

This is certainly possible, but we've tended to discourage it ...
philosophically I believe any interval-based aggregation belongs in the
clients.  The PMDA cannot see the client state and does not know which
client it is responding to at the moment, so you'd need to add some
additional state using the pmStore(3) interface to selectively modify
state in the PMDA from a client (this is typically used to toggle debug
flags or enable optional instrumentation, and changing units would be
in this category).
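
To illustrate interval-based aggregation done on the client side, here
is a minimal sketch in C that turns two successive samples of a counter
into a rate.  The struct and function names are hypothetical, and the
-1.0 "no rate" convention is an assumption of this sketch, not
something defined by the PMAPI.

```c
/* One observation of a counter metric (hypothetical client-side type). */
struct sample {
    double when;              /* timestamp in seconds */
    unsigned long long value; /* monotonically increasing counter */
};

/*
 * Convert two successive counter samples to a per-second rate.
 * Returns -1.0 when no rate can be computed: no elapsed time, or the
 * counter went backwards (for example after an agent restart).
 */
double counter_rate(struct sample prev, struct sample cur)
{
    double elapsed = cur.when - prev.when;

    if (elapsed <= 0.0 || cur.value < prev.value)
        return -1.0;
    return (double)(cur.value - prev.value) / elapsed;
}
```

Because each client keeps its own previous sample, different clients
can use different sampling intervals against the same agent without
the agent holding any per-client state.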

> - agent-side filtering
> similarly, what kinds of filters can a monitor request?

There is no streams-like concept.  Each monitor is free to choose
which metrics it wants, and for metrics with multiple instances the
client can select one, some or all of the instances to be returned
in each fetch.
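
The "one, some or all" choice of instances can be pictured with a
small sketch in C.  The function below is hypothetical and only models
the idea of a selection list; it is not the PMAPI's profile mechanism.

```c
#include <stddef.h>

/*
 * Copy into `out` the instance ids from `all` that the client asked
 * for.  A NULL `wanted` list means "all instances".  `out` must have
 * room for nall entries.  Returns the number of ids selected.
 */
size_t select_instances(const int *all, size_t nall,
                        const int *wanted, size_t nwanted,
                        int *out)
{
    size_t i, j, n = 0;

    for (i = 0; i < nall; i++) {
        if (wanted == NULL) {
            out[n++] = all[i];
            continue;
        }
        for (j = 0; j < nwanted; j++) {
            if (all[i] == wanted[j]) {
                out[n++] = all[i];
                break;
            }
        }
    }
    return n;
}
```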

> - fast localhost monitoring
> is there a shared memory or similar mechanism for monitoring an
> application's "counters" without the overhead of tcp/ip communication?

Yes, see PM_CONTEXT_LOCAL in pmNewContext(3).

> - triggers
> i see no indication of what external integrations have been done for
> actions to be taken based on various events (paging, email, etc.)

pmie(1) actions are arbitrary ... there are some canned ones, but then
there is the "execute this command" action ... the latter has been used
to do pager events, and integrate events into larger frameworks like
OpenView, UniCenter TNG, Enlighten DSM, ESP (from SGI).

> - agent collapsing of requests
> if 10 monitors ask agent for the same regularly sampled data, does it
> measure it 10 times, or just once?

Up to the agent.  For expensive-to-collect information some agents use
a "return most recent observation" strategy, others use time-decayed
caches, others collect on demand.
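
A time-decayed cache of the kind mentioned above might look like this
minimal sketch in C.  All names here are hypothetical, and the toy
probe stands in for whatever expensive measurement a real agent
performs; no existing agent is being described.

```c
/* Hypothetical cache for one expensive-to-collect value. */
struct cached_value {
    long value;      /* most recent observation */
    double fetched;  /* time of that observation, in seconds */
    int valid;       /* 0 until the first collection */
};

/* Hypothetical collector signature: performs the real measurement. */
typedef long (*collect_fn)(void);

/*
 * Return the cached value if it is younger than max_age seconds,
 * otherwise collect a fresh one.  Several clients asking within
 * max_age of each other cause a single measurement.
 */
long cached_fetch(struct cached_value *c, double now, double max_age,
                  collect_fn collect)
{
    if (!c->valid || now - c->fetched >= max_age) {
        c->value = collect();
        c->fetched = now;
        c->valid = 1;
    }
    return c->value;
}

/* Toy collector that counts how often it is really invoked. */
int probe_calls = 0;
long example_probe(void)
{
    probe_calls++;
    return 42;
}
```

Setting max_age to zero degenerates to "collect on demand", and an
agent that refreshes the cache from a timer instead of from fetches
gives the "return most recent observation" strategy.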

> if i ask for both user and system time, will it be smart enough to do this
> in one operation, not two?

Absolutely.  More to the point, if you ask for user time, system time,
per disk read transfer rate, network packet rates in and out and apache
web stats in one request, you'll get all of the answers back in a
single message round trip.

> - discovery
> can an agent automatically discover a monitor? can a monitor automatically
> discover an agent?

Agents are ignorant of clients (except via a pmStore(3) channel).
Clients discover agents mostly through the namespace at the PMCD (or
archive), but also through the "no data currently available" error
codes when requests are made.

> - metadata
> i found some discussion of a name to oid mapping by pmns, but no definition
> of how a monitor queries the schema or instances available from a
> particular agent, nor if there is a way to get notified of schema
> evolution, or instance addition/deletion.
> nor can i find an explanation of what can be declared about a name besides
> its type.

The client explores the namespace using pmGetChildren(3) and
pmTraversePMNS(3) for either one-level at a time expansion or recursive
expansion.
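
The recursive style of expansion can be sketched with a toy in-memory
tree in C.  The node type, the visitor signature and the function names
are hypothetical and do not reproduce the signatures of
pmGetChildren(3) or pmTraversePMNS(3); the sketch only shows how
dotted metric names fall out of a depth-first walk.

```c
#include <stdio.h>
#include <stddef.h>

#define NS_PATH_MAX 256

/* Hypothetical namespace node; a node with no children is a metric. */
struct ns_node {
    const char *name;             /* one name component, e.g. "load" */
    const struct ns_node **kids;  /* child nodes, NULL for a leaf */
    size_t nkids;
};

typedef void (*ns_visit_fn)(const char *fullname, void *arg);

/* Recursive expansion: call visit() with the dotted name of each leaf. */
void ns_traverse(const struct ns_node *node, const char *prefix,
                 ns_visit_fn visit, void *arg)
{
    char path[NS_PATH_MAX];
    size_t i;

    if (prefix[0] != '\0')
        snprintf(path, sizeof(path), "%s.%s", prefix, node->name);
    else
        snprintf(path, sizeof(path), "%s", node->name);
    if (node->nkids == 0) {
        visit(path, arg);
        return;
    }
    for (i = 0; i < node->nkids; i++)
        ns_traverse(node->kids[i], path, visit, arg);
}

/* Example visitor: count the metrics seen (arg is a size_t *). */
void ns_count(const char *fullname, void *arg)
{
    (void)fullname;
    (*(size_t *)arg)++;
}

/* Example visitor: remember the last name seen (arg is a char buffer
 * of at least NS_PATH_MAX bytes). */
void ns_last(const char *fullname, void *arg)
{
    snprintf((char *)arg, NS_PATH_MAX, "%s", fullname);
}
```

One-level expansion corresponds to looking only at the `kids` array of
a single node instead of recursing.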

Once the client has the name(s) of the metrics of interest,
pmLookupName(3) returns PMIDs and then pmLookupDesc(3) will return this
descriptor for a metric:
        - unique PMID
        - value data type (32, U32, 64, U64, FLOAT, DOUBLE, STRING, AGGREGATE)
        - instance domain number or PM_INDOM_NULL if singular
        - value semantics (counter, instantaneous, discrete)
        - units (dimension and scale in the axes time, space and events)

Using the instance domain number from the metric descriptor, the
routines pmLookupInDom(3), pmGetInDom(3) and pmNameInDom(3) allow
the instance domain to be browsed and queried.

See also pmLookupText(3) and pmLookupInDomText(3) for help text about
metrics and instances.

To see all of the gory details, turn on PDU tracing and run some simple
pminfo(1) commands, e.g.

        $ pminfo -D PDU kernel.all.cpu
        $ pminfo -D PDU -fdT kernel.all.load

> - naming
> can a value be given a universally unique path identifier such as 
> "host=bar;process=89;thread=98;request_count"?
> i suppose this would be part of that protocol definition....

You could achieve something like this in 2 ways:

a) connect to PMCD on host bar, ask for the instance name
   "process=89;thread=98" of the metric named "request_count", or
b) have your PMDA export the metric named "request_count" with an instance
   name "host=bar;process=89;thread=98"

Instance names are totally controlled by each PMDA; the only requirements
are that the external names of the instances are unique and that the PMDA
associates a unique 32-bit number with each external name.
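
The two uniqueness requirements can be made concrete with a small
sketch in C.  The table type and function names are hypothetical, and
the sketch assumes non-negative ids so that -1 can mean "unknown"; it
is loosely analogous to name-to-number and number-to-name lookups, not
a copy of any PMDA's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical instance table entry kept by an agent. */
struct instance {
    int id;            /* internal number, unique within the domain */
    const char *name;  /* external name, unique within the domain */
};

/* Map an external name to its internal id; -1 if unknown. */
int inst_lookup(const struct instance *tab, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].id;
    return -1;
}

/* Map an internal id back to its external name; NULL if unknown. */
const char *inst_name(const struct instance *tab, size_t n, int id)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tab[i].id == id)
            return tab[i].name;
    return NULL;
}

/* Return 1 if both the ids and the names are unique, else 0. */
int inst_table_valid(const struct instance *tab, size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (tab[i].id == tab[j].id ||
                strcmp(tab[i].name, tab[j].name) == 0)
                return 0;
    return 1;
}
```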

> - strict definition of measured values
> concepts such as "free memory" and "pages of io" are essentially
> meaningless without very strict definitions of what is counted and how.
> The specifics vary from OS to OS, and sometimes from OS version to version.
> It appears there has been no effort in pcp to alleviated this chaos; the
> linux pmda just passes on the /proc information in all its ill-defined
> glory.

We did not start out with the charter to fix this.  Think of PCP as a
framework for transporting and archiving the chaos!

However we tend to convert things in the agents to sane units, so none
of this ticks and blocks nonsense, we report msec and Kbytes.
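
As a rough illustration of that kind of conversion, here is a minimal
sketch in C.  The tick rate of 100 per second and the 512-byte block
size are assumptions made for the example, not values stated in this
thread, and the function names are hypothetical.

```c
/*
 * Assumed raw units: 100 clock ticks per second and 512-byte blocks.
 * A real agent should query these from the system (for example with
 * sysconf(_SC_CLK_TCK)) rather than hard-code them.
 */
#define ASSUMED_TICKS_PER_SEC 100ULL
#define ASSUMED_BLOCK_BYTES   512ULL

/* Convert a tick count to milliseconds. */
unsigned long long ticks_to_msec(unsigned long long ticks)
{
    return ticks * 1000ULL / ASSUMED_TICKS_PER_SEC;
}

/* Convert a block count to Kbytes (1 Kbyte = 1024 bytes), rounding down. */
unsigned long long blocks_to_kbytes(unsigned long long blocks)
{
    return blocks * ASSUMED_BLOCK_BYTES / 1024ULL;
}
```

Doing the conversion once in the agent means every client sees the
same units, and the units recorded in the metric descriptor stay
truthful.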

The semantics in the metric descriptors and the help text are a
surprisingly big help.

Because the collection of metrics is dynamic and extensible, please
feel free to add new agents or add metrics to existing agents to help
improve the situation.

> - standards compliance
> pcp seems to use no standards, even derivatively, for anything:
> transport-level protocol, data and log file formats, metadata representation,
> names of particular measured values, the client or agent apis, or the name
> to number resolution.

No.  We did a BIG survey 7 years ago when this started, and all of the
competing options were lame in the extreme for our operational
environment (big machines, lots of optional bits, lots of evolutionary
change, data of interest hiding everywhere).

> that's all for now :).
>
> thanks...
>
> -mda

