i just ran into pcp while looking at another project on the oss.sgi page....
i've looked through the web site and through the downloads for some kind
of overview, and couldn't find any. i looked through the powerpoint slides
by ken mcdonell, and i've read "man PCPIntro" and "man PMAPI".
somewhere between the trees and the earth should be a forest....
the man pages mention a "Programmer's Guide" and a "Tutorial", but
perhaps those are only available as part of the commercial product?
so far as i can make out, the basic architecture consists of:
- console and GUI monitor clients that can subscribe to a real-time feed or an
archive feed
- a per-host pmcd daemon that brokers between the clients and agents
- pmda agent daemons that are per-host and per-namespace
- per-host pmlogger daemons that archive data locally, from pmcd to disk
but i'm not even sure of that much.
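to make that guess concrete, here is a toy python model of the layering as i currently understand it. every class, method, and metric name below is made up by me for illustration; none of it is the pcp api, and the real components may well divide the work differently.

```python
# toy model of my reading of the architecture; nothing here is pcp's real api.
from typing import Callable, Dict, List, Tuple


class Agent:
    """stands in for a pmda: owns one subtree of the namespace on one host."""

    def __init__(self, prefix: str, metrics: Dict[str, Callable[[], float]]):
        self.prefix = prefix
        self.metrics = metrics

    def fetch(self, name: str) -> float:
        return self.metrics[name]()


class Broker:
    """stands in for pmcd: routes each requested name to the owning agent."""

    def __init__(self) -> None:
        self.agents: List[Agent] = []

    def register(self, agent: Agent) -> None:
        self.agents.append(agent)

    def fetch(self, names: List[str]) -> Dict[str, float]:
        result: Dict[str, float] = {}
        for name in names:
            for agent in self.agents:
                if name.startswith(agent.prefix + "."):
                    result[name] = agent.fetch(name)
                    break
            else:
                # no registered agent claims this part of the namespace
                raise KeyError(name)
        return result


class Logger:
    """stands in for pmlogger: a broker client that appends samples to an archive."""

    def __init__(self, broker: Broker, names: List[str]) -> None:
        self.broker = broker
        self.names = names
        self.archive: List[Tuple[float, Dict[str, float]]] = []

    def sample(self, timestamp: float) -> None:
        self.archive.append((timestamp, self.broker.fetch(self.names)))


# hypothetical metric names, chosen only for the example
broker = Broker()
broker.register(Agent("kernel", {"kernel.load": lambda: 0.5}))
broker.register(Agent("mem", {"mem.free": lambda: 1024.0}))
logger = Logger(broker, ["kernel.load", "mem.free"])
logger.sample(0.0)
```

in this picture a monitor is just another caller of Broker.fetch, and an archive "feed" would be a replay of Logger.archive. whether that matches the real design is exactly what i'm asking.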
here are some of the basic parameters i would have hoped to have found
answers to and didn't, and what i've been able to divine so far:
- license.
http://oss.sgi.com/projects/pcp/license.html indicates this is (mostly) LGPL,
while the downloads indicate it is GPL (even the libraries). This is not a
trivial distinction.
- language support.
it appears that agents and clients have to be written in C.
- security
i see no provision for client or server authentication, provision for
encryption, integrity checks, etc.
- clocks
i see no provision for clock synchronization.
- query versus notify
i can find no protocol definition, so i can't tell whether monitors must query
for particular data, or whether they can subscribe to asynchronous
notifications.
- sampled vs. events
said another way, can a monitor ask for qualitative events (threshold passing),
instead of regularly sampled snapshots?
- connection vs. connectionless
i can find no protocol definition, so i can't tell whether it is stateful or
not, let alone what provisions it has for resumption after a connection loss.
nor can i tell whether it is message-per-row, message-per-request, or what.
nor whether the protocol allows pipelining, or multiple asynchronous requests,
etc.
nor whether it is the same protocol between monitor, pmcd, pmlogger, and pmda.
- agent-side computation
obviously a monitor can compute anything it likes.
but can a monitor request that an agent do some server-side computation before
sending the resulting data back, either across measurements (say, changing
units or adding together), or across time (running average, etc.)?
- agent-side filtering
similarly, what kinds of filters can a monitor request?
- fast localhost monitoring
is there a shared memory or similar mechanism for monitoring an application's
"counters" without the overhead of tcp/ip communication?
- triggers
i see no indication of what external integrations have been done for actions to
be taken based on various events (paging, email, etc.)
- agent collapsing of requests
if 10 monitors ask an agent for the same regularly sampled data, does it
measure it 10 times, or just once?
if i ask for both user and system time, will it be smart enough to do this in
one operation, not two?
- discovery
can an agent automatically discover a monitor? can a monitor automatically
discover an agent?
- metadata
i found some discussion of a name to oid mapping by pmns, but no definition of
how a monitor queries the schema or instances available from a particular
agent, nor if there is a way to get notified of schema evolution, or instance
addition/deletion.
nor can i find an explanation of what can be declared about a name besides its
type.
- naming
can a value be given a universally unique path identifier such as
"host=bar;process=89;thread=98;request_count"?
i suppose this would be part of that protocol definition....
- strict definition of measured values
concepts such as "free memory" and "pages of io" are essentially meaningless
without very strict definitions of what is counted and how. The specifics vary
from OS to OS, and sometimes from OS version to version. It appears there has
been no effort in pcp to alleviate this chaos; the linux pmda just passes on
the /proc information in all its ill-defined glory.
- standards compliance
pcp seems to use no standards, even derivatively, for anything:
transport-level protocol, data and log file formats, metadata representation,
names of particular measured values, the client or agent apis, or the name to
number resolution.
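going back to the "sampled vs. events" and "agent-side computation" items above, here is a rough python sketch of the two transformations i have in mind: a running average across time, and turning regular samples into threshold-crossing events. the function names and the (timestamp, value) sample shape are my own assumptions, not anything from pcp; the open question is whether an agent or pmcd can do this for me, or whether every monitor has to do it itself.

```python
# sketch only: what i'd like to be able to ask for, written as client-side code.
from collections import deque
from typing import Iterable, Iterator, Tuple

Sample = Tuple[float, float]  # (timestamp, value); shape assumed for illustration


def running_average(samples: Iterable[Sample], window: int) -> Iterator[Sample]:
    """yield (timestamp, mean of the most recent `window` values)."""
    recent = deque(maxlen=window)
    for ts, value in samples:
        recent.append(value)
        yield ts, sum(recent) / len(recent)


def threshold_events(
    samples: Iterable[Sample], threshold: float
) -> Iterator[Tuple[float, str]]:
    """yield (timestamp, "above" | "below") only when the value changes side."""
    state = None
    for ts, value in samples:
        new_state = "above" if value > threshold else "below"
        # the first sample only establishes the starting side; it is not an event
        if state is not None and new_state != state:
            yield ts, new_state
        state = new_state


samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 1.0)]
smoothed = list(running_average(samples, window=2))
events = list(threshold_events(smoothed, threshold=2.5))
```

with four raw samples the monitor here ends up caring about a single event, which is the kind of traffic reduction i'd hope a qualitative subscription could give.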
that's all for now :).
thanks...
-mda