Re: Performance Co-Pilot patch for Compaq's Tru64

To: <kenmcd@xxxxxxxxxxxxxxxxx>
Subject: Re: Performance Co-Pilot patch for Compaq's Tru64
From: Phillip Ezolt <ezolt@xxxxxxxxxxxxxxxx>
Date: Tue, 16 Oct 2001 13:57:03 -0400 (EDT)
Cc: <pcp@xxxxxxxxxxx>, "Stanley, Dave" <Dave.Stanley@xxxxxxxxxx>, Bill French <William.French@xxxxxxxxxx>, Mark Goodwin <markgw@xxxxxxxxxxxxxxxxx>
In-reply-to: <Pine.LNX.4.21.0110162153370.32521-100000@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: owner-pcp@xxxxxxxxxxx
Ken,

> Did all your questions get answered?
>

Yes, they did.  I've been quiet because I am working on the PMDA for
Tru64.

My main problems have been:

1) Figuring out what a metric means, and whether there is an equivalent
metric on Tru64.  (The text description from pminfo is a great help.)

2) Figuring out how to extract the performance metrics from Tru64.

It would be really nice if SGI published a list of recommended
performance metrics.  That would give porters a target of 20-50
metrics to aim for first.

> Mark, has the patch been rolled into the PCP open source code base?

I believe that somebody said that this was going to happen.

> And finally have you been able to use PCP on Tru64 in any serious
> performance analysis tasks, and if so, do you have any feedback,
> comments or suggestions?

I have not done anything major yet; I am still writing the PMDA for Tru64.

First, let me say:  I am VERY impressed with the Performance Co-Pilot.

It has many of the features that I believe are necessary for the "next
generation" of performance tools. (Network-aware, flexible metrics,
separation of clients and PMDAs)

Suggestions:
        1) It is wonderful that PCP has documentation on how to write a PMDA.
           It would be helpful to have a short two- or three-paragraph
           overview of how a piece of data is extracted from the operating
           system.  (From initialization of the PMDA to distribution to the
           clients.)

                (A state diagram would be wonderful! Actually ANY pictures
                would make it easier.)

           The documentation seems to delve into details too quickly.
           A general overview at first would make things much easier
           to understand.

        2) Make pmdumptext GPL.
           This seems to be the first program that anyone would
           rewrite. I can't imagine it contains any trade secrets, and
           GPLing it would let people see the benefits of PCP more
           quickly.

        3) Create a "suggested practices" document to describe what the
           convention should be for various things.  Let the
           programmer do as he/she pleases, but make suggestions.

           (Common metrics and how to describe hardware structure come
           to mind.)

        4) Talk to other Linux monitoring projects that are trying to
           reinvent PCP and stop them. ;-)  (ksysguard)

           I whole-heartedly believe that PCP should be the performance
           monitoring standard for Linux.  If everyone who is developing
           performance monitoring tools for different Linux systems started
           developing for PCP, everyone would benefit.

Questions:

        1) How does PCP export hardware STRUCTURE?  (Which drives are
           connected to which busses?)  I'm sure I could create my own metrics
           with this information, but I would like to use what is
           already there.

--Phil

Compaq:   High Performance Server Systems Quality & Performance Engineering
---------------- Alpha, The Fastest Processor on Earth --------------------
Phillip.Ezolt@xxxxxxxxxx                         Performance Tools/Analysis
------------------- See the results at www.spec.org -----------------------

On Tue, 16 Oct 2001 kenmcd@xxxxxxxxxxxxxxxxx wrote:

> Sorry Phil, Dave and Bill, this mail was misplaced in the bog of eternal
> stench (aka my inbox) and was only recently re-discovered ...
>
> With reference to the mail below and the subsequent follow-ups ...
>
> Did all your questions get answered?
>
> Mark, has the patch been rolled into the PCP open source code
> base?
>
> And finally have you been able to use PCP on Tru64 in any serious
> performance analysis tasks, and if so, do you have any feedback,
> comments or suggestions?
>
> Thanks, and apologies again for my tardiness.
>
> On Mon, 27 Aug 2001, Phillip Ezolt wrote:
>
> > Hi All,
> >
> >     I've patched the Performance Co-Pilot infrastructure to work
> > with Tru64.  All of the clients (except for pmstat) that I've tested work.
> > The included PMDAs all compile except for cisco and shping.
> >
> > Known issues:
> >     1) cisco and shping pmdas do not compile.  (Missing sys/prctl)
> >     2) No test for whether to use "hostname -f" or "hostname".
> >     3) Magic file format is not compatible with Tru64.
> >     4) Testing for the "runlevel" command is not done properly in
> >        the shell scripts.
> >     5) No Tru64 specific PMDA.
> >
> > Questions/Comments:
> >
> > 1) What pmda number should I use for Tru64? (Will 74 work?)
> >
> > 2) The memory values in the Linux pmda should be 64-bit, not 32-bit.
> >    Problems show up when a machine has more than 4 GB of memory.
> >
> > /* mem.util.used */
> >     { &proc_meminfo.mem[1],
> >       { PMDA_PMID(CLUSTER_MEMINFO,1), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_INSTANT,
> >       PMDA_PMUNITS(1,0,0,PM_SPACE_BYTE,0,0) }, },
> >
> >
> > Hopefully, the patch is self-explanatory.  I had to add some automake
> > checks for things that pcp incorrectly assumed.
> >
> > If you need me to test or explain anything, just tell me!
>
>

