Re: Building PCP on Red Hat Alpha

To: eroman@xxxxxxx
Subject: Re: Building PCP on Red Hat Alpha
From: markgw@xxxxxxxxxxxxxxxxxxxxxxxxx (Mark Goodwin)
Date: Fri, 11 Feb 2000 11:10:29 -0500
Cc: ptg@xxxxxxxxxxxxxxxxxxxxxxx, pcp@xxxxxxxxxxx
In-reply-to: Eric Roman <eroman@xxxxxxx> "Re: Building PCP on Red Hat Alpha" (Feb 10, 15:47)
References: <19991223144816.B31718@xxxxxxxxxxxxx> <eroman@xxxxxxx> <9912241027.ZM1125771@xxxxxxxxxxxxxxxxxxxxxxxxx> <20000107165916.A3252@xxxxxxxxxxxx> <10001171227.ZM2918@xxxxxxxxxxxxxxxxxxxxxxxxx> <20000118140640.A12888@xxxxxxxxxxxx> <10001190937.ZM50079@xxxxxxxxxxxxxxxxxxxxxxxxx> <10002101635.ZM22392@xxxxxxxxxxxxxxxxxxxxxxxxx> <20000210154746.A12640@xxxxxxxxxxxx>
Reply-to: markgw@xxxxxxx
Sender: owner-pcp@xxxxxxxxxxx
On Feb 10, 15:47, Eric Roman wrote:
> Subject: Re: Building PCP on Red Hat Alpha
> > Hi,
> >
> > I'm wondering if I could send you the PCP 2.1.4 SRPM for testing
> > on linux(alpha).  And if it works, build me an RPM!
>
> Well, can I get the RPM?  ;)

OK, it's coming later today.

>
> BTW,
>
> Can you tell me more about how you see PCP integrating with technical
> computing clusters?

PCP was designed from the ground up to scale to large technical compute
environments, such as those that SGI sells. The protocols are efficient.
See the attached image for an example of using PCP to monitor a large
installation (site name withheld). The image shows a GUI tool monitoring
1024 CPUs spread over a cluster of 9 hosts (it could have been 512
hosts, each with 2 CPUs, or whatever).

>
> I wouldn't mind writing a PMA for PBS or a Perl interface to your client
> library (to write other clients), but I need to show my boss that it's worth
> investing my time to do these things.

See my earlier mail for pointers to relevant documentation. A Perl interface
would be interesting. Nathan: have you already made any progress on this?

>
> I am particularly worried about PCP's ability to scale to larger numbers
> of systems.  I envision using PCP to monitor, say, disk traffic, message
> passing and CPU flops rates for 500 hosts running say a reconstruction
> code or fluid code.

PCP will handle that kind of environment (it has already done so at LANL
and other sites).

>
> So, do you think PCP can be at all useful for running _large_ clusters
> of Linux machines this way?

Certainly.

>
> Do you think PCP would be a useful framework for building other system-level
> services?  Say a fine-grained dynamic load-balancing framework for EP tasks?
>

It has already been used to monitor and trigger dynamic load balancing
in large clustered transaction/rdbms and httpd serving environments.
The pmie(1) tool is most useful for this.
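
As a rough illustration of the general idea only (periodically sampled
per-host values, a threshold test, and an action fired for each host that
trips it), here is a minimal Python sketch. It is not pmie syntax and does
not use the PCP client library; the function names, host names, sample
values, and threshold are all hypothetical.

```python
from typing import Callable, Dict, List


def hosts_over_threshold(samples: Dict[str, float], threshold: float) -> List[str]:
    """Return the hosts whose sampled value exceeds the threshold, sorted by name."""
    return sorted(host for host, value in samples.items() if value > threshold)


def evaluate_rule(
    samples: Dict[str, float],
    threshold: float,
    action: Callable[[str], None],
) -> List[str]:
    """Run the action once for every host over the threshold and return those hosts."""
    fired = hosts_over_threshold(samples, threshold)
    for host in fired:
        action(host)
    return fired


if __name__ == "__main__":
    # Hypothetical one-minute load averages collected from three hosts.
    samples = {"node01": 0.7, "node02": 5.2, "node03": 1.1}
    # Hypothetical action: a real deployment might notify a scheduler instead.
    evaluate_rule(samples, 4.0, lambda host: print(f"shift work away from {host}"))
```

In practice the sampling and rule evaluation would be left to the monitoring
tool itself; the sketch only shows the shape of a threshold-plus-action rule.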

    -- Mark

[Attachment: GIF image]