
Query on cluster measurement

To: pcp@xxxxxxxxxxx
Subject: Query on cluster measurement
From: Mark_H_Johnson@xxxxxxxxxxxx
Date: Fri, 3 Aug 2001 12:43:15 -0500
Sender: owner-pcp@xxxxxxxxxxx
We are looking at using PCP to collect performance measurements on our
cluster of PCs and have a few questions...

To set the stage, our network looks something like...


  Workstation(s)
   |   |   |  |
---------+----------
         |
      Head Node
         |
      Switch (private LAN)
         |
---------+----------
   |   |   |  |
  Compute Nodes
   |   |   |  |
  Other Equipment

The head node is NOT a router - workstations can't see the compute nodes
(nor the other equipment) with TCP/IP.

We would prefer to run the monitoring tools on one or more workstations. We
would prefer to run the agents on both the compute nodes and head node. We
would prefer to collect the data at the head node for distribution to the
workstations. [I think I got the terminology right...] All the machines are
running Linux, and we have PCP 2.2.1 downloaded and installed on all of the
machines that will be doing this.

(1) In a few places, the documentation says that the collector works with
local agents. But in the man page for pmcd(1), it indicates that socket
connections are supported. Is there some way we can gather key data items
from the compute node, send them to the head node [socket connection?] &
include them in the head node's name space? If not, do you have suggestions
for implementing such a capability?

(2) In lieu of an elegant solution to (1) - could we use a remote shell to
the compute nodes, use pminfo to dump the data, and import it with the ASCII
interface to pmcd?
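
To make (2) concrete, here is a rough sketch of the kind of thing we have in mind, not something we have tested against PCP. It assumes that "pminfo -f" prints a metric name on an unindented line followed by indented "value ..." or "inst [N or "name"] value ..." lines; the function names, the use of rsh, and the host and metric arguments are all hypothetical.

```python
import re
import subprocess

# Assumed layout of "pminfo -f" output (not verified against PCP 2.2.1):
#   metric.name
#       value 123
#   or, for metrics with instances:
#       inst [0 or "cpu0"] value 123
INST_RE = re.compile(r'^\s+inst \[(\d+) or "([^"]*)"\] value (.+)$')
VALUE_RE = re.compile(r'^\s+value (.+)$')


def parse_pminfo_dump(text):
    """Return {metric: {instance_name_or_None: value_string}}."""
    metrics = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new metric.
            current = line.strip()
            metrics[current] = {}
            continue
        if current is None:
            continue
        m = INST_RE.match(line)
        if m:
            metrics[current][m.group(2)] = m.group(3).strip()
            continue
        m = VALUE_RE.match(line)
        if m:
            # Singular metric: no instance name, so key it by None.
            metrics[current][None] = m.group(1).strip()
        # Any other indented line (e.g. an error message) is ignored.
    return metrics


def fetch_from_node(host, metric_names):
    """Hypothetical helper: run pminfo on a compute node via rsh."""
    cmd = ["rsh", host, "pminfo", "-f"] + list(metric_names)
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_pminfo_dump(out.stdout)
```

The head node would call something like fetch_from_node() for each compute node on a timer and hand the parsed values to whatever agent exports them. Values are left as strings because the sketch does not know each metric's type.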

(3) We want to measure data transfer rates to the other equipment. We were
looking at getting data out of /proc, but we have function interfaces
available as well. Should we just filter the /proc output similar to that
done by the Linux agent or use code instead?
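
As an illustration of the /proc filtering option in (3), here is a minimal sketch. It assumes, purely for the example, that the other equipment is reached through a network interface whose byte counters appear in /proc/net/dev (two header lines, then one "name: counters" line per interface with 16 counters, receive bytes first and transmit bytes ninth). The function names are hypothetical, and counter wrap-around is not handled.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev text."""
    counters = {}
    # The first two lines are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        # Splitting on ":" also copes with lines like "eth0:1234 ..."
        # where there is no space after the colon.
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def transfer_rates(before, after, interval_s):
    """Return {interface: (rx_bytes_per_s, tx_bytes_per_s)}.

    'before' and 'after' are results of parse_net_dev() taken
    interval_s seconds apart. Counter wrap is not handled.
    """
    rates = {}
    for name, (rx1, tx1) in after.items():
        if name not in before:
            continue
        rx0, tx0 = before[name]
        rates[name] = ((rx1 - rx0) / interval_s, (tx1 - tx0) / interval_s)
    return rates


def read_net_dev(path="/proc/net/dev"):
    """Read and parse the live counters on a Linux machine."""
    with open(path) as f:
        return parse_net_dev(f.read())
```

Taking two read_net_dev() samples a known interval apart and passing them to transfer_rates() gives bytes per second per interface. The same pattern would apply to another /proc file, with only the parsing step changed.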

(4) Was there additional work done in ACE (Advanced Cluster Environment)
that may have implemented this already? If so, who should we contact at SGI
for more information?

Thanks.
--Mark H Johnson
  <mailto:Mark_H_Johnson@xxxxxxxxxxxx>
