Re: PCP graphical interface

To: Michal Kara <lemming@xxxxxxxxxxxxxxxxxxx>
Subject: Re: PCP graphical interface
From: Ken McDonell <kenmcd@xxxxxxxxxxxxxxxxx>
Date: Thu, 13 Apr 2000 06:18:44 +1000
Cc: pcp@xxxxxxxxxxx
In-reply-to: <20000412141938.62918@arthur.plbohnice.cz>
Sender: owner-pcp@xxxxxxxxxxx

On Wed, 12 Apr 2000, Michal Kara wrote:

>   Hello!
> 
>       I have made another few improvements in the PCPMON and put up a web
> page for it. You can find it at
> http://k332.feld.cvut.cz/~lemming/projects/pcpmon.html.
> Please let me know how do you like it.

This is a good initial effort at a GPL'd PCP stripchart monitor.

The following comments are all intended to be constructive and to
encourage the development of pcpmon and any other similar monitoring
tools over the PCP protocols ...

 0. You probably want to consider changing the basic operational model
    for how you present the metrics in the Values setup dialog ... in
    addition to the issues of instance "numbers" vs instance "names" and
    memory footprint you've identified, you should consider ...

    a) the namespace can be much larger, e.g. on my local Irix server
         $ pminfo | wc -l
                   1552
    b) the number of instances can be even larger, e.g. on the same
       local Irix server
         $ pminfo -F | grep ' value ' | wc -l
                  27911
    c) the namespace can be different on different hosts
    d) the instances are expected to be different on different hosts
    e) metrics with the same name on different hosts do not necessarily
       have the same metric id, e.g. bruce is a Linux system, snort is
       an Irix system ...
         $ pminfo -m -h bruce kernel.all.load
         kernel.all.load PMID: 60.2.0
         $ pminfo -m -h snort kernel.all.load
         kernel.all.load PMID: 1.18.3
    f) instances with the same name on different hosts do not necessarily
       have the same instance number
    g) the namespace and instances change over time even on the same host

    All of the above arise from the fundamental architectural decision
    that in PCP it is the PMCD alone (or rather in concert with the local
    PMDAs) that decides what metrics and instances are available.  This
    breaks the tyranny of the SNMP MIB, forces the protocols to support the
    necessary name and metadata services, and allows new sources of metrics
    to be added on any host without breaking anything else.
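
    As a minimal sketch of what points e) and f) mean for a client, the
    table below is keyed by host and metric name together, never by name
    alone.  The struct and the function find_binding() are hypothetical
    names used only for illustration; the two rows are the pminfo -m
    results quoted above.  A real client would presumably fill such a
    table per connection, for example with pmLookupName(3).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one metric as seen on one host. */
struct binding {
    const char *host;
    const char *name;
    unsigned domain, cluster, item;   /* the three fields of the PMID */
};

/* The two rows come from the pminfo -m output quoted above. */
static const struct binding bindings[] = {
    { "bruce", "kernel.all.load", 60, 2, 0 },
    { "snort", "kernel.all.load", 1, 18, 3 },
};

/* Look up by (host, name); returns NULL when that host has no such metric. */
static const struct binding *find_binding(const char *host, const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(bindings) / sizeof(bindings[0]); i++) {
        if (strcmp(bindings[i].host, host) == 0 &&
            strcmp(bindings[i].name, name) == 0)
            return &bindings[i];
    }
    return NULL;
}
```

    The same keying applies to instances: an instance name has to be
    resolved to its number separately on each host.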

    I'd suggest a more scalable solution for the metric and instance
    selection might be a dynamic tree-selector where:

    a) only the top-level metrics are displayed by default, e.g. on
       my local Linux system this would be:

         +   disk
         +   filesys
         +   hinv
         +   kernel
         +   mem
         +   mpi
         +   network
         +   nfs
         +   pmcd
         +   proc
         +   rpc
         +   swap
         +   swapdev
         +   web

       Use pmGetChildren(3) or pmGetChildrenStatus(3) here.

    b) let the user expand any of the non-leaf entries, e.g.

         --| filesys
           +-  capacity
           +-  used
           +-  free
           +-  maxfiles
           +-  ...

       Use pmGetChildren(3) or pmGetChildrenStatus(3) again here.

    c) and finally let the user expand a leaf node into instances, e.g.

        --| filesys
          +-  capacity
          +-  used
          --| free
            * /dev/root
            * /dev/sda6
            * /dev/sda5
            * /dev/sdb1
            * ...

    This would have a smaller memory footprint and be easier for the
    user to navigate.  Instances are enumerated only when needed, so it
    accommodates dynamic instance domains, and the dialog can be
    rebuilt when you connect to a new server (with a new namespace).
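
    The expand-on-demand idea in steps a) and b) can be sketched as
    follows.  Everything here is an assumption made for illustration:
    struct node, expand() and the children_fn callback are hypothetical,
    and demo_children() is a tiny stand-in namespace so the sketch runs
    without a PMCD.  In a real client the callback would presumably wrap
    pmGetChildrenStatus(3).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node in the selector tree. */
struct node {
    char *name;          /* full metric name, "" for the root */
    int is_leaf;         /* a leaf would expand into instances instead */
    int expanded;        /* set once the children have been fetched */
    int nchild;
    struct node *child;
};

/* Hypothetical callback: reports the children of one name, returns the count. */
typedef int (*children_fn)(const char *name, const char ***kids,
                           const int **leaf);

/* Fetch the children of n the first time the user opens it, not before. */
static int expand(struct node *n, children_fn get_children)
{
    const char **kids = NULL;
    const int *leaf = NULL;
    int i, count;

    if (n->is_leaf || n->expanded)
        return n->nchild;
    count = get_children(n->name, &kids, &leaf);
    if (count < 0)
        return count;
    if (count > 0) {
        n->child = calloc((size_t)count, sizeof(*n->child));
        if (n->child == NULL)
            return -1;
    }
    for (i = 0; i < count; i++) {
        size_t len = strlen(n->name) + strlen(kids[i]) + 2;
        char *full = malloc(len);

        if (full == NULL)
            return -1;   /* sketch only: no cleanup of the partial result */
        if (n->name[0] == '\0')
            snprintf(full, len, "%s", kids[i]);
        else
            snprintf(full, len, "%s.%s", n->name, kids[i]);
        n->child[i].name = full;
        n->child[i].is_leaf = leaf[i];
        n->nchild = i + 1;
    }
    n->expanded = 1;
    return n->nchild;
}

/* Stand-in namespace with a few of the names shown above. */
static int demo_children(const char *name, const char ***kids,
                         const int **leaf)
{
    static const char *top[] = { "filesys", "kernel" };
    static const int top_leaf[] = { 0, 0 };
    static const char *fs[] = { "capacity", "used", "free" };
    static const int fs_leaf[] = { 1, 1, 1 };

    if (strcmp(name, "") == 0) {
        *kids = top;
        *leaf = top_leaf;
        return 2;
    }
    if (strcmp(name, "filesys") == 0) {
        *kids = fs;
        *leaf = fs_leaf;
        return 3;
    }
    return 0;
}
```

    Calling expand() on the root fetches only the top level; the
    children of filesys are fetched when that node is opened, and a
    second call on an already expanded node does no further lookups.
    Rebuilding the dialog for a new server amounts to discarding the
    tree and starting again from a fresh root.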

 1. I don't think your scaling model is going to work ... as soon as
    you have 2 metrics in the same chart you have 2 choices:
    a) auto-scale to a single y axis, or
    b) independently scale each plot and (optionally) annotate the
       left and right axes with the two scales.

    We chose a).  You've opted for b).  I think b) is less desirable
    because
         - the visual message is wrong (two values with a similar y
           value in the chart _should_ have the same _real_ value for
           the underlying metrics), and
         - this model does not scale beyond 2 plots in the same chart
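
    A minimal sketch of option a), assuming non-negative sample values:
    take the largest value over every plot in the chart, round it up to
    a tidy limit, and map all plots against that one limit.  The names
    struct series, chart_max(), nice_ceiling() and axis_fraction() are
    hypothetical, and the 1-2-5 rounding is just one common choice, not
    a description of any existing tool.

```c
#include <stddef.h>

/* Hypothetical: the visible samples of one plot. */
struct series {
    const double *value;
    size_t n;
};

/* Largest sample over all plots in the chart (values assumed >= 0). */
static double chart_max(const struct series *s, size_t nseries)
{
    double max = 0.0;
    size_t i, j;

    for (i = 0; i < nseries; i++)
        for (j = 0; j < s[i].n; j++)
            if (s[i].value[j] > max)
                max = s[i].value[j];
    return max;
}

/* Round up to 1, 2 or 5 times a power of ten to get a tidy axis limit. */
static double nice_ceiling(double x)
{
    double base = 1.0;

    if (x <= 0.0)
        return 1.0;              /* empty or all-zero chart */
    while (base * 10.0 <= x)
        base *= 10.0;
    while (base > x)
        base /= 10.0;
    if (x <= base)
        return base;
    if (x <= 2.0 * base)
        return 2.0 * base;
    if (x <= 5.0 * base)
        return 5.0 * base;
    return 10.0 * base;
}

/* Height of a value as a fraction of the shared y axis. */
static double axis_fraction(double value, double limit)
{
    return value / limit;
}
```

    Because every plot is divided by the same limit, two plots drawn at
    the same height carry the same real value, and adding a third or
    fourth plot changes nothing except possibly the limit.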
        
 2. PCP metrics have metadata to describe the dimension and scale of the
    values.  We've found it important to use this to prevent "silly"
    combinations of metrics in the same chart, and to annotate the graph
    axes.
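
    One way to apply this is sketched below, assuming the check is made
    when a metric is added to a chart.  struct units is a hypothetical
    stand-in that copies only the three dimension fields found in the
    pmUnits part of a metric descriptor, and same_dimension() and
    chart_accepts() are illustrative names.

```c
#include <stddef.h>

/* Hypothetical stand-in for the dimension part of a metric's units. */
struct units {
    int dimSpace;    /* e.g.  1 for bytes */
    int dimTime;     /* e.g. -1 for "per second" */
    int dimCount;    /* e.g.  1 for a count of events */
};

/* Two metrics are comparable only if all three dimensions match. */
static int same_dimension(const struct units *a, const struct units *b)
{
    return a->dimSpace == b->dimSpace &&
           a->dimTime == b->dimTime &&
           a->dimCount == b->dimCount;
}

/* Accept a candidate only if it matches every metric already charted. */
static int chart_accepts(const struct units *charted, size_t n,
                         const struct units *candidate)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (!same_dimension(&charted[i], candidate))
            return 0;
    return 1;
}
```

    Matching dimensions do not imply matching scales (bytes versus
    Kbytes, say), so values may still need converting to a common scale
    before plotting; pmConvScale(3) is the likely tool for that.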

 3. I like the expression evaluator ... this is a special case of something
    called a "derived metric" that we've been fighting with for a long
    time ... the semantics of derived metrics are very tricky if you
    want to restrict the user to only "sensible" combinations ... if
    that is not a concern, then go for it.

 4. I had to upgrade libxml and libxml-devel from 1.0.0 to 1.8.7
    before pcpmon would link ... the unresolved symbol was
    xmlSubstituteEntitiesDefault ... you may want to check the
    dependencies, either in the configure script or your web page.

 5. And finally, a couple of trivial patches to remove some compilation
    warnings:

[kenmcd@bozo-pc src]$ diff -c file.c.orig file.c     
*** file.c.orig Thu Apr 13 16:06:57 2000
--- file.c      Thu Apr 13 16:11:04 2000
***************
*** 8,13 ****
--- 8,14 ----
  #include <gtk/gtkfilesel.h>
  #include <limits.h>
  #include <errno.h>
+ #include <string.h>
  #include <strings.h>
  #include <gnome-xml/parser.h>
  
[kenmcd@bozo-pc src]$ diff -c values.c.orig values.c
*** values.c.orig       Thu Apr 13 16:06:33 2000
--- values.c    Thu Apr 13 16:09:59 2000
***************
*** 7,12 ****
--- 7,13 ----
  #include <stdlib.h>
  #include <string.h>
  #include <stdio.h>
+ #include <math.h>
  #include <gtk/gtkclist.h>
  #include <gtk/gtkspinbutton.h>
  #include <gtk/gtkentry.h>

Keep up the good work.
