On Wed, 12 Apr 2000, Michal Kara wrote:
> Hello!
>
> I have made another few improvements in the PCPMON and put up a web
> page for it. You can find it at
> http://k332.feld.cvut.cz/~lemming/projects/pcpmon.html.
> Please let me know how do you like it.
This is a good initial effort at a GPL'd PCP stripchart monitor.
The following comments are all intended to be constructive to
encourage the development of pcpmon and any other similar monitoring
tools over the PCP protocols ...
0. You probably want to consider changing the basic operational model
for how you present the metrics in the Values setup dialog ... in
addition to the issues of instance "numbers" vs instance "names" and
memory footprint you've identified, you should consider ...
a) the namespace can be much larger, e.g. on my local Irix server
$ pminfo | wc -l
1552
b) the number of instances can be even larger, e.g. on the same
local Irix server
$ pminfo -F | grep ' value ' | wc -l
27911
c) the namespace can be different on different hosts
d) the instances are expected to be different on different hosts
e) metrics with the same name on different hosts do not necessarily
have the same metric id, e.g. bruce is a Linux system, snort is
an Irix system ...
$ pminfo -m -h bruce kernel.all.load
kernel.all.load PMID: 60.2.0
$ pminfo -m -h snort kernel.all.load
kernel.all.load PMID: 1.18.3
f) instances with the same name on different hosts do not necessarily
have the same instance number
g) the namespace and instances change over time even on the same host
All of the above arise from the fundamental architectural decision
that in PCP it is the PMCD alone (or rather in concert with the local
PMDAs) that decides what metrics and instances are available. This
breaks the tyranny of the SNMP MIB, forces the protocols to support the
necessary name and meta data services and allows new sources of metrics
to be added on any host without breaking anything else.
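One practical consequence of e) and f), offered as a rough sketch and not as anything taken from pcpmon: keep metric _names_ in any saved setup and resolve them to PMIDs separately for each host at connect time. The struct, the table and resolve_pmid() below are hypothetical stand-ins for a real per-host lookup such as pmLookupName(3); the two PMIDs are the ones shown above for bruce and snort.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical PMID triple, as printed by pminfo -m (domain.cluster.item). */
struct pmid3 { int domain, cluster, item; };

/* Stand-in for the per-host name service. A real client would ask that
 * host's PMCD (e.g. via pmLookupName(3)) instead of consulting a table. */
struct ns_entry { const char *host; const char *name; struct pmid3 id; };

static const struct ns_entry fake_ns[] = {
    { "bruce", "kernel.all.load", { 60, 2, 0 } },  /* Linux host, from above */
    { "snort", "kernel.all.load", { 1, 18, 3 } },  /* Irix host, from above */
};

/* Resolve a metric name on one host.
 * Returns 0 on success, -1 if the name is not in that host's namespace. */
int resolve_pmid(const char *host, const char *name, struct pmid3 *out)
{
    size_t i;

    assert(host != NULL && name != NULL && out != NULL);
    for (i = 0; i < sizeof(fake_ns) / sizeof(fake_ns[0]); i++) {
        if (strcmp(fake_ns[i].host, host) == 0 &&
            strcmp(fake_ns[i].name, name) == 0) {
            *out = fake_ns[i].id;
            return 0;
        }
    }
    return -1;
}

/* Format a PMID triple as "domain.cluster.item". */
void pmid_str(const struct pmid3 *id, char *buf, size_t len)
{
    snprintf(buf, len, "%d.%d.%d", id->domain, id->cluster, id->item);
}
```

The same name resolves to a different id on each host, so nothing keyed on the PMID (or on an instance number) should be carried from one connection to another.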
I'd suggest a more scalable solution for the metric and instance
selection might be a dynamic tree-selector where:
a) only the top-level metrics are displayed by default, e.g. on
my local Linux system this would be:
+ disk
+ filesys
+ hinv
+ kernel
+ mem
+ mpi
+ network
+ nfs
+ pmcd
+ proc
+ rpc
+ swap
+ swapdev
+ web
Use pmGetChildren(3) or pmGetChildrenStatus(3) here.
b) let the user expand any of the non-leaf entries, e.g.
--| filesys
    +- capacity
    +- used
    +- free
    +- maxfiles
    +- ...
Use pmGetChildren(3) or pmGetChildrenStatus(3) again here.
c) and finally let the user expand a leaf node into instances, e.g.
--| filesys
    +- capacity
    +- used
    --| free
        * /dev/root
        * /dev/sda6
        * /dev/sda5
        * /dev/sdb1
        * ...
This would have a smaller memory footprint and be easier for the user
to navigate. The instances are enumerated only when needed, so it
accommodates dynamic instance domains, and the dialog can be
rebuilt when you connect to a new server (with a new namespace).
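To make the "enumerate only when needed" idea concrete, here is a minimal sketch of a lazily expanded tree node. Everything in it is an assumption for illustration: struct node, the enum_fn callback shape and the toy namespace are hypothetical, and in a real client the callback is where pmGetChildren(3) or pmGetChildrenStatus(3) would be called.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One entry in the selector tree; children are fetched on first expand. */
struct node {
    char *path;          /* full dotted name, "" for the root */
    int expanded;        /* non-zero once children have been fetched */
    int nchild;
    struct node *child;
};

/* Hypothetical callback: list the direct children of a path and return
 * how many there are (0 for a leaf). */
typedef int (*enum_fn)(const char *path, const char ***names);

/* Toy namespace standing in for a live PMCD. */
int toy_children(const char *path, const char ***names)
{
    static const char *top[] = { "disk", "filesys", "kernel" };
    static const char *fs[] = { "capacity", "used", "free", "maxfiles" };

    if (path[0] == '\0') { *names = top; return 3; }
    if (strcmp(path, "filesys") == 0) { *names = fs; return 4; }
    return 0;
}

int fetch_calls;         /* counts enumerations, to show the laziness */

/* Expand a node once; later calls reuse the cached children.
 * Returns the child count, or -1 on failure. */
int expand(struct node *n, enum_fn list)
{
    const char **names = NULL;
    int i, cnt;

    assert(n != NULL && list != NULL);
    if (n->expanded)
        return n->nchild;

    fetch_calls++;
    cnt = list(n->path, &names);
    if (cnt < 0)
        return -1;
    if (cnt > 0) {
        n->child = calloc((size_t)cnt, sizeof(*n->child));
        if (n->child == NULL)
            return -1;
        for (i = 0; i < cnt; i++) {
            size_t len = strlen(n->path) + strlen(names[i]) + 2;
            char *p = malloc(len);
            if (p == NULL) {             /* undo the partial expansion */
                while (i-- > 0)
                    free(n->child[i].path);
                free(n->child);
                n->child = NULL;
                return -1;
            }
            if (n->path[0] == '\0')
                snprintf(p, len, "%s", names[i]);
            else
                snprintf(p, len, "%s.%s", n->path, names[i]);
            n->child[i].path = p;
        }
    }
    n->nchild = cnt;
    n->expanded = 1;
    return cnt;
}
```

Only the nodes the user actually opens cost a round trip, and throwing the tree away on reconnect is cheap. The same callback shape could list the instances of a leaf (for example via pmGetInDom(3)) instead of child names.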
1. I don't think your scaling model is going to work ... as soon as
you have 2 metrics in the same chart you have 2 choices:
a) auto-scale to a single y axis, or
b) independently scale each plot and (optionally) annotate the
left and right axes with the two scales.
We chose a). You've opted for b). I think b) is less desirable
because
- the visual message is wrong (two values with a similar y
value in the chart _should_ have the same _real_ value for
the underlying metrics), and
- this model does not scale beyond 2 plots in the same chart
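For what it is worth, option a) can be sketched in a few lines. This is an illustration only, not the code from our tools: the function names are made up, and the 1-2-5 rounding of the axis top is just one common choice.

```c
#include <assert.h>
#include <stddef.h>

/* Round v up to 1, 2 or 5 times a power of ten, for a tidy axis top. */
double nice_ceiling(double v)
{
    double mag = 1.0, frac;

    if (v <= 0.0)
        return 1.0;
    while (mag * 10.0 <= v)    /* find the power of ten at or below v */
        mag *= 10.0;
    while (mag > v)
        mag /= 10.0;
    frac = v / mag;
    if (frac <= 1.0) return mag;
    if (frac <= 2.0) return 2.0 * mag;
    if (frac <= 5.0) return 5.0 * mag;
    return 10.0 * mag;
}

/* One axis for the whole chart: take the maximum over every plot. */
double shared_axis_top(const double *const *series, const size_t *len,
                       size_t nplots)
{
    double max = 0.0;
    size_t p, i;

    for (p = 0; p < nplots; p++)
        for (i = 0; i < len[p]; i++)
            if (series[p][i] > max)
                max = series[p][i];
    return nice_ceiling(max);
}

/* Map a value to a pixel row, 0 at the bottom of the chart. */
int y_pixel(double value, double axis_top, int height)
{
    assert(axis_top > 0.0 && height > 0);
    if (value < 0.0) value = 0.0;
    if (value > axis_top) value = axis_top;
    return (int)(value / axis_top * (height - 1) + 0.5);
}
```

Because every plot goes through the same y_pixel() with the same axis_top, two points at the same height really do have the same value, and adding a third or tenth plot changes nothing but the maximum.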
2. PCP metrics have metadata to describe the dimension and scale of the
values. We've found it important to use this to prevent "silly"
combinations of metrics in the same chart, and to annotate the graph
axes.
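A minimal sketch of that kind of check follows, assuming the dimension of each metric has already been fetched (in a real client, from the descriptor returned by pmLookupDesc(3)). The struct dims type and both functions are hypothetical simplifications that keep only the space, time and count exponents and ignore scale.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified dimension of a metric: exponents for
 * space, time and count (e.g. bytes/sec is space 1, time -1). */
struct dims { int space, time, count; };

/* Two metrics are comparable only if all three exponents match. */
int same_dimension(const struct dims *a, const struct dims *b)
{
    assert(a != NULL && b != NULL);
    return a->space == b->space &&
           a->time == b->time &&
           a->count == b->count;
}

/* Would adding `candidate` to a chart that already holds `nplots`
 * metrics be sensible? An empty chart accepts anything; otherwise the
 * candidate must match the dimension of the first plot. */
int chart_accepts(const struct dims *chart, size_t nplots,
                  const struct dims *candidate)
{
    assert(candidate != NULL);
    if (nplots == 0)
        return 1;
    assert(chart != NULL);
    return same_dimension(&chart[0], candidate);
}
```

A check like this would refuse to put, say, a capacity in Kbytes on the same chart as a rate in bytes/sec. Once the dimensions agree, the scale part of the metadata can be used to convert values to common units and to label the axis.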
3. I like the expression evaluator ... this is a special case of something
called a "derived metric" that we've been fighting with for a long
time ... the semantics of derived metrics are very tricky if you
want to restrict the user to only "sensible" combinations ... if
that is not a concern, then go for it.
4. I had to upgrade libxml and libxml-devel from 1.0.0 to 1.8.7
before pcpmon would link ... the unresolved symbol was
xmlSubstituteEntitiesDefault ... you may want to document the
minimum libxml version, either in the configure script or on your
web page.
5. And finally, a couple of trivial patches to remove some compilation
warnings:
[kenmcd@bozo-pc src]$ diff -c file.c.orig file.c
*** file.c.orig Thu Apr 13 16:06:57 2000
--- file.c Thu Apr 13 16:11:04 2000
***************
*** 8,13 ****
--- 8,14 ----
  #include <gtk/gtkfilesel.h>
  #include <limits.h>
  #include <errno.h>
+ #include <string.h>
  #include <strings.h>
  #include <gnome-xml/parser.h>
[kenmcd@bozo-pc src]$ diff -c values.c.orig values.c
*** values.c.orig Thu Apr 13 16:06:33 2000
--- values.c Thu Apr 13 16:09:59 2000
***************
*** 7,12 ****
--- 7,13 ----
  #include <stdlib.h>
  #include <string.h>
  #include <stdio.h>
+ #include <math.h>
  #include <gtk/gtkclist.h>
  #include <gtk/gtkspinbutton.h>
  #include <gtk/gtkentry.h>
Keep up the good work.