On Wed, 17 May 2000, Steve Daniels wrote:
> Ken,
> I think this is a simple question that you can answer off the top
> of your head. I am trying to write a pmie rule that will notify the
> admin team at XYZ Corp when a process gets out of hand with regard to
> memory consumption. Periodically, they have a couple of processes that
> consume the entire memory on the machine and we would like to catch
> them before the machine starts to swap a great deal. So, I am testing on
> an O2 and don't understand why these two rules don't seem to work:
>
> delta = 1 min;
> memoryhog =
> some_inst
> ( proc.memory.virtual.dat > 32 Mbyte )
> -> syslog 2 min "%i is consuming %v memory";
>
> memoryholder =
> some_inst
> ( proc.memory.virtual.bss > 32 Mbyte )
> -> syslog 2 min "%i is holding %v uninitialized memory";
>
> I use memclaim to grab 40 Mbytes of memory and check with
> pmval -i <memclaim pid> proc.memory.virtual.bss to verify that
> memoryholder should be satisfied. pmval does show that
> proc.memory.virtual.bss = 40 Mbytes, but pmie never fires.
Steve, firstly sincere apologies ... I was away and then swamped when
I got back.
In Irix, pmie is never going to be able to do this ... it is not a
pmie problem but rather a proc PMDA issue ... right from the outset,
I decided (and it has been argued that this was a mistake) that fetching
metrics for _all_ the processes on a regular basis was not likely to be
helpful, and certainly could be expensive. So pmie, like any other
PCP client, can fetch metrics for selected processes, but not for
_all_ processes ... to see what happens, try
$ pminfo -f proc.memory.virtual.dat
and compare with
$ pminfo -F proc.memory.virtual.dat
In the Linux implementation this restriction was relaxed, but fewer
metrics are available from the "proc" group. In fact, in Linux the
comparable rule might be
some_inst
proc.memory.size > 4 Mbyte
-> print "bingo:" " [%i] %v";
And starting and stopping your friendly web browser seems to prove it
actually works ...
bash$ pmie -t 30 </tmp/steve.pmie
Tue Jun 6 07:46:15 2000: bingo: [000984 /usr/X11R6/bin/Xwrapper] 5591040
Tue Jun 6 07:47:15 2000: bingo: [000984 /usr/X11R6/bin/Xwrapper] 5361664
Tue Jun 6 07:49:45 2000: bingo: [000984 /usr/X11R6/bin/Xwrapper] 5148672
[007170 /usr/lib/netscape/netscape-communicator] 10854400
Tue Jun 6 07:50:15 2000: bingo: [000984 /usr/X11R6/bin/Xwrapper] 5148672
[007170 /usr/lib/netscape/netscape-communicator] 10854400
Tue Jun 6 07:51:15 2000: bingo: [000984 /usr/X11R6/bin/Xwrapper] 5283840
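For what it is worth, a Linux version of the original syslog rule might look
roughly like the sketch below. It is untested and simply combines the
proc.memory.size test above with the syslog action from the original rules;
the rule name linux_memoryhog is made up for illustration, and the 32 Mbyte
threshold is carried over from the Irix rules rather than tuned for Linux.

```
delta = 1 min;

// hypothetical rule name; threshold borrowed from the original Irix rules
linux_memoryhog =
    some_inst
        ( proc.memory.size > 32 Mbyte )
    -> syslog 2 min "%i is consuming %v memory";
```

As in the print example, %i expands to the matching process instance and %v
to its value.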
Back to Irix ...
> Further, pmie -d shows that all the instances including memclaim
> are in the evaluation test. So, what am I missing? Should I be using
> the hotproc PMDA to do this?
Ah, now this is a bit tricky ... the -d option to pmie is pretty strange
in that it does not use the regular fetch scheduling path. A more
accurate check of the non-debug behaviour would be using -v and -D,
as in:
masala 9% pmie -v -Dfetch < /tmp/steve.pmie
pmFetch returns ...
pmResult dump from 0x100784e0 timestamp: 960134418.114805 09:00:18.114 numpmid: 2
3.5.3 (proc.memory.virtual.dat): Explicit instance identifier(s) required
3.5.5 (proc.memory.virtual.bss): Explicit instance identifier(s) required
memoryhog: ?
memoryholder: ?
pmFetch returns ...
pmResult dump from 0x100784e0 timestamp: 960134428.122448 09:00:28.122 numpmid: 2
3.5.3 (proc.memory.virtual.dat): Explicit instance identifier(s) required
3.5.5 (proc.memory.virtual.bss): Explicit instance identifier(s) required
memoryhog: ?
memoryholder: ?
Note the warnings/errors from the pmFetch, and the expression value is
? (not true or false) because there are no values to be used in the
evaluation.
The right tool here (for Irix) is indeed hotproc ... I configured
it thusly ...
masala 5# pminfo -f hotproc.nprocs
hotproc.nprocs
value 4
masala 6# pminfo -f hotproc.control
hotproc.control.refresh
value 60
hotproc.control.config
value "(virtualsize > 32768.000000)"
hotproc.control.config_gen
value 2
And then rewrote your rules to use hotproc.* in lieu of proc.*, and ...
masala 16% pmie -v < /tmp/steve.pmie
memoryhog: true
memoryholder: false
masala 17% tail -1 /var/adm/SYSLOG
Jun 4 09:09:29 5D:masala pcp-pmie[19354]: 0000001280 /usr/bin/X11/xdm is consuming 34852864 memory0000001840 vmail is consuming 35123200 memory
and pmem confirms that vmail and xdm are the only two candidates on this
system.
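
For reference, the rewritten rules would look something like the sketch
below. Treat it as a reconstruction, not the exact file used: it assumes the
hotproc metric names mirror the proc names one-for-one (which is what using
hotproc.* "in lieu of" proc.* implies), so check the names with pminfo hotproc
before relying on them.

```
delta = 1 min;

// assumed metric names: hotproc.* taken to mirror proc.* one-for-one
memoryhog =
    some_inst
        ( hotproc.memory.virtual.dat > 32 Mbyte )
    -> syslog 2 min "%i is consuming %v memory";

memoryholder =
    some_inst
        ( hotproc.memory.virtual.bss > 32 Mbyte )
    -> syslog 2 min "%i is holding %v uninitialized memory";
```

Because hotproc only exports the processes that pass its own filter, the
some_inst test here runs over an enumerable set of instances, which is what
the plain proc.* rules were missing.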