
Re: [pcp] Developers meeting summary, 29/02/2012

To: pcp@xxxxxxxxxxx
Subject: Re: [pcp] Developers meeting summary, 29/02/2012
From: Max Matveev <makc@xxxxxxxxx>
Date: Thu, 8 Mar 2012 16:32:36 +1100
Cc: bpm@xxxxxxx
In-reply-to: <1072107515.240278.1330510757294.JavaMail.root@xxxxxxxxxxxxxxxxxxxxxx>
References: <1403290077.240154.1330510094961.JavaMail.root@xxxxxxxxxxxxxxxxxxxxxx> <1072107515.240278.1330510757294.JavaMail.root@xxxxxxxxxxxxxxxxxxxxxx>
On Wed, 29 Feb 2012 21:19:17 +1100 (EST), Nathan Scott wrote:

 nathans> 2. Moved on to discussion of issues Max has encountered while 
 nathans> working through the NFS client stats PMDA. Issues around: 
 nathans> - How to handle the instance domain, which is per-mounted 
 nathans> filesystem but sometimes (when kernel decides to share 
 nathans> struct superblock for a client NFS mount - same server, 
 nathans> and same mount options) the instances will share the same 
 nathans> values. This is unexpected from a users POV, since the 
 nathans> I/O went to one mount point or the other, yet both update. 
 nathans> - Options include having a single instance for these shared 
 nathans> mounts (assuming correct identification possible), using 
 nathans> just whichever mount point is observed first as external 
 nathans> instance name. But, would lead to even more confusing 
 nathans> behaviour if that mount point goes away, but the other 
 nathans> remains - stats then reported for an unmounted path. 
 nathans> - For further details, see Max's promised mail. 

Here is an example of what I'm trying to deal with - in
/proc/self/mountstats each NFS mount has a block of data which looks
like this:

$ cat /proc/self/mountstats
....
device 192.168.108.128:/mnt/data/ mounted on /mnt/nfs with fstype nfs4 statvers=1.0
        opts:   rw,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.108.129,minorversion=0
        age:    13
        caps:   caps=0x7ff6,wtmult=512,dtsize=4096,bsize=0,namlen=255
        nfsv4:  bm0=0xffffefff,bm1=0xf9fe3e,acl=0x0
        sec:    flavor=1,pseudoflavor=1
        events: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
        bytes:  0 0 0 0 0 0 0 0 
        RPC iostats version: 1.0  p/v: 100003/4 (nfs)
        xprt:       tcp 973 0 1 0 13 19 19 0 19 0
        per-op statistics
                NULL: 0 0 0 0 0 0 0 0
                READ: 0 0 0 0 0 0 0 0
               WRITE: 0 0 0 0 0 0 0 0
              COMMIT: 0 0 0 0 0 0 0 0
....
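
(In case it helps to see the shape of the data: here is a rough sketch, in Python rather than the Perl the PMDA is written in, of splitting that file into per-mount records. The names are mine and purely illustrative, not part of the PMDA.)

# Purely illustrative: split /proc/self/mountstats into one record per
# "device ..." block. The real parsing lives in the PMDA; this just shows
# which pieces (export, mount point, per-block key/value lines) are there.
def parse_mountstats(path="/proc/self/mountstats"):
    mounts = []
    current = None
    for line in open(path):
        if line.startswith("device "):
            # e.g. "device 192.168.108.128:/mnt/data/ mounted on /mnt/nfs
            #       with fstype nfs4 statvers=1.0"
            words = line.split()
            current = {
                "export": words[1],        # server:/path as exported
                "mountpoint": words[4],    # local mount path
                "fstype": words[7],
                "fields": {},              # opts, age, events, bytes, xprt, ...
            }
            mounts.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            current["fields"][key.strip()] = value.strip()
    return [m for m in mounts if m["fstype"].startswith("nfs")]

if __name__ == "__main__":
    for m in parse_mountstats():
        print(m["mountpoint"], m["export"], m["fields"].get("age"))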


If I were to mount /mnt/data_new from the same 192.168.108.128 server
with the same NFS mount options, then I'd get another entry with its
own block in /proc/self/mountstats, but it would share all the counters
for events, bytes and per-op statistics with /mnt/data, e.g.:

device 192.168.108.128:/mnt/data_new mounted on /mnt/nfs_new with fstype nfs4 statvers=1.0
        opts:   rw,vers=4,rsize=32768,wsize=32768,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.108.129,minorversion=0
        age:    13
        caps:   caps=0x7ff6,wtmult=512,dtsize=4096,bsize=0,namlen=255
        nfsv4:  bm0=0xffffefff,bm1=0xf9fe3e,acl=0x0
        sec:    flavor=1,pseudoflavor=1
        events: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
        bytes:  0 0 0 0 0 0 0 0 
        RPC iostats version: 1.0  p/v: 100003/4 (nfs)
        xprt:       tcp 973 0 1 0 13 19 19 0 19 0
        per-op statistics
                NULL: 0 0 0 0 0 0 0 0
                READ: 0 0 0 0 0 0 0 0
               WRITE: 0 0 0 0 0 0 0 0
              COMMIT: 0 0 0 0 0 0 0 0

Now if I want to present the user with some information about NFS
operations I can use the mount path as the external instance identifier,
so she can fetch things like nfsclient.nfs4.reqs.getattr.count["/mnt/nfs"].
This creates two problems:

1. Because the counters are per NFS superblock (the structure is named
   nfs_server in the kernel) and are shared between /mnt/nfs and
   /mnt/nfs_new, any operation on /mnt/nfs will be visible in
   /mnt/nfs_new as well. The user will be right to complain that she
   didn't touch /mnt/nfs_new, so why do its counters change?

2. If I collapse the instances so that only /mnt/nfs is visible (on the
   "first entry in /proc/self/mountstats wins" principle) and the user
   then unmounts /mnt/nfs, I'll still have to display counters for it -
   more confusion.

Ben decided to use exports to identify the instances, but that does not
work either, because one export can be mounted multiple times on a
single client, and because exports (things like
192.168.108.128:/mnt/data) suffer from the same counter aliasing
problem.

The only way to identify these things uniquely is to combine the host
name and the mount options, but that makes instance names exceedingly
long and pretty much useless. I can probably collapse them into
something shorter, e.g. 192.168.108.128_mount1, and provide a mapping
from both the mount path and the export to this "name", but it is still
less than marvelous.
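
To show what I mean by collapsing, here is a sketch only, and it assumes
that "same server plus same opts string" is a good enough stand-in for
"shares a superblock", which is roughly the condition described above:

# Sketch of the collapsing idea: mounts of the same server with identical
# mount options are assumed to share one set of counters, so they get one
# short instance name, plus a mapping from both mount path and export back
# to that name. collapse_instances() is my name, not anything in the PMDA.
def collapse_instances(mounts):
    groups = {}       # (server, opts) -> short instance name
    by_path = {}      # local mount path -> instance name
    by_export = {}    # server:/path export -> instance name
    per_server = {}   # server -> count, for the _mount1, _mount2 suffixes

    for m in mounts:
        server = m["export"].split(":", 1)[0]
        key = (server, m["fields"].get("opts", ""))
        if key not in groups:
            per_server[server] = per_server.get(server, 0) + 1
            groups[key] = "%s_mount%d" % (server, per_server[server])
        by_path[m["mountpoint"]] = groups[key]
        by_export[m["export"]] = groups[key]

    return groups, by_path, by_export

With the example above, both /mnt/nfs and /mnt/nfs_new (and both
exports) map to 192.168.108.128_mount1 - unique, but it tells the user
nothing about where the I/O actually went.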

The other option is to use local mount path for instances and write a
long explanation in the help text for each metric.

The third option is to fix the stats, but that would require keeping
them per vfs_mount, and the Linux kernel guys are not too keen on this
approach.

The other problem is transport statistics - the line

        xprt:       tcp 973 0 1 0 13 19 19 0 19 0

contains information about events associated with a "transport", which
is a connection between a client and a server. The problem is that this
transport can be shared between multiple nfs_server structures, which
means the level of aliasing is even larger.

Plus there is no way to identify the transport reliably - I cannot use
the local protocol/port because it can change if the transport
reconnects, and I cannot use the remote protocol/host/port because a
single client can, on occasion, have more than one connection to the
same server.

So in this case I need to come up with a suitable scheme for naming
instances and some way of detecting sharing.
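
The only detection scheme I can think of so far is a heuristic: sample
the xprt counters for every mount and treat mounts whose counter vectors
stay identical across samples as sharing a transport. A sketch (with the
obvious flaw that idle transports with identical all-zero counters get
lumped together too):

# Heuristic sketch only: mounts whose xprt counter vectors are identical in
# every sample are grouped as "probably sharing a transport". Not reliable -
# idle transports look alike - just a starting point for discussion.
def group_shared_transports(samples):
    # samples: one dict per sampling interval, mapping the local mount path
    # to the tuple of numbers taken from that mount's "xprt:" line
    groups = {}
    for mountpoint in samples[0]:
        history = tuple(s.get(mountpoint) for s in samples)
        groups.setdefault(history, []).append(mountpoint)
    return list(groups.values())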

If anyone who wasn't at that meeting has some better ideas, I'm all
ears.

 nathans> - Would be helped by extension to the Perl PCP::PMDA module 
 nathans> to allow PMDAs to call the pmdaCache family of routines 
 nathans> (long overdue on my part). 

Yes, this would go a long way in making instance management suck a bit
less.

max
