
Proposal for Enhanced Accounting HOWTO

To: lse-tech@xxxxxxxxxxxxxxxxxxxxx
Subject: Proposal for Enhanced Accounting HOWTO
From: Guillaume Thouvenin <guillaume.thouvenin@xxxxxxxx>
Date: Fri, 3 Sep 2004 15:30:48 +0200
Cc: Tim Schmielau <tim@xxxxxxxxxxxxxxxxxxxxxx>, Arthur Corliss <corliss@xxxxxxxxxxxxxxxx>, Jay Lan <jlan@xxxxxxxxxxxx>, Erik <erikj@xxxxxxxxxxxxxxxxxx>, guillaume.thouvenin@xxxxxxxx, Limin Gu <limin@xxxxxxxxxxxxxxxxxx>, John Hesterberg <jh@xxxxxxx>, csa@xxxxxxxxxxx
Sender: csa-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.6+20040722i
                       Enhanced Accounting HOWTO
                       =========================
 
  Following the discussion on the lse-tech mailing list, it appears that
at least three steps are required to improve accounting.

1) Improve accounting structure
   ----------------------------

   The current BSD accounting structure does not carry enough information.
Metrics computed by the CSA module can be added to BSD accounting. According
to other discussions (such as Andi Kleen's comment on the patch I wrote when
I wanted to add the CSA I/O values to BSD accounting,
http://lkml.org/lkml/2004/8/2/70 ), the current method of getting metrics
about block/character reads and writes is not accurate, since most writes
end up being accounted to pdflush threads rather than to the process that
issued them. One option might be to add a counter in mpage_writepages(), but
I don't know whether we can recover a process ID from the struct page, or
whether that would be enough. I am investigating whether this is the right
approach.
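
   To make the attribution problem concrete, here is a toy user-space model
in Python. It is only a sketch and not kernel code: the PageCache class, the
FLUSHER_PID value and both counters are hypothetical names invented for the
illustration. It compares charging block writes to whoever runs the writeback
with charging them to the process that dirtied the page.

```python
from collections import Counter

FLUSHER_PID = 2  # hypothetical pid standing in for a pdflush thread


class PageCache:
    """Toy page cache: write() only dirties pages, a flusher writes them later."""

    def __init__(self):
        self.dirty = []              # pid that dirtied each pending page
        self.by_current = Counter()  # blocks charged to whoever runs writeback
        self.by_owner = Counter()    # blocks charged to the pid that dirtied the page

    def write(self, pid, pages=1):
        # The writing process only marks pages dirty; no block I/O happens yet.
        self.dirty.extend([pid] * pages)

    def writeback(self, current_pid=FLUSHER_PID):
        # Block I/O happens here, in the context of the flusher thread.
        for owner in self.dirty:
            self.by_current[current_pid] += 1
            self.by_owner[owner] += 1
        self.dirty.clear()


cache = PageCache()
cache.write(pid=123, pages=3)
cache.write(pid=234, pages=1)
cache.writeback()
print(dict(cache.by_current))  # {2: 4}
print(dict(cache.by_owner))    # {123: 3, 234: 1}
```

   In this model the per-owner counter only works because each dirty page
remembers who dirtied it, which is exactly the information that may not be
recoverable from the struct page.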


2) Group of processes management
   ------------------------------

   We need to be able to manage groups of processes, since per-job accounting
is clearly a major accounting improvement. I don't know if "job" is the right
noun. Several implementations already exist, and some of them are already in
the kernel. The property needed here is that if a process is in a container,
its children will be in the same container. Possible implementations are:

        - PAGG + JOB (job)
        - ELSA (bank)
        - CKRM (class)
        - CPUSET (a cpuset of all CPUs can act as a container)
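
   The inheritance property can be sketched independently of any of these
implementations. The Python model below is only an assumption about how such
a manager might behave: GroupManager and its methods are hypothetical names,
not the interface of PAGG, ELSA, CKRM or CPUSET.

```python
class GroupManager:
    """Hypothetical group-of-processes manager: maps each pid to a container."""

    def __init__(self):
        self.container_of = {}  # pid -> container id

    def attach(self, pid, container):
        self.container_of[pid] = container

    def fork(self, parent, child):
        # Required property: a child starts in its parent's container.
        if parent in self.container_of:
            self.container_of[child] = self.container_of[parent]

    def members(self, container):
        return sorted(p for p, c in self.container_of.items() if c == container)


gpm = GroupManager()
gpm.attach(234, container=2)
gpm.fork(parent=234, child=333)
gpm.fork(parent=333, child=334)   # grandchildren inherit as well
gpm.fork(parent=999, child=1000)  # parent is in no container, so neither is the child
print(gpm.members(2))  # [234, 333, 334]
```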

   The interface between the kernel and user-space applications can be a new
virtual file system (as CKRM does with /rcfs), but it can also be a device
driver with ioctl operations (as ELSA does). Both solutions are interesting
and need to be discussed. Note that CPUSET is in the 2.6.9-rc1-mm2 tree, and
it seems that PAGG has been removed from the -mm tree (I don't know why).
   
3) Data presentation
   -----------------
        
    We can have several different implementations. 

4) General overview
   ----------------

          KERNEL SPACE        |   USER SPACE (or MODULE)             
                              |
         ----------------     | 
        | BSD accounting |    | 
        |       +        |    |     ------------
        |      CSA       |<=======>|            |
         ----------------     |    |  Enhanced  |
                              |    | Accounting |
            -------------     |    |    Core    |
           |  group of   |<=======>|            |
           |  processes  |    |     ------------
           |  manager    |    |
            -------------     |
                              |
                               
  Communication between the EAC (Enhanced Accounting Core) and kernel space
can be done via a virtual fs or a device. The goal of the EAC is to keep track
of the different groups of processes during the accounting period, using the
group of processes management module. The idea is that the group of processes
manager sends a message to the EAC when a process ends, indicating its job ID
if any. The other task of the EAC is the data presentation.
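
  Whichever transport is chosen, the exit message could be a small fixed-size
record. The sketch below is purely illustrative: the field widths, the 16-byte
command name and the convention that a job ID of 0 means "no job" are all
assumptions, not an existing kernel format.

```python
import struct

# Hypothetical layout: pid (int32), job ID (uint64), command name (16 bytes).
# "=" selects standard sizes with no padding, so the record is 4 + 8 + 16 bytes.
EXIT_MSG = struct.Struct("=iQ16s")
NO_JOB = 0  # assumed marker for a process that belongs to no job


def pack_exit(pid, jid, comm):
    # struct pads the command name with NUL bytes up to 16 bytes.
    return EXIT_MSG.pack(pid, jid, comm.encode()[:16])


def unpack_exit(buf):
    pid, jid, comm = EXIT_MSG.unpack(buf)
    return pid, jid, comm.rstrip(b"\0").decode()


msg = pack_exit(333, 2, "sshd")
print(len(msg))          # 28
print(unpack_exit(msg))  # (333, 2, 'sshd')
```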

  Here is an example of what can be done:

  1) First we can add processes into containers using the GPM (group of
     processes manager) module. For example, we can add an ftp server with
     pid #123 and an ssh daemon with pid #234. Thus, inside the GPM you have:
         container 1 -> 123
         container 2 -> 234

  2) Now a user can log in via ssh, so sshd will create new children. Thus,
     inside the GPM you will have something like:
         container 1 -> 123
         container 2 -> 234 333 334 335

  3) As soon as a process terminates, the EAC must be made aware of it
     (using a signal from the GPM, for example). This is the only
     modification needed. With the signal, the GPM transfers information
     such as pid, jid (job ID), command, ... to the EAC. This allows the EAC
     to keep track of what happens in the system. Thus, in the EAC we keep
     information about all processes that have finished.

  4) If the sysadmin asks for information about process 234, the EAC will
     know that it belongs to container 2 and that there are other processes
     in this container. Since all of these processes have exited, their
     accounting information has been written to the accounting file (as is
     currently the case with BSD accounting). With this information
     (accounting + job information) we can do per-job accounting.
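
  The four steps above can be sketched as a small Python model. This is not
the EAC implementation, only an illustration under stated assumptions: the EAC
class, its method names, the command names and the (pid, cpu_seconds) pairs
standing in for the accounting file are all hypothetical.

```python
from collections import defaultdict


class EAC:
    """Hypothetical Enhanced Accounting Core: remembers the job of exited processes."""

    def __init__(self):
        self.job_of = {}   # pid -> job ID, kept after the process has exited
        self.comm_of = {}  # pid -> command name

    def on_exit(self, pid, jid, comm):
        # Called once for each exit notification sent by the group manager.
        self.job_of[pid] = jid
        self.comm_of[pid] = comm

    def job_members(self, pid):
        # Answer "which job did this pid belong to, and who else was in it?"
        jid = self.job_of[pid]
        return jid, sorted(p for p, j in self.job_of.items() if j == jid)

    def per_job(self, acct_records):
        # acct_records: (pid, cpu_seconds) pairs standing in for the accounting file.
        totals = defaultdict(float)
        for pid, cpu in acct_records:
            if pid in self.job_of:
                totals[self.job_of[pid]] += cpu
        return dict(totals)


eac = EAC()
for pid, jid, comm in [(333, 2, "sshd"), (334, 2, "bash"), (335, 2, "ls"),
                       (234, 2, "sshd"), (123, 1, "ftpd")]:
    eac.on_exit(pid, jid, comm)

acct = [(123, 1.5), (234, 0.5), (333, 0.25), (334, 2.0), (335, 0.25)]
print(eac.job_members(234))  # (2, [234, 333, 334, 335])
print(eac.per_job(acct))     # {1: 1.5, 2: 3.0}
```

  Keying everything on the pid ignores pid reuse during the accounting
period; a real implementation would probably need an additional key such as
the process start time.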

  I am working on the EAC implementation to check how we can make it work.

Any comments?

Best
Guillaume
