Re: Lockmeter 1.4.10 problem

To: <maneesh@xxxxxxxxxx>, <hawkes@xxxxxxxxxxx>
Subject: Re: Lockmeter 1.4.10 problem
From: "John Hawkes" <hawkes@xxxxxxx>
Date: Fri, 14 Dec 2001 10:11:25 -0800
Cc: "lse-tech" <lse-tech@xxxxxxxxxxxxxxxxxxxxx>, <lockmeter@xxxxxxxxxxx>
References: <20011214202613.C17111@xxxxxxxxxx>
Sender: owner-lockmeter@xxxxxxxxxxx
> The changes you have done for atomic_dec_and_lock() in Lockmeter v1.4.10
> does not seem to give correct statistics.

I'm quite willing to believe my 1.4.10 is wrong, but I think the better
terminology is "does not seem to give identical statistics", not "does
not seem to give correct statistics."

> If you want to see what's the difference just look at the following numbers
>
> With Lockmeter v1.4.10
> ----------------------
> 6.3%  9.2%  0.4us(1659us)  3.4us(1648us)( 1.3%)  23182304 90.8%  9.2%  0% dcache_lock
> 2.3%  6.4%  0.3us( 120us)  3.3us(1379us)(0.48%)  12336769 93.6%  6.4%  0% dput+0x18
>
> Making actual atomic_dec_and_lock inline
> ----------------------------------------
> 4.0%  8.6%  0.5us(1333us)  1.9us(1309us)(0.35%)  11940384 91.4%  8.6%  0% dcache_lock
> 0.3%  7.0%  0.4us( 191us)  1.9us(1197us)(0.03%)   1096658 93.0%  7.0%  0% dput+0x30
>
> There is a difference of more than 10 times in number of times
> dcache_lock is taken from dput. I did not face any problem in making
> the actual atomic_dec_and_lock() inline and could not understand why
> you choose the less efficient version.

I'm guessing that you're looking at i386, which I believe means that the
code that executes is found in arch/i386/lib/dec_and_lock.c, right?
That implementation has a "fast path" that uses a direct asm sequence
using cmpxchgl, and a "slow path" that invokes spin_lock() and
spin_unlock().  Lockmeter, of course, will only instrument this "slow
path".  You simply won't see any 1.4.9 statistics for the "fast path".

What I did in 1.4.10 was to force everything through the "slow path",
which means that all those formerly invisible "fast path" spinlock
acquisitions are now showing up as metered spin_lock/spin_unlock pairs.
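
To make the fast/slow split concrete, here is a minimal userspace sketch
in C11.  It is not the kernel source: the names meter_lock,
dec_and_lock_fastpath, dec_and_lock_slowpath and count_metered are made
up for illustration, and the "acquisitions" counter is only a stand-in
for what Lockmeter would record at the lock routine.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a metered spinlock: the meter only sees
 * acquisitions that go through meter_lock_acquire(). */
struct meter_lock {
    atomic_flag held;
    unsigned long acquisitions;   /* what the meter would report */
};

static void meter_lock_acquire(struct meter_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;                         /* spin until the flag is free */
    l->acquisitions++;            /* updated only while holding the lock */
}

static void meter_lock_release(struct meter_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Variant with a fast path.  Returns true, with the lock held, only if
 * this call dropped the counter to zero. */
static bool dec_and_lock_fastpath(atomic_int *count, struct meter_lock *l)
{
    int old = atomic_load(count);

    /* Fast path: while the result stays above zero, decrement with a
     * compare-and-swap and never call the lock routines. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
        /* On failure 'old' was reloaded; re-check and retry. */
    }

    /* Slow path: the count may reach zero, so take the lock first. */
    meter_lock_acquire(l);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;
    meter_lock_release(l);
    return false;
}

/* Variant that forces every call through the slow path. */
static bool dec_and_lock_slowpath(atomic_int *count, struct meter_lock *l)
{
    meter_lock_acquire(l);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;
    meter_lock_release(l);
    return false;
}

/* Drop 'refs' references one at a time and return how many lock
 * acquisitions the meter saw. */
static unsigned long count_metered(int refs, bool force_slow)
{
    struct meter_lock l = { ATOMIC_FLAG_INIT, 0 };
    atomic_int count;

    assert(refs >= 0);
    atomic_init(&count, refs);

    for (int i = 0; i < refs; i++) {
        bool locked = force_slow ? dec_and_lock_slowpath(&count, &l)
                                 : dec_and_lock_fastpath(&count, &l);
        if (locked)
            meter_lock_release(&l);   /* caller got the lock at zero */
    }
    return l.acquisitions;
}
```

In this single-threaded model, count_metered(1000, false) sees 1 metered
acquisition and count_metered(1000, true) sees 1000.  The real ratio in
a kernel depends on how often the count is about to hit zero, so treat
the sketch only as an illustration of why the two builds report
different counts at the same call site.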

So I would argue that 1.4.9 has two problems.  First, it aggregates lots
of spinlock/unlock activity so that it appears as atomic_dec_and_lock()
calls; 1.4.10 fixes this by making that activity visible in the actual
spinlock acquisition routine, such as dput().  Second, 1.4.9 completely
misses quite a few spinlock acquisitions that were going through that
"fast path".  That is why I believe 1.4.10 actually produces correct
statistics, not simply *different* statistics, vs. 1.4.9.

John Hawkes
