
Re: Lockmeter 1.4.10 problem

To: John Hawkes <hawkes@xxxxxxx>
Subject: Re: Lockmeter 1.4.10 problem
From: Maneesh Soni <maneesh@xxxxxxxxxx>
Date: Mon, 17 Dec 2001 09:30:06 +0530
Cc: hawkes@xxxxxxxxxxx, lse-tech <lse-tech@xxxxxxxxxxxxxxxxxxxxx>, lockmeter@xxxxxxxxxxx
In-reply-to: <004101c184ca$bee31d60$6601a8c0@xxxxxxxxx>; from hawkes@xxxxxxx on Fri, Dec 14, 2001 at 10:11:25AM -0800
References: <20011214202613.C17111@xxxxxxxxxx> <004101c184ca$bee31d60$6601a8c0@xxxxxxxxx>
Reply-to: maneesh@xxxxxxxxxx
Sender: owner-lockmeter@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
On Fri, Dec 14, 2001 at 10:11:25AM -0800, John Hawkes wrote:

> I'm guessing that you're looking at i386, which I believe means that the
> code that executes is found in arch/i386/lib/dec_and_lock.c, right?

Yes I am looking at i386 code.

> That implementation has a "fast path" that uses a direct asm sequence
> using cmpxchgl, and a "slow path" that invokes spin_lock() and
> spin_unlock().  Lockmeter, of course, will only instrument this "slow
> path".  You simply won't see any 1.4.9 statistics for the "fast path".
> 
> What I did in 1.4.10 was to force everything through the "slow path",
> which means that all those formerly invisible "fast path" spinlock
> acquisitions are now showing up as metered spin_lock/spin_unlock pairs.
>
> So I would argue that 1.4.9 not only aggregates lots of spinlock/unlock
> activity into appearing as atomic_dec_and_lock() calls -- which 1.4.10
> fixes to now make that activity visible in the actual spinlock
> acquisition routine, such as dput() -- but 1.4.9 also completely misses
> quite a few spinlock acquisitions that were going through that "fast
> path".  So I would argue that 1.4.10 actually produces correct
> statistics, not simply *different* statistics, vs. 1.4.9.

The fast path does _not_ take the global lock, and I don't think it is a
problem if lockmeter does not measure stats for the fast path. For scalability
work it is more important to get correct stats for global lock acquisitions.
By "correct" I mean "as it happens without lockmeter". Without lockmeter, not
every call to atomic_dec_and_lock() acquires the global lock; lockmeter
(v1.4.10) changes this behavior by forcing all calls to atomic_dec_and_lock()
onto the slow path. This misled me: I had to investigate why there was a
10-fold increase in dcache_lock acquisitions from dput().
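
To make the difference concrete, here is a minimal user-space sketch of the
two paths. It is not the code from arch/i386/lib/dec_and_lock.c: it uses C11
atomics instead of the cmpxchgl asm, and metered_lock(), metered_unlock(),
lock_acquisitions, dec_and_lock() and the force_slow_path flag are all
hypothetical names invented for illustration. The flag only models the
"everything goes through the slow path" behavior described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a metered global spinlock such as dcache_lock. */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;
static unsigned long lock_acquisitions;  /* what a lock meter would count */

static void metered_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&global_lock,
                                             memory_order_acquire))
        ;  /* spin until the lock is free */
    lock_acquisitions++;  /* updated while holding the lock */
}

static void metered_unlock(void)
{
    atomic_flag_clear_explicit(&global_lock, memory_order_release);
}

/*
 * Decrement *count. Returns true, with the lock held, only if the count
 * reached zero. force_slow_path is an assumption of this sketch: it skips
 * the lock-free fast path so that every call takes the lock.
 */
static bool dec_and_lock(atomic_int *count, bool force_slow_path)
{
    if (!force_slow_path) {
        int old = atomic_load(count);
        /* Fast path: lock-free decrement unless it would reach zero. */
        while (old > 1) {
            /* On failure, old is reloaded and the condition rechecked. */
            if (atomic_compare_exchange_weak(count, &old, old - 1))
                return false;
        }
    }
    /* Slow path: take the lock, then decrement under it. */
    metered_lock();
    if (atomic_fetch_sub(count, 1) == 1)
        return true;  /* caller must release the lock */
    metered_unlock();
    return false;
}

/* Drop `refs` references one by one and report how often the lock was taken. */
static unsigned long count_lock_acquisitions(int refs, bool force_slow_path)
{
    atomic_int count;

    assert(refs > 0);
    atomic_init(&count, refs);
    lock_acquisitions = 0;
    for (int i = 0; i < refs; i++) {
        if (dec_and_lock(&count, force_slow_path))
            metered_unlock();
    }
    return lock_acquisitions;
}
```

In this single-threaded sketch, dropping N references takes the lock once
with the fast path enabled (only for the final decrement to zero) and N times
with the slow path forced. The real ratio in a kernel depends on the
workload; the sketch only shows why the metered count goes up.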

The problem of aggregation can be solved if the whole of atomic_dec_and_lock()
is made inline, not only the slow path code.

Regards,
Maneesh

-- 
Maneesh Soni
IBM Linux Technology Center, 
IBM India Software Lab, Bangalore.
Phone: +91-80-5044999 email: maneesh@xxxxxxxxxx
http://lse.sourceforge.net/locking/rcupdate.html
