On Wed, 1 May 2013, Shawn Bohrer wrote:
> Correct, it doesn't, and I can't prove the find command is not making
> progress, however these finds normally complete in under 15 min and
> we've let the stuck ones run for days. Additionally if this was just
> contention I'd expect to see multiple threads/CPUs contending and I
> only have a single CPU pegged running find at 99%. I should clarify
> that the perf snippet above was for the entire system. Profiling just
> the find command shows:
> 82.56% find [kernel.kallsyms] [k] _raw_spin_lock
There are a couple of ways to figure out which spinlock this is. One is
lockstat (see Documentation/lockstat.txt), which requires a kernel
rebuild and some human intervention to collect the stats, and comes with
a performance degradation. The other is to collect
/proc/$(pidof find)/stack at regular intervals and work out from the
backtraces which spinlock it is.
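
For the second option, a minimal sketch of the sampling loop, assuming
the usual "[<address>] function+0xoff/0xsize" line format in
/proc/<pid>/stack and that the file is readable (this typically needs
root). The helper names, the proc_root parameter, and the default
count and interval are made up for illustration:

```python
import collections
import sys
import time


def read_stack(pid, proc_root="/proc"):
    """Return the function names from <proc_root>/<pid>/stack, innermost first."""
    frames = []
    with open(f"{proc_root}/{pid}/stack") as f:
        for line in f:
            # Assumed line format: "[<ffffffff8116c3a1>] some_func+0x41/0x80"
            parts = line.split()
            if len(parts) < 2:
                continue
            name = parts[1].split("+")[0]
            # Skip unsymbolized entries such as a trailing "0xffffffffffffffff".
            if not name.startswith("0x"):
                frames.append(name)
    return frames


def sample(pid, count=60, interval=1.0, proc_root="/proc"):
    """Read the stack `count` times and count how often each backtrace appears."""
    seen = collections.Counter()
    for i in range(count):
        try:
            frames = read_stack(pid, proc_root)
        except FileNotFoundError:
            break  # the process exited while we were sampling
        seen[" <- ".join(frames)] += 1
        if i + 1 < count:
            time.sleep(interval)
    return seen


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the most frequent backtraces first.
    for trace, hits in sample(int(sys.argv[1])).most_common(5):
        print(f"{hits:4d}  {trace}")
```

Run it with the PID of the stuck find as the only argument. If one
backtrace dominates the samples, the caller just above the spinlock
function in that trace should point at the lock in question.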
> > Depending on your
> > definition of "occasionally", would it be possible to run with
> > CONFIG_PROVE_LOCKING and CONFIG_LOCKDEP to see if it uncovers any real
> > deadlock potential?
> Yeah, I can probably enable these on a few machines and hope I get
> lucky. These machines are used for real work so I'll have to gauge
> how significant the performance impact is to determine how many
> machines I can sacrifice to the cause.
You'll probably only need to enable it on one machine. If a deadlock
possibility exists here, lockdep will find it even without hitting it;
the machine simply has to exercise the code path that leads to it. That
one machine will see a performance degradation, though.
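
Before relying on that machine, it may be worth confirming that the
running kernel was really built with the options mentioned above. A
rough sketch that parses kernel config text; the helper names are
hypothetical, and where the config lives is an assumption
(/proc/config.gz exists only with CONFIG_IKCONFIG_PROC, and a
/boot/config-<release> file depends on the distribution):

```python
import gzip

# Options named earlier in this thread.
WANTED = ("CONFIG_PROVE_LOCKING", "CONFIG_LOCKDEP")


def enabled_options(config_text, wanted=WANTED):
    """Map each wanted option to True if the config text sets it to 'y'."""
    set_to_y = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set".
        if line.endswith("=y") and not line.startswith("#"):
            set_to_y.add(line[:-2])
    return {name: name in set_to_y for name in wanted}


def read_config(path):
    """Read a kernel config file, transparently handling a .gz suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as f:
        return f.read()
```

For example, enabled_options(read_config("/proc/config.gz")) returns a
dict with one True or False entry per option, if that file exists on
the machine.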