
Re: deadlock on vmap_area_lock

To: Shawn Bohrer <sbohrer@xxxxxxxxxxxxxxx>
Subject: Re: deadlock on vmap_area_lock
From: David Rientjes <rientjes@xxxxxxxxxx>
Date: Wed, 1 May 2013 10:01:41 -0700 (PDT)
Cc: xfs@xxxxxxxxxxx, linux-mm@xxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
On Wed, 1 May 2013, Shawn Bohrer wrote:

> Correct, it doesn't, and I can't prove the find command is not making
> progress. However, these finds normally complete in under 15 min and
> we've let the stuck ones run for days.  Additionally, if this were just
> contention I'd expect to see multiple threads/CPUs contending, and I
> only have a single CPU pegged running find at 99%. I should clarify
> that the perf snippet above was for the entire system.  Profiling just
> the find command shows:
>     82.56%     find  [kernel.kallsyms]  [k] _raw_spin_lock

There are a couple of options for figuring out which spinlock this is.
One is lockstat (see Documentation/lockstat.txt), which will also require
a kernel rebuild, some human intervention to collect the stats, and comes
with a performance degradation.  The other is to collect
/proc/$(pidof find)/stack at regular intervals and work out from the
stack traces which spinlock it is.
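
A minimal sketch of the second option, assuming Python is available on the
box and the script runs as root (reading /proc/<pid>/stack normally needs
it).  The helper names parse_stack and sample_stacks are made up for
illustration, and the regex assumes the usual "[<address>] symbol+0x..."
line format.  For a task that is spinning on a CPU the dump may be empty or
stale, so treat the counts as a hint rather than proof.

```python
import collections
import re
import time

# Matches the symbol name in lines such as "[<ffffffff...>] symbol+0x10/0x20".
# Lines without a symbol (e.g. a trailing bare address) are skipped.
FRAME_RE = re.compile(r"\]\s+([A-Za-z_][\w.]*)")


def parse_stack(text):
    """Return the function names found in one /proc/<pid>/stack dump."""
    matches = (FRAME_RE.search(line) for line in text.splitlines())
    return [m.group(1) for m in matches if m]


def sample_stacks(pid, samples=60, interval=1.0, read=None):
    """Sample /proc/<pid>/stack and count how often each call chain is seen.

    `read` is a hypothetical hook that returns the raw dump as a string;
    it defaults to reading the real procfs file for `pid`.
    """
    if read is None:
        path = "/proc/%d/stack" % pid

        def read():
            with open(path) as f:
                return f.read()

    counts = collections.Counter()
    for i in range(samples):
        counts[tuple(parse_stack(read()))] += 1
        if i + 1 < samples:
            time.sleep(interval)
    return counts
```

Calling sample_stacks(pid).most_common(5) then lists the call chains seen
most often; the caller of the spinlock function in the dominant chain is
the place to look for the contended lock.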

> > Depending on your 
> > definition of "occasionally", would it be possible to run with 
> > CONFIG_PROVE_LOCKING and CONFIG_LOCKDEP to see if it uncovers any real 
> > deadlock potential?
> Yeah, I can probably enable these on a few machines and hope I get
> lucky.  These machines are used for real work so I'll have to gauge
> how significant the performance impact is to determine how many
> machines I can sacrifice to the cause.

You'll probably only need to enable it on one machine.  If a deadlock
possibility exists here, lockdep will find it even without actually
hitting the deadlock; the machine simply has to exercise the code paths
that lead to it.  It does come with a performance degradation on that one
machine, though.
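
To illustrate why exercising the paths is enough, here is a toy model of
lock-order validation.  It is only a sketch of the general idea, not
lockdep's actual implementation (lockdep works on lock classes and also
tracks interrupt contexts, among other things); the LockOrderChecker class
and the lock names "A" and "B" are hypothetical.

```python
class LockOrderChecker:
    """Toy lock-order validator for a single execution context."""

    def __init__(self):
        self.edges = {}  # lock -> set of locks taken while holding it
        self.held = []   # locks currently held, in acquisition order

    def _reachable(self, src, dst):
        """Return True if dst can be reached from src in the order graph."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, lock):
        for h in self.held:
            # Recording "h before lock" would close a cycle if some earlier
            # path already established "lock before h".
            if self._reachable(lock, h):
                raise RuntimeError(
                    "possible deadlock: taking %s while holding %s "
                    "inverts a recorded order" % (lock, h))
            self.edges.setdefault(h, set()).add(lock)
        self.held.append(lock)

    def release(self, lock):
        self.held.remove(lock)


checker = LockOrderChecker()
# Path 1 takes A then B and completes without any problem.
checker.acquire("A"); checker.acquire("B")
checker.release("B"); checker.release("A")
# Path 2 takes B then A.  Nothing is actually stuck, but the inverted
# order is reported as soon as the path runs.
checker.acquire("B")
try:
    checker.acquire("A")
except RuntimeError as err:
    print(err)
```

Neither path ever blocks in this example; the report comes purely from
having seen both orderings, which is why one machine running the workload
should be enough.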
