Re: Can you help me to investiage a soft lockup?

To: victor stinner <victor.stinner@xxxxxxxxxxxx>
Subject: Re: Can you help me to investiage a soft lockup?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 9 Oct 2014 08:23:46 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <734140413.8186921.1412777666412.JavaMail.zimbra@xxxxxxxxxxxx>
References: <1211918248.8179166.1412774638265.JavaMail.zimbra@xxxxxxxxxxxx> <734140413.8186921.1412777666412.JavaMail.zimbra@xxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Oct 08, 2014 at 04:14:26PM +0200, victor stinner wrote:
> Hello,
> 
> I'm working on OpenStack, and we hit a bug on Swift (distributed
> storage). The Linux kernel 3.2 logged many "soft lockup" messages
> which look to be related to XFS: see kernel messages at the end
> of this email (it's only an extract of first messages, there are
> more later).

That's contention on the AIL lock, which in turn is causing
contention on the next layer of locking above it (the iclog locks).

There were lots of optimisations for AIL contention issues around
the 3.2 timeframe as a result of the more widespread use of the
recently introduced delayed logging functionality (which was made
the default config in 3.2).

I'm pretty sure these symptoms were the result of a bug that caused
out-of-order items to be placed on the AIL. That forces a walk of
the AIL to find the insertion point for every item being added,
rather than using the cursor to track the current insertion point
and avoid the repeated lookups.
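
To make the cost difference concrete, here is a toy model in Python.
It is only a sketch, not the XFS code: a sorted Python list of LSNs
stands in for the AIL, and the function names, the list sizes and the
batch LSN are all made up for illustration. It counts how many
existing entries get visited when a batch of same-LSN items is
inserted well before the tail, once with a fresh walk per item and
once with a cursor that remembers the last insertion point.

```python
# Toy model only: a sorted list of LSNs stands in for the AIL.
# All names and numbers here are hypothetical, not taken from XFS.

def insert_with_walk(ail, lsn):
    """Walk back from the tail to find the slot; return (index, entries visited)."""
    pos = len(ail)
    steps = 0
    while pos > 0 and ail[pos - 1] > lsn:
        pos -= 1
        steps += 1
    ail.insert(pos, lsn)
    return pos, steps


def insert_with_cursor(ail, lsn, cursor):
    """Insert right after the cursor if sort order allows, else fall back to a walk."""
    if cursor is not None:
        nxt = cursor + 1
        fits_after = ail[cursor] <= lsn and (nxt == len(ail) or ail[nxt] >= lsn)
        if fits_after:
            ail.insert(nxt, lsn)
            return nxt, 0
    return insert_with_walk(ail, lsn)


def run(batch_lsn, batch_size, use_cursor):
    """Insert batch_size items with the same LSN; return total entries visited."""
    ail = list(range(100, 1100))  # 1000 existing items, LSNs 100..1099
    cursor = None
    total = 0
    for _ in range(batch_size):
        if use_cursor:
            cursor, steps = insert_with_cursor(ail, batch_lsn, cursor)
        else:
            _, steps = insert_with_walk(ail, batch_lsn)
        total += steps
    assert ail == sorted(ail)  # the list stays LSN-ordered either way
    return total


walk_total = run(50, 50, use_cursor=False)
cursor_total = run(50, 50, use_cursor=True)
print(walk_total, cursor_total)  # 50000 1000
```

In this model the per-item walk visits all 1000 existing entries for
each of the 50 items, while the cursor pays for one walk and then
inserts the rest for free. In the kernel that walk happens with the
AIL lock held, which is why it shows up as lock contention.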

Your best bet would be to upgrade to a more recent kernel rather than
try to identify and backport a bunch of fixes to an old kernel...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
