
Re: xfs deadlock in stable kernel 3.0.4

To: Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx>
Subject: Re: xfs deadlock in stable kernel 3.0.4
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Tue, 20 Sep 2011 12:02:26 -0400
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, "xfs-masters@xxxxxxxxxxx" <xfs-masters@xxxxxxxxxxx>, aelder@xxxxxxx, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <4E78665E.8030409@xxxxxxxxxxxx>
References: <C6515E45-5724-43DD-95A8-1F89AFE29601@xxxxxxxxxxxx> <20110912200543.GA22409@xxxxxxxxxxxxx> <4E6EF274.7050007@xxxxxxxxxxxx> <20110913205018.GA8543@xxxxxxxxxxxxx> <4E70571A.80108@xxxxxxxxxxxx> <4E705C42.6020909@xxxxxxxxxxxx> <20110914143005.GA28496@xxxxxxxxxxxxx> <4E75B660.1030502@xxxxxxxxxxxx> <20110918230245.GF15688@dastard> <4E78665E.8030409@xxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Sep 20, 2011 at 12:09:34PM +0200, Stefan Priebe - Profihost AG wrote:
> Hi,
> any idea how to get deeper into this? I've tried using kgdb but
> strangely the error does not occur when kgdb is remote attached.
> When i unattach kgdb and restart bonnie the error happens again.
> So it seems to me a little bit like a timing issue?

Sounds like it.

Can you pull all the data we have gathered over this thread into one
summary, e.g.

 - which kernels does it happen on?  Seems like 3.0 and 3.1 hit it easily,
   2.6.38 sometimes, 2.6.32 is fine.  Did you test anything between
   2.6.32 and 2.6.38?
 - what hardware hits it often/sometimes/never?
 - what is the fs geometry?
 - what is the hardware?
 - is this a 32 or 64-bit kernel, or do you run both?
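
A minimal sketch for collecting the kernel version, architecture and fs
geometry items from the list above in one go, assuming Python is available
on the test machine. The function name collect_report and its mount_point
argument are hypothetical; the only XFS-specific piece is a call to the
xfs_info tool from xfsprogs, which is skipped if it is not installed.

```python
import platform
import shutil
import subprocess


def collect_report(mount_point=None):
    """Gather kernel release, machine type and (optionally) XFS geometry.

    mount_point is a hypothetical parameter: the path where the affected
    XFS filesystem is mounted. If it is None, geometry is not collected.
    """
    report = {
        "kernel": platform.release(),   # same value as `uname -r`
        "machine": platform.machine(),  # same value as `uname -m`, e.g. x86_64 or i686
        "geometry": None,
    }

    # Only call xfs_info when a mount point was given and the tool exists.
    if mount_point is not None and shutil.which("xfs_info"):
        result = subprocess.run(
            ["xfs_info", mount_point],
            capture_output=True,
            text=True,
            check=False,  # keep going even if xfs_info reports an error
        )
        report["geometry"] = result.stdout or result.stderr

    return report


if __name__ == "__main__":
    # Run without a mount point; pass one (e.g. your test fs) to add geometry.
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

This does not cover the hardware description or how often each box hits
the hang; those still need to be written up by hand.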

I'm pretty sure most of this got posted somewhere, but let's get a summary
as things were a bit confusing at times.

Note that 2.6.38 moved the whole log grant code to a lockless algorithm,
so that is a likely culprit if you're managing to hit race windows
no one else does, i.e. if this really is a timing issue.
