Re: Infinite loop in xfssyncd on full file system

To: Stephane Doyon <sdoyon@xxxxxxxxx>
Subject: Re: Infinite loop in xfssyncd on full file system
From: Luciano Chavez <lnx1138@xxxxxxxxxx>
Date: Wed, 23 Aug 2006 14:10:59 -0500
Cc: David Chinner <dgc@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.64.0608231056370.3139@xxxxxxxxxxxxxxxxxxxxx>
Organization: IBM
References: <Pine.LNX.4.64.0608221318300.3139@xxxxxxxxxxxxxxxxxxxxx> <20060823040218.GC807872@xxxxxxxxxxxxxxxxx> <20060823044829.GD807872@xxxxxxxxxxxxxxxxx> <Pine.LNX.4.64.0608231056370.3139@xxxxxxxxxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx

On Wed, 2006-08-23 at 11:00 -0400, Stephane Doyon wrote:
> On Wed, 23 Aug 2006, David Chinner wrote:
> 
> > On Wed, Aug 23, 2006 at 02:02:18PM +1000, David Chinner wrote:
> >> On Tue, Aug 22, 2006 at 04:01:10PM -0400, Stephane Doyon wrote:
> >>> I'm seeing what appears to be an infinite loop in xfssyncd. It is
> >>> triggered when writing to a file system that is full or nearly full. I
> >>> have pinpointed the change that introduced this problem: it's
> >>>
> >>>     "TAKE 947395 - Fixing potential deadlock in space allocation and
> >>>     freeing due to ENOSPC"
> >>>
> >>> git commit d210a28cd851082cec9b282443f8cc0e6fc09830.
> >>
> >> Thanks for tracking that down - I've been trying to isolate a test case
> >> for another report of this looping in xfssyncd.
> >>
> >> [Luciano - this is the same problem we've been trying to track down.]
> >>
> >>> I hope you XFS experts see what might be wrong with that bug fix. It's
> >>> ironic but for me, this (apparent) infinite loop seems much easier to hit
> >>> than the out-of-order locking problem that the commit in question was
> >>> supposed to fix. Let me know if I can get you any more info.
> >>
> >> Now we know what patch introduces the problem, we know where to look.
> >> Stay tuned...
> >
> > I've had a quick look at the above commit. I'm not yet certain that
> > everything is correct in terms of the semantics laid down in the
> > change or that enough blocks are reserved for btree splits, but I
> 
> I actually tried, naively, to bump up SET_ASIDE_BLOCKS from 8 to 32. I 
> won't claim to understand half of what's going on but I wondered whether 
> that might make the problem noticeably harder to reproduce at least, but 
> it had no effect ;-).
> 
> > can see a hole in the implementation on multiprocessor machines.
> >
> > Stephane/Luciano - can you test the following patch (note: compile
> > tested only) and see if it fixes the problem?
> 
> I just tried it, unfortunately no effect. Still went into a loop, on the 
> second attempt.
> 

Yes, unfortunately it had no effect here either.

> Thanks
> 
-- 
Luciano Chavez <lnx1138@xxxxxxxxxx>
IBM

