Re: XFS hang in xlog_grant_log_space

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: XFS hang in xlog_grant_log_space
From: Nick Piggin <npiggin@xxxxxxx>
Date: Fri, 30 Jul 2010 00:05:46 +1000
Cc: Nick Piggin <npiggin@xxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20100728131744.GS7362@dastard>
References: <20100722190100.GA22269@amd> <20100723135514.GJ32635@dastard> <20100727070538.GA2893@amd> <20100727080632.GA4958@amd> <20100727113626.GA2884@amd> <20100727133038.GP7362@dastard> <20100727145808.GQ7362@dastard> <20100728131744.GS7362@dastard>
User-agent: Mutt/1.5.20 (2009-06-14)
On Wed, Jul 28, 2010 at 11:17:44PM +1000, Dave Chinner wrote:
> Something very strange is happening, and to make matters worse I
> cannot reproduce it with a debug kernel (ran for 3 hours without
> failing). Hence it smells like a race condition somewhere.
> 
> I've reproduced it without delayed logging, so it is not directly
> related to that functionality.
> 
> I've seen this warning:
> 
> Filesystem "ram0": inode 0x704680 background reclaim flush failed with 117
> 
> Which indicates we failed to mark an inode stale when freeing an
> inode cluster, but I think I've fixed that and the problem still
> shows up. It's possible the last version didn't fix it, but....

I've seen that one a couple of times too. Keeps coming back each
time you echo 3 > /proc/sys/vm/drop_caches :)


> Now I've got the ag iterator rotor patch in place as well and
> possibly a different version of the cluster free fix to what I
> previously tested and it's now been running for almost half an hour.
> I can't say yet whether I've fixed the bug or just changed the
> timing enough to avoid it. I'll leave this test running overnight
> and redo individual patch testing tomorrow.

I've reproduced it with fs_stress now too. If you have any patches
I could test for you, just let me know.
