XFS hang in xlog_grant_log_space
Dave Chinner
david at fromorbit.com
Tue Jun 8 10:23:35 CDT 2010
On Fri, Jul 30, 2010 at 12:05:46AM +1000, Nick Piggin wrote:
> On Wed, Jul 28, 2010 at 11:17:44PM +1000, Dave Chinner wrote:
> > Something very strange is happening, and to make matters worse I
> > cannot reproduce it with a debug kernel (ran for 3 hours without
> > failing). Hence it smells like a race condition somewhere.
> >
> > I've reproduced it without delayed logging, so it is not directly
> > related to that functionality.
> >
> > I've seen this warning:
> >
> > Filesystem "ram0": inode 0x704680 background reclaim flush failed with 117
> >
> > Which indicates we failed to mark an inode stale when freeing an
> > inode cluster, but I think I've fixed that and the problem still
> > shows up. It's possible the last version didn't fix it, but....
>
> I've seen that one a couple of times too. Keeps coming back each
> time you echo 3 > /proc/sys/vm/drop_caches :)

Yup - it's an unflushable inode that is pinning the tail of the log,
hence causing the log space hangs.

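
As a rough illustration of why a pinned tail leads to log space hangs,
here is a toy model in Python. It is not XFS code: ToyLog, reserve and
flush are hypothetical names, and the real log grant logic is
considerably more involved. The sketch only assumes a circular log
whose free space is bounded by the oldest item that has not been
written back.

```python
# Toy model only, not XFS code. All names here are hypothetical.
class ToyLog:
    """Circular log whose free space is bounded by the oldest unflushed item."""

    def __init__(self, size):
        self.size = size
        self.head = 0      # next write position (monotonic counter)
        self.items = []    # (position, length, flushable), oldest first

    @property
    def tail(self):
        # The tail sits at the oldest item still held in the log.
        return self.items[0][0] if self.items else self.head

    def free_space(self):
        return self.size - (self.head - self.tail)

    def flush(self):
        """Write back items from the tail; stop at the first unflushable one."""
        while self.items and self.items[0][2]:
            self.items.pop(0)

    def reserve(self, length, flushable=True):
        """Return True if space was granted, False if the caller would block."""
        if self.free_space() < length:
            self.flush()
        if self.free_space() < length:
            return False
        self.items.append((self.head, length, flushable))
        self.head += length
        return True
```

In this model, once the oldest item cannot be flushed the tail never
moves, so every reservation fails as soon as the head has wrapped
around to it, no matter how many newer items could be written back.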
> > Now I've got the ag iterator rotor patch in place as well and
> > possibly a different version of the cluster free fix to what I
> > previously tested and it's now been running for almost half an hour.
> > I can't say yet whether I've fixed the bug or just changed the
> > timing enough to avoid it. I'll leave this test running overnight
> > and redo individual patch testing tomorrow.
>
> I reproduced it with fs_stress now too. Any patches I could test
> for you just let me know.

You should see them in a few minutes ;)

Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com