
Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim

To: Bas Couwenberg <bas@xxxxxxxxxxxxxxxx>
Subject: Re: 2.6.31 xfs_fs_destroy_inode: cannot reclaim
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Mon, 5 Oct 2009 17:43:48 -0400
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>, Patrick Schreurs <patrick@xxxxxxxxxxxxxxxx>, Tommy van Leeuwen <tommy@xxxxxxxxxxxxxxxx>, XFS List <xfs@xxxxxxxxxxx>
In-reply-to: <4AC60D27.9060703@xxxxxxxxxxxxxxxx>
References: <20090930124104.GA7463@xxxxxxxxxxxxx> <4AC60D27.9060703@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.19 (2009-01-05)
On Fri, Oct 02, 2009 at 04:24:39PM +0200, Bas Couwenberg wrote:
> Dear Christoph,
> Yesterday two of our servers ( + your patch) crashed again, this  
> time we have a bigger console, but not the full backtrace unfortunately.
> I did manage to get some more calltrace info from the logs, which I have  
> attached together with the screenshots of the crashscreens.
> I hope this info helps you.

It helps a bit, but not so much.  I suspect it could be a double free
of an inode, and I have identified a possible race window that could
explain it.  But all the traces are really weird and I think only show
later symptoms of something that happened earlier.  I'll come up with
a patch for the race window ASAP, but could you in the meantime turn on
CONFIG_XFS_DEBUG for the test kernel to see if it triggers somewhere
and additionally apply the tiny patch below for additional debugging?

Subject: xfs: check for not fully initialized inodes in xfs_ireclaim
From: Christoph Hellwig <hch@xxxxxx>

Add an assert for inodes not added to the inode cache in xfs_ireclaim, to make
sure we're not going to introduce something like the famous nfsd inode cache
bug again.

Signed-off-by: Christoph Hellwig <hch@xxxxxx>

Index: linux-2.6/fs/xfs/xfs_iget.c
--- linux-2.6.orig/fs/xfs/xfs_iget.c    2009-08-10 11:30:55.729724742 -0300
+++ linux-2.6/fs/xfs/xfs_iget.c 2009-08-10 11:40:15.271748324 -0300
@@ -535,17 +535,21 @@ xfs_ireclaim(
        struct xfs_mount        *mp = ip->i_mount;
        struct xfs_perag        *pag;
+       xfs_agino_t             agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
        /*
-        * Remove the inode from the per-AG radix tree.  It doesn't matter
-        * if it was never added to it because radix_tree_delete can deal
-        * with that case just fine.
+        * Remove the inode from the per-AG radix tree.
+        *
+        * Because radix_tree_delete won't complain even if the item was never
+        * added to the tree assert that it's been there before to catch
+        * problems with the inode life time early on.
         */
        pag = xfs_get_perag(mp, ip->i_ino);
-       radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino));
+       ASSERT(radix_tree_lookup(&pag->pag_ici_root, agino));
+       radix_tree_delete(&pag->pag_ici_root, agino);
        xfs_put_perag(mp, pag);
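
For anyone following the thread without a kernel tree at hand, the idea
behind the added check can be sketched in userspace.  This is only an
illustration under stated assumptions, not the XFS code: struct toy_index
and the toy_* functions are hypothetical stand-ins for the per-AG radix
tree and its accessors, and toy_reclaim() returns false at the point where
the ASSERT in the patch would fire on a debug kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 64	/* hypothetical capacity, for illustration only */

/* Hypothetical stand-in for the per-AG inode index, keyed by a small number. */
struct toy_index {
	void *slot[TOY_SLOTS];
};

/* Return the item stored under key, or NULL if nothing is stored there. */
static void *toy_lookup(const struct toy_index *idx, unsigned int key)
{
	return key < TOY_SLOTS ? idx->slot[key] : NULL;
}

/* Store item under key; fails if the key is out of range or already used. */
static bool toy_insert(struct toy_index *idx, unsigned int key, void *item)
{
	if (key >= TOY_SLOTS || item == NULL || idx->slot[key] != NULL)
		return false;
	idx->slot[key] = item;
	return true;
}

/*
 * Remove and return the item under key.  Deleting a key that was never
 * inserted is not an error and just returns NULL, which is the property
 * that lets a missing insert go unnoticed.
 */
static void *toy_delete(struct toy_index *idx, unsigned int key)
{
	void *item = toy_lookup(idx, key);

	if (item != NULL)
		idx->slot[key] = NULL;
	return item;
}

/*
 * Reclaim path with the extra check: look the item up before deleting it
 * and report whether it was indexed at all.  A false return marks the
 * case the patch wants to catch early.
 */
static bool toy_reclaim(struct toy_index *idx, unsigned int key)
{
	bool was_indexed = toy_lookup(idx, key) != NULL;

	toy_delete(idx, key);
	return was_indexed;
}
```

Reclaiming a key that was inserted reports true once; reclaiming it a
second time, or reclaiming a key that never made it into the index,
reports false, which is the double free / never-added case the debugging
patch is meant to flag.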
