xfs
[Top] [All Lists]

Re: [PATCH] stable: restart busy extent search after node removal

To: Eric Sandeen <sandeen@xxxxxxxxxx>
Subject: Re: [PATCH] stable: restart busy extent search after node removal
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 13 Jul 2011 10:20:22 +1000
Cc: xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <4E1CE35B.4010404@xxxxxxxxxx>
References: <4E1CC4BA.1010107@xxxxxxxxxx> <20110713001234.GN23038@dastard> <4E1CE35B.4010404@xxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Jul 12, 2011 at 07:14:19PM -0500, Eric Sandeen wrote:
> On 7/12/11 7:12 PM, Dave Chinner wrote:
> > On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote:
> >> Sending this for review prior to stable submission...
> >>
> >> A user on #xfs reported that a log replay was oopsing in
> >> __rb_rotate_left() with a null pointer deref.
> >>
> >> I traced this down to the fact that in xfs_alloc_busy_insert(),
> >> we erased a node with rb_erase() when the new node overlapped,
> >> but left it specified as the parent node for the new insertion.
> >>
> >> So when we try to insert a new node with an erased node as
> >> its parent, obviously things go very wrong.
> >>
> >> Upstream,
> >> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking
> >> actually fixed this, but as part of a much larger change.  Here's
> >> the relevant bit:
> >>
> >>                 * We also need to restart the busy extent search from the
> >>                 * tree root, because erasing the node can rearrange the
> >>                 * tree topology.
> >>                 */
> >>                rb_erase(&busyp->rb_node, &pag->pagb_tree);
> >>                busyp->length = 0;
> >>                return false;
> >>
> >> We can do essentially the same thing to older codebases by restarting
> >> the search after the erase.
> >>
> >> This should apply to .35 through .39, and was tested on .39
> >> with the oopsing replay reproducer.
> >>
> >> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> >> ---
> >>
> >> Index: linux-2.6/fs/xfs/xfs_alloc.c
> >> ===================================================================
> >> --- linux-2.6.orig/fs/xfs/xfs_alloc.c
> >> +++ linux-2.6/fs/xfs/xfs_alloc.c
> >> @@ -2664,6 +2664,12 @@ restart:
> >>                                    new->bno + new->length) -
> >>                            min(busyp->bno, new->bno);
> >>            new->bno = min(busyp->bno, new->bno);
> >> +          /*
> >> +           * Start the search over from the tree root, because
> >> +           * erasing the node can rearrange the tree topology.
> >> +           */
> >> +          spin_unlock(&pag->pagb_lock);
> >> +          goto restart;
> >>    } else
> >>            busyp = NULL;
> > 
> > Looks good.
> > 
> > I'm guessing that the only case I was able to hit during testing of
> > this code originally was the "overlap with exact start block match",
> > otherwise I would have seen this. I'm not sure that there really is
> > much we can do to improve the test coverage of this code, though.
> > Hell, just measuring our test coverage so we know what we aren't
> > testing would probably be a good start. :/
> 
> Apparently the original oops, and the subsequent replay oopses,
> were on a filesystem VERY busy with torrents.
> 
> Might be a testcase ;)

That just means large files. And fragmentation levels are
effectively dependent on whether the torrent client uses
preallocation or not. Just creating a set of large fragmented file
using preallocation, shutting the filesystem down in the middle
of it and then doing log replay might do the trick...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>