xfs
[Top] [All Lists]

Re: [PATCH] xfs: v2 fix node forward in xfs_node_toosmall

To: Mark Tinguely <tinguely@xxxxxxx>
Subject: Re: [PATCH] xfs: v2 fix node forward in xfs_node_toosmall
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 24 Sep 2013 09:48:19 +1000
Cc: xfs@xxxxxxxxxxx, stable@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130923171911.273669684@xxxxxxx>
References: <20130920220519.585903357@xxxxxxx> <20130923171911.273669684@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Sep 23, 2013 at 12:18:58PM -0500, Mark Tinguely wrote:
> Commit f5ea1100 cleans up the disk to host conversions for
> node directory entries, but because a variable is reused in
> xfs_node_toosmall() the next node is not correctly found.
> If the original node is small enough (<= 3/8 of the node size),
> this change may incorrectly cause a node collapse when it should
> not.

The comment about the size of the node triggering a collapse is
irrelevant - nodes always collapse at that given size. What this
doesn't tell us is why the crash occurs - "the next node is not
correctly found" is not particularly obvious, and would require
quite a bit of code reading to work out from first principles a
couple of years down the track.

The commit message should be more precise and describe what the
underlying cause of the failure was. i.e. that the node is finding itself as 
the merge
candidate because we go forward, overwrite the pointers and the new
block's backward sibling is the original block which is where we end
up on teh second loop.  And vice versa if we go backwards first...

Also, the "next node" is correctly termed a "sibling", and it's
either the forwards or backwards sibling, not the "next" sibling as
the direction of movement is important. So perhaps this
is better written as:

"When a node is considered for a merge with a sibling, it overwrites
the sibling pointers of the original node with the sibling's
pointers. This leads to loop considering the original node as a
merge candidate with itself in the second pass, and so it
incorrectly determines a merge should occur."

> That will cause an assert in xfstest generic/319:
> 
>    Assertion failed: first <= last && last < BBTOB(bp->b_length),
>    file: /root/newest/xfs/fs/xfs/xfs_trans_buf.c, line: 569
> 
> Keep the original node header to get the correct forward node.
> 
> Signed-off-by: Mark Tinguely <tinguely@xxxxxxx>
> ---
>  v2 -> Dave's local variable approach.
>     -> send to -stable this bug is in 3.10 and 3.11

The patch title not include "v2". That goes in the "[PATCH v2]"
prefix that gets stripped before commit so that commit messages
don't have patch version numbers in them....

Otherwise the code looks good.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>