[Top] [All Lists]

Re: [PATCH] xfs: v2 fix node forward in xfs_node_toosmall

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: v2 fix node forward in xfs_node_toosmall
From: Mark Tinguely <tinguely@xxxxxxx>
Date: Tue, 24 Sep 2013 12:35:44 -0500
Cc: stable@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130923234819.GW9901@dastard>
References: <20130920220519.585903357@xxxxxxx> <20130923171911.273669684@xxxxxxx> <20130923234819.GW9901@dastard>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120122 Thunderbird/9.0
On 09/23/13 18:48, Dave Chinner wrote:
On Mon, Sep 23, 2013 at 12:18:58PM -0500, Mark Tinguely wrote:
Commit f5ea1100 cleans up the disk to host conversions for
node directory entries, but because a variable is reused in
xfs_node_toosmall() the next node is not correctly found.
If the original node is small enough (<= 3/8 of the node size),
this change may incorrectly cause a node collapse when it should

The comment about the size of the node triggering a collapse is
irrelevant - nodes always collapse at that given size. What this
doesn't tell us is why the crash occurs - "the next node is not
correctly found" is not particularly obvious, and would require
quite a bit of code reading to work out from first principles a
couple of years down the track.

The commit message should be more precise and describe what the
underlying cause of the failure was. i.e. that the node is finding itself as 
the merge
candidate because we go forward, overwrite the pointers and the new
block's backward sibling is the original block which is where we end
up on teh second loop.  And vice versa if we go backwards first...

Also, the "next node" is correctly termed a "sibling", and it's
either the forwards or backwards sibling, not the "next" sibling as
the direction of movement is important. So perhaps this
is better written as:

"When a node is considered for a merge with a sibling, it overwrites
the sibling pointers of the original node with the sibling's
pointers. This leads to loop considering the original node as a
merge candidate with itself in the second pass, and so it
incorrectly determines a merge should occur."

Are you done ranting? Get the @#$% bug patched.


<Prev in Thread] Current Thread [Next in Thread>