
Re: [PATCH] xfs: fix bad hash ordering

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: fix bad hash ordering
From: Mark Tinguely <tinguely@xxxxxxx>
Date: Mon, 31 Mar 2014 21:22:19 -0500
Cc: XFS Mailing List <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140331214016.GD17603@dastard>
References: <20140328173430.622616177@xxxxxxx> <20140331001055.GD16336@dastard> <53399B06.5010400@xxxxxxx> <20140331214016.GD17603@dastard>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120122 Thunderbird/9.0
On 03/31/14 16:40, Dave Chinner wrote:
> On Mon, Mar 31, 2014 at 11:42:46AM -0500, Mark Tinguely wrote:
>> On 03/30/14 19:10, Dave Chinner wrote:
>>> On Fri, Mar 28, 2014 at 12:33:34PM -0500, Mark Tinguely wrote:
>>>> Fix the directory "bad hash ordering" bug introduced in
>>>> commit f5ea1100.
>>>>
>>>> ...
>>>>
>>>> ---
>>>> A C program that generates this problem can be found at:
>>>>   http://oss.sgi.com/archives/xfs/2014-03/msg00373.html
>>>>
>>>> An xfstest for this bug is coming from Hannes Frederic Sowa.
>>>
>>> Can you convert this program to an xfstest yourself so that I can
>>> commit the regression test at the same time I commit an updated
>>> fix?
>>
>> We narrowed the iterations down to make it a quick test.
>> I have every confidence that Hannes can generate the test in a timely
>> manner and I will help in any way possible.
>
> Well, it's been over a week now and you're asking me to trust
> someone I don't know, and who has never submitted an xfstest before,
> to do something in a timely manner so we can test a critical bug fix
> during a merge window. I'm willing to be pleasantly surprised, but
> history tells me that people that report bugs rarely follow up with
> xfstest cases and it's usually the developer that fixes the bug that
> generates the xfstests patch.
>
> So if the xfstests patch doesn't arrive in the next few hours, can
> you please do that for us so I can get this sorted out for the merge
> window?
>
> Cheers,
>
> Dave.

Dave,

I think we need to take a step back and clear up a little confusion here.
There are 2 different directory bugs.

1) Freeing of an already free extent. It presents with the error:
        XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 16XX of file
         fs/xfs/xfs_alloc.c.
   Could be a right or a left edge (or both) that is free.

   Morgan Meyers <Morgan.Mears@xxxxxxxxxx> sent the latest occurrence on
   March 12, but others have been seeing it in the community code in the
   last few months. SGI has been seeing it lately with big customers and
   it has occurred off and on for 7-8 years according to our bug
   database.

   It is a nasty bug that can cause corruption. As I mentioned last
   week in the analysis of Morgan's metadata dump, XFS can allocate the
   same buffer multiple times. In his metadata dump there are a directory
   block and inode clusters that are also allocated as user blocks. These
   duplicate allocated blocks are land mines waiting to go off either
   when written to by one owner or when both allocations are
   removed, which causes the XFS_WANT_CORRUPTED_GOTO forced shutdown.

2) Hannes Frederic Sowa found a different directory bug on Thursday,
   March 27. He included a replicator. I bisected the source of this
   bug on Thursday, then walked through the bisected patch on Friday and
   posted the fix. The idea to make an xfstest from the replicator was
   also raised on March 28.

   This bug has only been known for 3 business days. I already promised
   that an xfstest will be made. If you need to verify the problem and
   the patch, there already is a replicator.

--Mark.
