Re: xfs and raid5 - "Structure needs cleaning for directory open"

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: xfs and raid5 - "Structure needs cleaning for directory open"
From: Doug Ledford <dledford@xxxxxxxxxx>
Date: Mon, 17 May 2010 17:28:30 -0400
Cc: Rainer Fuegenstein <rfu@xxxxxxxxxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, linux-raid@xxxxxxxxxxxxxxx
In-reply-to: <20100510022033.GB7165@dastard>
Openpgp: id=CFBFF194
Organization: Red Hat, Inc.
References: <20100510022033.GB7165@dastard>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100413 Fedora/3.0.4-2.fc13 Lightning/1.0b2pre Thunderbird/3.0.4

On 05/09/2010 10:20 PM, Dave Chinner wrote:
> On Sun, May 09, 2010 at 08:48:00PM +0200, Rainer Fuegenstein wrote:
>>
>> today in the morning some daemon processes terminated because of
>> errors in the xfs file system on top of a software raid5, consisting
>> of 4*1.5TB WD caviar green SATA disks.
> 
> Reminds me of a recent(-ish) md/dm readahead cancellation fix - that
> would fit the symptoms (btree corruption showing up under heavy IO
> load but no corruption on disk). However, I can't seem to find any
> references to it at the moment (can't remember the bug title), but
> perhaps your distro doesn't have the fix in it?
> 
> Cheers,
> 
> Dave.

That sounds plausible, as does a hardware error.  A memory bit flip
under heavy load would leave the in-memory data corrupt while the
on-disk data stays good.  Because the check was only run later, the bad
in-memory copy had most likely been flushed from the cache by then, and
when the data was reloaded from disk it came in OK.
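
To illustrate the bit-flip scenario, here is a rough toy model, not
anything taken from XFS or md.  The dictionary standing in for the page
cache, the block contents, and the use of CRC32 as the consistency check
are all assumptions made purely for illustration; the real filesystem
detects the problem through its own metadata sanity checks.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def is_clean(block: bytes, expected_crc: int) -> bool:
    """Stand-in consistency check: compare CRC32 against the known-good value."""
    return zlib.crc32(block) == expected_crc


# Hypothetical on-disk block and its known-good checksum.
disk_block = b"btree node: keys and pointers"
expected_crc = zlib.crc32(disk_block)

# Hypothetical cache: the block is read into memory, then one bit flips in RAM.
cache = {"blk0": disk_block}
cache["blk0"] = flip_bit(cache["blk0"], 13)

corrupt_in_memory = not is_clean(cache["blk0"], expected_crc)
clean_on_disk = is_clean(disk_block, expected_crc)

# Later the bad copy is dropped from the cache and the block is read again.
del cache["blk0"]
cache["blk0"] = disk_block
clean_after_reload = is_clean(cache["blk0"], expected_crc)

print(corrupt_in_memory, clean_on_disk, clean_after_reload)
```

The model only covers a clean (unmodified) cached block.  If the damaged
copy had been dirty and written back, the corruption would have reached
the disk as well.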

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband