Re: xfs filesystem corruption with kernel 2.6.37

To: Kamal Dasu <kdasu.kdev@xxxxxxxxx>
Subject: Re: xfs filesystem corruption with kernel 2.6.37
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Sat, 3 Nov 2012 09:55:09 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <34633803.post@xxxxxxxxxxxxxxx>
References: <CAC=U0a2T_J9Y6WzvWyFfbBSDy__Pr7f4gfQBie2o0VhAm2jCaQ@xxxxxxxxxxxxxx> <20121025224713.GF29378@dastard> <34630253.post@xxxxxxxxxxxxxxx> <20121102012728.GT29378@dastard> <34633803.post@xxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Nov 02, 2012 at 09:34:32AM -0700, Kamal Dasu wrote:
> 
> Dave,
> 
> I see the following on two different systems, with the same want= value.
> It does not seem like random corruption.
> 
> >attempt to access beyond end of device
> > sda2: rw=0, want=33792081130943048, limit=3147132

0x780db80007f248

Once again it's corruption in the upper 32 bits of the 64-bit number.
Those bits should be zero. Perhaps looking at all the trace events
from recovery might give you a closer approximation of where the bad
extent record is found...
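
A minimal sketch of that decomposition, assuming the want= value is a
64-bit sector number. The helper names are hypothetical and are not
part of XFS or the block layer:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers, not XFS code: split a 64-bit sector number. */
uint32_t upper32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

uint32_t lower32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

/* Print both halves and compare the low half against the device limit. */
void print_halves(uint64_t want, uint64_t limit)
{
	printf("want  = 0x%016" PRIx64 "\n", want);
	printf("upper = 0x%08" PRIx32 "\n", upper32(want));
	printf("lower = 0x%08" PRIx32 " (%" PRIu32 ")\n",
	       lower32(want), lower32(want));
	printf("lower half %s limit %" PRIu64 "\n",
	       lower32(want) <= limit ? "is within" : "exceeds", limit);
}
```

For want=33792081130943048 the upper half is 0x00780db8 and the lower
half is 0x0007f248 (520776), which is below limit=3147132. Only the
upper half looks wrong.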

> > > Track down where those stray upper bits in the block numbers are
> > > coming from, and you'll have your answer. 
> 
> Where would be the best place to put this check?

I don't know - you're just going to have to put checks
everywhere - from when the extent record is first read from disk, to
where it is modified, to where it is packed back into disk format.

> Also, on an XFS DEBUG build, all the asserts seem to be in the unlink (delete) path.

No surprise, the failures are found when removing open-but-unlinked
files after a crash, i.e. during unlink in the final stage of
recovery.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
