To: Bill Kendall <wkendall@xxxxxxx>
Subject: Re: xfsdump SGI_FS_BULKSTAT errno = 22, how could this IRIX bug get into Ubuntu 10.04 Lucid between kernels 2.6.32-27 and 2.6.32-26?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 8 Feb 2011 09:04:21 +1100
Cc: Michael Lueck <mlueck@xxxxxxxxxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx, Dann Frazier <dannf@xxxxxxxxxx>
In-reply-to: <4D506744.9010303@xxxxxxx>
References: <iibmah$dlp$1@xxxxxxxxxxxxxxx> <4D49A35B.6030009@xxxxxxx> <20110203045836.GV11040@dastard> <4D4ABEF7.7000400@xxxxxxxxxxxxxxxxxxxx> <20110204000823.GW11040@dastard> <4D4C0965.9010905@xxxxxxxxxxxxxxxxxxxx> <20110204204927.GZ11040@dastard> <4D505C48.8050203@xxxxxxx> <20110207212320.GC2559@dastard> <4D506744.9010303@xxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Mon, Feb 07, 2011 at 03:42:28PM -0600, Bill Kendall wrote:
> On 02/07/2011 03:23 PM, Dave Chinner wrote:
> >On Mon, Feb 07, 2011 at 02:55:36PM -0600, Bill Kendall wrote:
> >>On 02/04/2011 02:49 PM, Dave Chinner wrote:
> >>>On Fri, Feb 04, 2011 at 09:12:53AM -0500, Michael Lueck wrote:
> >>>>Dave Chinner wrote:
> >>>>>Ok, so xfsdump is seeing a short bulkstat, then an EINVAL returned
> >>>>>from the next bulkstat. That's not a race condition, and makes me
> >>>>>think you have some kind of on-disk corruption.
> >>>>
> >>>>Very odd that some kind of on-disk corruption is suddenly causing
> >>>>xfsdump problems starting with Ubuntu 10.04 (Lucid) kernel
> >>>>2.6.32-27 and persisting in 2.6.32-28.
> >>>
> >>>Not really. The newer kernels have code in them that does more
> >>>validity checks than previous kernels, so older kernels would have
> >>>erroneously and silently returned unlinked files to xfsdump and have
> >>>them backed up. IOWs, you'd never notice such a corruption with
> >>>xfsdump. On the new kernel, xfsdump gets an EINVAL error to such
> >>>occurrences, which it should have in the first place.
> >>>
> >>>>And there is one other person who confirmed this xfsdump problem
> >>>>running Lucid with kernel 2.6.32-28. They reported their "me too"
> >>>>in the Ubuntu bug tracker.
> >>>>
> >>>>Could it be that 2.6.32-26 and prior managed to write something to
> >>>>disk corrupted, and the newer code is tripping on it?
> >>>
> >>>That's what I'm trying to find out. Or it could be something as
> >>>simple as your disk has had an undetected bit error that has flipped
> >>>a bit in the inode allocation btree.
> >>>
> >>
> >>Hi Dave,
> >>
> >>I am able to reproduce this on a system running Ubuntu 10.04
> >>(2.6.32-28). I took a metadump of the filesystem and moved it to
> >>a system running 10.10 (2.6.35-25), and was able to successfully
> >>dump it there. Likewise it dumps fine on 2.6.38-rc1. So this
> >>suggests an issue with the Ubuntu 10.04 kernel.
> >
> >2.6.35 hasn't had the untrusted inode lookup patches backported to
> >it, so it's no surprise that it isn't having problems - it's just
> >like the older 2.6.32 kernels.
> 
> I thought it landed in 2.6.35 and then a regression was fixed in
> 2.6.36. The untrusted inode lookup changes are referenced here:
> http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.35

My bad, I just checked the regression fix. I have no idea if it got
backported to 2.6.35-stable or not - it probably didn't, judging by
your results.

> >Hmmm, can you find out if there is any specific pattern to the inode
> >numbers that are returning EINVAL? Maybe the inode allocbt freespace
> >record checks aren't quite correct in the backport (like the
> >original bogus alignment assumption I made).
> 
> I'll take a look.

Thanks.
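
When looking for a pattern, it may help to split each failing inode
number into its AG number, block within the AG, and slot within the
block. A rough sketch follows; it is not taken from xfsprogs, the
function name split_ino is made up for illustration, and the two shift
values are assumed to be the agblklog and inopblog fields of the
filesystem's superblock (the numbers used below are hypothetical):

```python
def split_ino(ino, agblklog, inopblog):
    """Split an XFS inode number into (AG number, AG block, slot in block).

    Assumes the usual layout: the low inopblog bits are the inode's slot
    within its block, the next agblklog bits are the block number within
    the AG, and the remaining high bits are the AG number.
    """
    offset = ino & ((1 << inopblog) - 1)
    agbno = (ino >> inopblog) & ((1 << agblklog) - 1)
    agno = ino >> (agblklog + inopblog)
    return agno, agbno, offset


if __name__ == "__main__":
    # Hypothetical geometry and inode numbers, for illustration only.
    AGBLKLOG, INOPBLOG = 16, 4
    for ino in (131, 2097235):
        print(ino, split_ino(ino, AGBLKLOG, INOPBLOG))
```

If the failing inodes cluster in one AG, or always sit at the same
slot or block alignment, that would point at the record checks rather
than at random corruption.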

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
