xfsdump SGI_FS_BULKSTAT errno = 22, how could this IRIX bug get into Ubuntu 10.04 Lucid between kernels 2.6.32-27 and 2.6.32-26?
Bill Kendall
wkendall at sgi.com
Mon Feb 7 15:42:28 CST 2011
On 02/07/2011 03:23 PM, Dave Chinner wrote:
> On Mon, Feb 07, 2011 at 02:55:36PM -0600, Bill Kendall wrote:
>> On 02/04/2011 02:49 PM, Dave Chinner wrote:
>>> On Fri, Feb 04, 2011 at 09:12:53AM -0500, Michael Lueck wrote:
>>>> Dave Chinner wrote:
>>>>> Ok, so xfsdump is seeing a short bulkstat, then an EINVAL returned
>>>>> from the next bulkstat. That's not a race condition, and makes me
>>>>> think you have some kind of on-disk corruption.
>>>>
>>>> Very odd that some kind of on-disk corruption is suddenly causing
>>>> xfsdump problems starting with Ubuntu 10.04 (Lucid) kernel
>>>> 2.6.32-27 and persisting in 2.6.32-28.
>>>
>>> Not really. The newer kernels have code in them that does more
>>> validity checks than previous kernels, so older kernels would have
>>> erroneously and silently returned unlinked files to xfsdump and have
>>> them backed up. IOWs, you'd never notice such a corruption with
>>> xfsdump. On the new kernel, xfsdump gets an EINVAL error for such
>>> occurrences, which it should have in the first place.
>>>
>>>> And there is one other person who confirmed this xfsdump problem
>>>> running Lucid with kernel 2.6.32-28. They reported their "me too"
>>>> in the Ubuntu bug tracker.
>>>>
>>>> Could it be that 2.6.32-26 and prior managed to write something
>>>> corrupted to disk, and the newer code is tripping on it?
>>>
>>> That's what I'm trying to find out. Or it could be something as
>>> simple as your disk has had an undetected bit error that has flipped
>>> a bit in the inode allocation btree.
>>>
>>
>> Hi Dave,
>>
>> I am able to reproduce this on a system running Ubuntu 10.04
>> (2.6.32-28). I took a metadump of the filesystem and moved it to
>> a system running 10.10 (2.6.35-25), and was able to successfully
>> dump it there. Likewise it dumps fine on 2.6.38-rc1. So this
>> suggests an issue with the Ubuntu 10.04 kernel.
>
> 2.6.35 hasn't had the untrusted inode lookup patches backported to
> it, so it's no surprise that it isn't having problems - it's just
> like the older 2.6.32 kernels.

I thought it landed in 2.6.35 and then a regression was fixed in
2.6.36. The untrusted inode lookup changes are referenced here:
http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.35

>
> Hmmm, can you find out if there is any specific pattern to the inode
> numbers that are returning EINVAL? Maybe the inode allocbt freespace
> record checks aren't quite correct in the backport (like the
> original bogus alignment assumption I made).

I'll take a look.

Bill