On Mon, Feb 07, 2011 at 02:55:36PM -0600, Bill Kendall wrote:
> On 02/04/2011 02:49 PM, Dave Chinner wrote:
> >On Fri, Feb 04, 2011 at 09:12:53AM -0500, Michael Lueck wrote:
> >>Dave Chinner wrote:
> >>>Ok, so xfsdump is seeing a short bulkstat, then an EINVAL returned
> >>>from the next bulkstat. That's not a race condition, and makes me
> >>>think you have some kind of on-disk corruption.
> >>Very odd that some kind of on-disk corruption is suddenly causing
> >>xfsdump problems starting with Ubuntu 10.04 (Lucid) kernel
> >>2.6.32-27 and persisting in 2.6.32-28.
> >Not really. The newer kernels have code in them that does more
> >validity checks than previous kernels, so older kernels would have
> >erroneously and silently returned unlinked files to xfsdump and have
> >them backed up. IOWs, you'd never notice such a corruption with
> >xfsdump. On the new kernel, xfsdump gets an EINVAL error for such
> >occurrences, which it should have had in the first place.
> >>And there is one other person who confirmed this xfsdump problem
> >>running Lucid with kernel 2.6.32-28. They reported their "me too"
> >>in the Ubuntu bug tracker.
> >>Could it be that 2.6.32-26 and prior managed to write something to
> >>disk corrupted, and the newer code is tripping on it?
> >That's what I'm trying to find out. Or it could be something as
> >simple as your disk has had an undetected bit error that has flipped
> >a bit in the inode allocation btree.
> Hi Dave,
> I am able to reproduce this on a system running Ubuntu 10.04
> (2.6.32-28). I took a metadump of the filesystem and moved it to
> a system running 10.10 (2.6.35-25), and was able to successfully
> dump it there. Likewise it dumps fine on 2.6.38-rc1. So this
> suggests an issue with the Ubuntu 10.04 kernel.
2.6.35 hasn't had the untrusted inode lookup patches backported to
it, so it's no surprise that it isn't having problems - it's just
like the older 2.6.32 kernels.
Hmmm, can you find out if there is any specific pattern to the inode
numbers that are returning EINVAL? Maybe the inode allocbt freespace
record checks aren't quite correct in the backport (like the
original bogus alignment assumption I made).
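
To make a pattern easier to spot, something like the following could be used
to split each failing inode number into its AG number, AG-relative inode
number, AG block and offset within the block. This is only a rough sketch:
the function name and the output fields are made up for illustration, and it
assumes the usual XFS inode number layout, with agblklog and inopblog taken
from the superblock (for example via "sb 0" and "print" in xfs_db). The
"agino_mod_64" value is only a hint; it means something only if inode chunks
happen to be 64-inode aligned, which is not guaranteed.

```python
INODES_PER_CHUNK = 64  # inodes covered by one inode allocation btree record


def decode_ino(ino, agblklog, inopblog):
    """Split an XFS inode number into its components.

    agblklog and inopblog are the superblock values (log2 of the AG size
    in blocks, rounded up, and log2 of inodes per block).
    """
    agino_bits = agblklog + inopblog
    agno = ino >> agino_bits                 # allocation group number
    agino = ino & ((1 << agino_bits) - 1)    # AG-relative inode number
    agbno = agino >> inopblog                # block within the AG
    offset = agino & ((1 << inopblog) - 1)   # inode slot within that block
    return {
        "agno": agno,
        "agino": agino,
        "agbno": agbno,
        "offset": offset,
        # Hypothetical hint only: position relative to a 64-inode boundary.
        "agino_mod_64": agino % INODES_PER_CHUNK,
    }


def describe(inodes, agblklog, inopblog):
    """Return one formatted line per inode number, for eyeballing patterns."""
    lines = []
    for ino in inodes:
        d = decode_ino(ino, agblklog, inopblog)
        lines.append(
            "ino %d: agno %d agino %d agbno %d offset %d mod64 %d"
            % (ino, d["agno"], d["agino"], d["agbno"], d["offset"],
               d["agino_mod_64"])
        )
    return lines
```

Feeding the list of inode numbers that return EINVAL through describe() and
sorting by agno or agbno should show fairly quickly whether they cluster in
one AG, at chunk boundaries, or at some particular offset.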