rsync and corrupt inodes (was xfs_dump problem)
Dave Chinner
david at fromorbit.com
Thu Jul 15 17:57:13 CDT 2010
On Thu, Jul 15, 2010 at 10:58:15PM +0200, Michael Monnerie wrote:
> Ping?
>
> On Monday, 5 July 2010, Dave Chinner wrote:
> > > So far, so good. I'm on 2.6.34 now. Is there any chance for a fixed
> > > version of xfs_repair, so that I can either get rid of the 4 broken
> > > files (i.e. delete them), or repair the filesystem? ATM, xfs_repair
> > > asserts on this filesystem.
> >
> > What version of xfs_repair? v3.1.2 does not assert fail here on the
> > metadump image you posted, but it does take 3 runs to fix up all the
> > problems with the busted inodes....
>
> Do you mean this one?
> http://zmi.at/saturn_bigdata.metadump.only_broken.bz2 (197 MB)
Yes, that's the one.
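FWIW, the way I run repair against it here is roughly this (a sketch
from memory - broken.img is just a scratch file name, adjust paths to
taste):

  # restore the metadump into a sparse image file
  bunzip2 saturn_bigdata.metadump.only_broken.bz2
  xfs_mdrestore saturn_bigdata.metadump.only_broken broken.img

  # repair the image in place; -f tells repair the target is a
  # regular file rather than a block device
  for i in 1 2 3; do
          xfs_repair -f broken.img
  done

As I said, it takes three passes here before a run comes up clean.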
> I have xfs_repair 3.1.2, and wrote a shell script that runs
> xfs_repair on that image 10 times; I've attached the output here.
> It doesn't seem to repair anything, it just crashes.
>
> Maybe I did something wrong? I configured xfsprogs 3.1.2 with
> CFLAGS=-march=athlon64-sse3 ./configure --prefix=/usr
> and then
> make; make install
Drop the CFLAGS and see what happens when you just use a generic
arch target.
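i.e. rebuild from a clean tree with no -march override, something
like:

  make clean
  ./configure --prefix=/usr
  make
  make install

That takes the arch-specific code generation out of the picture.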
> I recompiled the whole thing now with
> # gcc --version
> gcc (SUSE Linux) 4.4.1 [gcc-4_4-branch revision 150839]
For comparison, mine is:
$ gcc --version
gcc (Debian 4.4.4-6) 4.4.4
> and it's the same output as ever. Either you meant another metadump, or
> there is a problem somewhere I don't see.
It's the same metadump. The repair output is identical up to phase
4. There I note that your first run processes the AGs out of order -
i.e. the AG scan is running multi-threaded - and so detects problems
in a different order from what I see when I run it locally.
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 2
> - agno = 1
> - agno = 3
> - agno = 4
> - agno = 5
> - agno = 6
> - agno = 7
> data fork in inode 2195133988 claims metadata block 537122652
> xfs_repair: dinode.c:2101: process_inode_data_fork: Assertion `err == 0' failed.
This is where your run goes out of order. For comparison, my local
run gives:
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
data fork in inode 649642 claims metadata block 537266460
correcting nblocks for inode 649642, was 8928025 - counted 8388604
data fork in inode 649790 claims metadata block 537274140
bad attribute format 1 in inode 649790, resetting value
correcting nblocks for inode 649790, was 8928025 - counted 8388604
- agno = 1
data fork in inode 2195133988 claims metadata block 537122652
bad attribute format 1 in inode 2195133988, resetting value
correcting nblocks for inode 2195133988, was 8928025 - counted 8388604
data fork in inode 2902971474 claims metadata block 537036572
correcting nblocks for inode 2902971474, was 8928025 - counted 8388604
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
Phase 5 - rebuild AG headers and trees...
- reset superblock...
....
This is not consistent, though: it seems the bad attribute format on
inode 2195133988 is not being detected and corrected properly in your
run, so repair trips the assert instead of resetting it. I don't see
why AG processing order alone would affect this, which makes me
suspect the prefetching code.
Regardless, can you run xfs_repair -P and see if that prevents the assert
failure?
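Something like this, if you're running against the restored image
(same scratch file name as above):

  # -P disables inode and directory block prefetching
  xfs_repair -P -f broken.img

If the assert goes away with prefetching disabled, that narrows down
where to look.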
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com