On Mon, Oct 17, 2011 at 04:42:55PM +0200, Richard Ems wrote:
> Hi all !
> We have an XFS that started giving errors some days ago. This is
> on an openSUSE 11.4 64 bit system. The XFS is 12 TB big, 9.8 TB
> are used. Hardware RAID 6 on an Areca 1680 controller.
> Mounting the XFS with ro,norecovery works almost always.
> But xfs_repair crashes with a Segmentation fault. I tried both v3.1.4 from
> openSUSE 11.4 and xfs_repair v3.1.6 downloaded from the git repo.
Ok, so not a new issue.
> Now after a reboot - the ___production___ system completely froze during
> the last xfs_repair v3.1.6 run !!! - the XFS got mounted rw, but just trying to
> touch a file generated the following error:
> Oct 17 16:33:02 c3m kernel: [ 794.628715] Filesystem "sdb1": XFS internal
> error xfs_btree_check_sblock at line 120 of file
> Caller 0xffffffffa0376cbe
> Oct 17 16:33:02 c3m kernel: [ 794.628718]
> Oct 17 16:33:02 c3m kernel: [ 794.628722] Pid: 9066, comm: touch Not tainted
> 220.127.116.11-0.7-default #1
> Oct 17 16:33:02 c3m kernel: [ 794.628724] Call Trace:
> Oct 17 16:33:02 c3m kernel: [ 794.628737] [<ffffffff81005819>]
> Oct 17 16:33:02 c3m kernel: [ 794.628744] [<ffffffff814ba5c3>]
> Oct 17 16:33:02 c3m kernel: [ 794.628776] [<ffffffffa0376666>]
> xfs_btree_check_sblock+0x86/0x120 [xfs]
> Oct 17 16:33:02 c3m kernel: [ 794.628864] [<ffffffffa0376cbe>]
> xfs_btree_read_buf_block.clone.0+0x9e/0xc0 [xfs]
> Oct 17 16:33:02 c3m kernel: [ 794.628947] [<ffffffffa0378a3e>]
> xfs_btree_increment+0x1ee/0x290 [xfs]
> Oct 17 16:33:02 c3m kernel: [ 794.629036] [<ffffffffa038e522>]
> xfs_dialloc+0x5e2/0x900 [xfs]
> Oct 17 16:33:02 c3m kernel: [ 794.629148] [<ffffffffa0390bd5>]
> xfs_ialloc+0x75/0x6d0 [xfs]
A corrupt inode allocation btree - not a particularly common type of
corruption to be reported. Do you know what caused the errors to
start being reported? A crash, a bad disk, a RAID rebuild, something
else? That information always helps us understand how badly damaged
the filesystem might be....
> The last lines before the " xfs_repair -n -P /dev/sdb1 " Segmentation fault were:
> would clear forw/back pointers in block 0 for attributes in inode 4319273
> bad attribute leaf magic # 0x250 for dir ino 4319273
> problem with attribute contents in inode 4319273
> would clear attr fork
> bad nblocks 2 for inode 4319273, would reset to 1
> bad anextents 1 for inode 4319273, would reset to 0
> -bash: line 5: 6488 Segmentation fault
> /opt/xfsprogs-3.1.6/sbin/xfs_repair -n -P /dev/sdb1
And I'd guess that is failing on a different problem - a corrupt
inode most likely. You've built xfs_repair from the source code -
can you run it under gdb so we can see where it is dying?
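
One way to capture the backtrace without an interactive session is sketched
below. It is only a rough example: the helper names are made up for
illustration and are not part of xfsprogs. It relies on gdb's --batch, -ex
and --args options to run the program and print a backtrace when it stops.
With -n, xfs_repair runs in no-modify mode, so this should not write to the
device. The backtrace will be much more useful if the binary was built with
debug symbols.

```python
import subprocess


def gdb_backtrace_command(program_argv):
    """Build a gdb command line that runs the program and, once it stops
    (for example on SIGSEGV), prints a backtrace and exits.

    Hypothetical helper for illustration only.
    """
    return ["gdb", "--batch", "-ex", "run", "-ex", "bt", "--args"] + list(program_argv)


def run_under_gdb(program_argv):
    """Run the program under gdb and return everything gdb printed."""
    result = subprocess.run(
        gdb_backtrace_command(program_argv),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


# The same xfs_repair invocation that segfaulted in the report above.
XFS_REPAIR_ARGV = ["/opt/xfsprogs-3.1.6/sbin/xfs_repair", "-n", "-P", "/dev/sdb1"]
```

Calling run_under_gdb(XFS_REPAIR_ARGV) as root and posting the tail of the
returned text (the part after the SIGSEGV message) should be enough to see
which function is dying.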
> The complete " xfs_repair -n -P /dev/sdb1 " output file is 1.2 MB
> gzipped. If anyone wants to have a look at it please ask and I
> will send it as a private mail.
That sounds like there's a *lot* of damage to the filesystem. That
makes it even more important that we understand what caused the
damage in the first place....
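
To get a rough feel for how widespread the damage is before sending the whole
log, something like the following could count how many distinct inodes the
xfs_repair -n output complains about. This is a sketch that assumes the
messages look like the lines quoted above ("... inode N" or "... ino N");
the function name and the regular expression are illustrative and will not
catch every message format xfs_repair can emit.

```python
import re
from collections import Counter

# Matches "inode 4319273" as well as "ino 4319273" (assumed message shapes).
INODE_RE = re.compile(r"\b(?:inode|ino)\s+(\d+)")


def summarise_repair_output(lines):
    """Count how many messages mention each inode number."""
    per_inode = Counter()
    for line in lines:
        for match in INODE_RE.finditer(line):
            per_inode[int(match.group(1))] += 1
    return per_inode


# Sample taken from the lines quoted earlier in this thread.
sample = [
    "bad attribute leaf magic # 0x250 for dir ino 4319273",
    "problem with attribute contents in inode 4319273",
    "would clear attr fork",
    "bad nblocks 2 for inode 4319273, would reset to 1",
    "bad anextents 1 for inode 4319273, would reset to 0",
]
summary = summarise_repair_output(sample)
print(len(summary), "distinct inodes,", sum(summary.values()), "messages")
```

Feeding it the full output file (one line per list element) gives the number
of affected inodes and which ones attract the most complaints.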