To: Richard Ems <richard.ems@xxxxxxxxxxxxxxxxx>
Subject: Re: xfs_repair v3.1.6 - Segmentation fault AND XFS internal error xfs_btree_check_sblock
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 18 Oct 2011 09:52:23 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4E9C3EEF.5080609@xxxxxxxxxxxxxxxxx>
References: <4E9C3EEF.5080609@xxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Oct 17, 2011 at 04:42:55PM +0200, Richard Ems wrote:
> Hi all !
> 
> We have an XFS that started giving errors some days ago.  This is
> on an openSUSE 11.4 64 bit system.  The XFS is 12 TB big, 9.8 TB
> are used. Hardware RAID 6 on an Areca 1680 controller.
> 
> Mounting the XFS with ro,norecovery works almost always.
> 
> But xfs_repair crashes with a Segmentation fault. I tried both v3.1.4 from 
> openSUSE 11.4 and xfs_repair v3.1.6 downloaded from the git repo.

Ok, so not a new issue.

> Now after a reboot - the ___production___ system froze completely while
> running the last xfs_repair v3.1.6 !!! - the XFS got mounted rw, but just
> trying to touch a file generated the following error:
> 
> Oct 17 16:33:02 c3m kernel: [  794.628715] Filesystem "sdb1": XFS internal 
> error xfs_btree_check_sblock at line 120 of file 
> /usr/src/packages/BUILD/kernel-default-2.6.37.6/linux-2.6.37/fs/xfs/xfs_btree.c.
>   Caller 0xffffffffa0376cbe
> Oct 17 16:33:02 c3m kernel: [  794.628718] 
> Oct 17 16:33:02 c3m kernel: [  794.628722] Pid: 9066, comm: touch Not tainted 
> 2.6.37.6-0.7-default #1
> Oct 17 16:33:02 c3m kernel: [  794.628724] Call Trace:
> Oct 17 16:33:02 c3m kernel: [  794.628737]  [<ffffffff81005819>] 
> dump_trace+0x69/0x2e0
> Oct 17 16:33:02 c3m kernel: [  794.628744]  [<ffffffff814ba5c3>] 
> dump_stack+0x69/0x6f
> Oct 17 16:33:02 c3m kernel: [  794.628776]  [<ffffffffa0376666>] 
> xfs_btree_check_sblock+0x86/0x120 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.628864]  [<ffffffffa0376cbe>] 
> xfs_btree_read_buf_block.clone.0+0x9e/0xc0 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.628947]  [<ffffffffa0378a3e>] 
> xfs_btree_increment+0x1ee/0x290 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.629036]  [<ffffffffa038e522>] 
> xfs_dialloc+0x5e2/0x900 [xfs]
> Oct 17 16:33:02 c3m kernel: [  794.629148]  [<ffffffffa0390bd5>] 
> xfs_ialloc+0x75/0x6d0 [xfs]

A corrupt inode allocation btree - not a particularly common type of
corruption to be reported. Do you know what caused the errors to
start being reported? A crash, a bad disk, a raid rebuild, something
else?  That information always helps us understand how badly damaged
the filesystem might be....

> The last lines before the " xfs_repair -n -P /dev/sdb1 " Segmentation fault
> were:
> 
> would clear forw/back pointers in block 0 for attributes in inode 4319273
> bad attribute leaf magic # 0x250 for dir ino 4319273
> problem with attribute contents in inode 4319273
> would clear attr fork
> bad nblocks 2 for inode 4319273, would reset to 1
> bad anextents 1 for inode 4319273, would reset to 0
> -bash: line 5:  6488 Segmentation fault      
> /opt/xfsprogs-3.1.6/sbin/xfs_repair -n -P /dev/sdb1

And I'd guess that is failing on a different problem - a corrupt
inode most likely. You've built xfs_repair from the source code -
can you run it under gdb so we can see where it is dying?
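
A minimal sketch of one way to capture that, assuming gdb is installed
and the xfs_repair binary was built with debug symbols. The path and
options are the ones from the failing command quoted above; the log
file name is only an example:

```shell
# Binary and device copied from the failing command quoted above.
REPAIR=/opt/xfsprogs-3.1.6/sbin/xfs_repair
DEV=/dev/sdb1

# -batch makes gdb exit once its commands are done; each -ex runs one
# gdb command in order: "run" starts xfs_repair, "bt" prints the
# backtrace after the segfault stops it. -n keeps xfs_repair in
# no-modify mode, as in the original invocation.
GDB_CMD="gdb -batch -ex run -ex bt --args $REPAIR -n -P $DEV"

# Print the full command for review rather than running it here;
# paste it into a root shell once it looks right.
echo "$GDB_CMD 2>&1 | tee xfs_repair-gdb.log"
```

The backtrace gdb prints after the SIGSEGV is the part worth posting
to the list.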

> The complete " xfs_repair -n -P /dev/sdb1 " output file is 1.2 MB
> gzipped. If anyone wants to have a look at it please ask and I
> will send it as a private mail.

That sounds like there's a *lot* of damage to the filesystem. That
makes it even more important that we understand what caused the
damage in the first place....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
