xfs_check segfault / xfs_repair I/O error

Drew Wareham m3rlin at gmail.com
Thu Apr 19 23:11:54 CDT 2012


Hi Dave / Stan,

Thanks for taking the time to reply.  Unfortunately, none of the suggestions
recovered the data, so I'm going to rebuild the array now, this time as
RAID6 for the extra level of security.

Thanks again for all your help!


Cheers,

Drew


On Mon, Apr 16, 2012 at 8:31 AM, Dave Chinner <david at fromorbit.com> wrote:

> On Sun, Apr 15, 2012 at 11:15:09PM +1000, Drew Wareham wrote:
> > Hello Everyone,
> >
> > Hopefully this is the correct kind of information to send to this list.
> >
> > I have an issue with a large XFS volume (17TB) that mounts, but is not
> > readable.  I can view the folder structure on the volume but I can't
> > access any of the actual data.  A disk failed in a RAID5 array and while
> > it has rebuilt now, it looks like it's caused serious data integrity
> > issues.
> >
> > Here is the CentOS release / Kernel version:
> >     [root@svr608 ~]# uname -a
> >     Linux svr608 2.6.18-308.1.1.el5 #1 SMP Wed Mar 7 04:16:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
> >     [root@svr608 ~]# cat /etc/redhat-release
> >     CentOS release 5.8 (Final)
> >     [root@svr608 ~]# cat /tmp/yum.list | grep xfs | grep installed
> >     kmod-xfs.x86_64                            0.4-2                   installed
> >     xfsdump.x86_64                             2.2.46-1.el5.centos     installed
> >     xfsprogs.x86_64                            2.9.4-1.el5.centos
>
> Try upgrading xfsprogs to the latest version first. This one is rather
> old, and the latest versions handle I/O errors better...
>
> > But even though the volume mounts, when trying to access data it just
> > gives a "Structure needs cleaning" error.
> >
> > Running xfs_check and xfs_repair yields the following:
> >     [root@svr608 ~]# xfs_check /dev/cciss/c0d2
> >     bad agf magic # 0x58418706 in ag 0
>
> Oh, that's bad. 2 bytes of the magic number are corrupt...
>
> >     bad agf version # 0x30002 in ag 0
>
> And the version is completely toast.
>
> >     /usr/sbin/xfs_check: line 28:  5259 Segmentation fault      xfs_db$DBOPTS -i -p xfs_check -c "check$OPTS" $1
> >     [root@svr608 ~]# xfs_repair -n /dev/cciss/c0d2
> >     Phase 1 - find and verify superblock...
> >     superblock read failed, offset 0, size 524288, ag 0, rval -1
> >
> >     fatal error -- Input/output error
> >
> > And they leave the following in dmesg:
> >     xfs_db[5259]: segfault at 000000000555a134 rip 00000000004070c3 rsp 00007fff986bae50 error 4
> >     cciss 0000:04:00.0: cciss: c ffff810037e00000 has CHECK CONDITION sense key = 0x3
>
> This is clearly a RAID array error....
>
> ....
>
> > ................
> >     Filesystem cciss/c0d2: XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c.  Caller 0xffffffff8835d9b9
> >
> > hpacucli says the array is fine, but it looks like it's corrupted to me.
>
> It's badly corrupted. Try a newer version of check/repair, otherwise
> you're in a disaster recovery situation...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
>
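
A note on Dave's remark above that two bytes of the AGF magic number are
corrupt: this can be checked by hand. The sketch below is not from the
thread. It assumes only that the XFS AGF magic is the big-endian ASCII
string "XAGF" (0x58414746), and compares it byte by byte with the value
xfs_check reported. The helper name differing_bytes is made up for this
illustration.

```python
import struct

# Expected AGF magic per the XFS on-disk format: ASCII "XAGF", big-endian.
XFS_AGF_MAGIC = 0x58414746

# Value reported by xfs_check in the quoted output ("bad agf magic #").
OBSERVED_MAGIC = 0x58418706


def differing_bytes(expected, observed):
    """Return (offset, expected_byte, observed_byte) for each mismatched byte.

    Hypothetical helper for illustration; both values are treated as
    32-bit big-endian words, as they are stored on disk.
    """
    exp = struct.pack(">I", expected)
    obs = struct.pack(">I", observed)
    return [(i, exp[i], obs[i]) for i in range(4) if exp[i] != obs[i]]


diffs = differing_bytes(XFS_AGF_MAGIC, OBSERVED_MAGIC)
for offset, want, got in diffs:
    print(f"byte {offset}: expected 0x{want:02x}, found 0x{got:02x}")
```

The first two bytes ("XA") match and the last two do not, which agrees
with the "2 bytes of the magic number are corrupt" reading. This only
inspects the reported value; it says nothing about what damaged the block.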