Hi Dave / Stan,<br><br>Thanks for taking the time to reply. Unfortunately, none of the suggestions recovered the data, so I'm going to rebuild the array now, this time as RAID6 for the extra redundancy.<br><br>
Thanks again for all your help!<br><br clear="all"><br>Cheers,<br><br>Drew<br>
<br><br><div class="gmail_quote">On Mon, Apr 16, 2012 at 8:31 AM, Dave Chinner <span dir="ltr"><<a href="mailto:david@fromorbit.com">david@fromorbit.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On Sun, Apr 15, 2012 at 11:15:09PM +1000, Drew Wareham wrote:<br>
> Hello Everyone,<br>
><br>
> Hopefully this is the correct kind of information to send to this list.<br>
><br>
> I have an issue with a large XFS volume (17TB) that mounts, but is not<br>
> readable. I can view the folder structure on the volume but I can't access<br>
> any of the actual data. A disk failed in a RAID5 array and while it has<br>
> rebuilt now, it looks like it's caused serious data integrity issues.<br>
><br>
> Here is the CentOS release / Kernel version:<br>
> [root@svr608 ~]# uname -a<br>
> Linux svr608 2.6.18-308.1.1.el5 #1 SMP Wed Mar 7 04:16:51 EST 2012<br>
> x86_64 x86_64 x86_64 GNU/Linux<br>
> [root@svr608 ~]# cat /etc/redhat-release<br>
> CentOS release 5.8 (Final)<br>
> [root@svr608 ~]# cat /tmp/yum.list | grep xfs | grep installed<br>
> kmod-xfs.x86_64 0.4-2<br>
> installed<br>
> xfsdump.x86_64 2.2.46-1.el5.centos<br>
> installed<br>
> xfsprogs.x86_64 2.9.4-1.el5.centos<br>
<br>
</div>Try upgrading xfsprogs to the latest version first. This one is rather<br>
old, and the latest versions handle I/O errors better...<br>
<div class="im"><br>
> But even though the volume mounts, when trying to access data it just gives<br>
> a "Structure needs cleaning" error.<br>
><br>
> Running xfs_check and xfs_repair yield the following:<br>
> [root@svr608 ~]# xfs_check /dev/cciss/c0d2<br>
> bad agf magic # 0x58418706 in ag 0<br>
<br>
</div>Oh, that's bad. 2 bytes of the magic number are corrupt...<br>
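<br>
For reference, the expected AGF magic is the ASCII string "XAGF" (0x58414746), so comparing it byte by byte with the reported value shows which bytes differ. A minimal sketch in Python, assuming the value is read as a 32-bit big-endian word; the helper name differing_bytes is hypothetical and not part of xfsprogs:<br>

```python
import struct

XFS_AGF_MAGIC = 0x58414746  # ASCII "XAGF", the expected on-disk AGF magic


def differing_bytes(expected, observed):
    """Return (index, expected_byte, observed_byte) for each mismatched byte.

    Both values are packed as 32-bit big-endian words, matching the
    byte order XFS uses for its on-disk metadata.
    """
    exp = struct.pack(">I", expected)
    obs = struct.pack(">I", observed)
    return [(i, e, o) for i, (e, o) in enumerate(zip(exp, obs)) if e != o]


# Value reported by xfs_check in the output quoted above.
for index, exp, obs in differing_bytes(XFS_AGF_MAGIC, 0x58418706):
    print("byte %d: expected 0x%02x, found 0x%02x" % (index, exp, obs))
```

The first two bytes ("XA") match and only the last two differ, which is consistent with the "2 bytes corrupt" reading above.<br>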
<div class="im"><br>
> bad agf version # 0x30002 in ag 0<br>
<br>
</div>And the version is completely toast.<br>
<div class="im"><br>
> /usr/sbin/xfs_check: line 28: 5259 Segmentation fault<br>
> xfs_db$DBOPTS -i -p xfs_check -c "check$OPTS" $1<br>
> [root@svr608 ~]# xfs_repair -n /dev/cciss/c0d2<br>
> Phase 1 - find and verify superblock...<br>
> superblock read failed, offset 0, size 524288, ag 0, rval -1<br>
><br>
> fatal error -- Input/output error<br>
><br>
> And they leave the following in dmesg:<br>
> xfs_db[5259]: segfault at 000000000555a134 rip 00000000004070c3 rsp<br>
> 00007fff986bae50 error 4<br>
> cciss 0000:04:00.0: cciss: c ffff810037e00000 has CHECK CONDITION sense<br>
> key = 0x3<br>
<br>
</div>This is clearly a RAID array error....<br>
<br>
....<br>
<div class="im"><br>
> ................<br>
> Filesystem cciss/c0d2: XFS internal error xfs_da_do_buf(2) at line 2112<br>
> of file fs/xfs/xfs_da_btree.c. Caller 0xffffffff8835d9b9<br>
><br>
> hpacucli says the array is fine, but it looks like it's corrupted to me.<br>
<br>
</div>It's badly corrupted. Try a newer version of check/repair, otherwise<br>
you're in a disaster recovery situation...<br>
<br>
Cheers,<br>
<br>
Dave.<br>
<span class="HOEnZb"><font color="#888888">--<br>
Dave Chinner<br>
<a href="mailto:david@fromorbit.com">david@fromorbit.com</a><br>
</font></span></blockquote></div><br>