<div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Apr 27, 2012 at 2:31 PM, Michael Monnerie <span dir="ltr"><<a href="mailto:michael.monnerie@is.it-management.at" target="_blank">michael.monnerie@is.it-management.at</a>></span> wrote:<br>
> On Thursday, 26 April 2012 at 13:00:06, Aaron Williams wrote:
<div class="im">> I was able to recover the filesystem.<br>
<br>
> So your RAID busted the filesystem. Maybe the devs would want an
> xfs_metadump of the FS from before your repair, so they can inspect it
> and improve xfs_repair.
<span class="HOEnZb"><font color="#888888"><br></font></span></blockquote><div>Hi Michael,</div><div><br></div><div>It appears that way, or it may be the fact that I mounted with nobarrier
and in the process of recovering the RAID the information in the
battery-backed RAID cache got blown away. I have an Areca ARC-1210 controller that was in the process of rebuilding when I attempted to shut down and reboot my Linux system after I mistakenly unplugged the wrong drive from my RAID array. I had another drive fail on me and it had completed rebuilding itself using a hot spare drive. I intended to remove the bad drive to replace it but disconnected the wrong drive. After reconnecting the good drive it went on to start rebuilding itself again. At this point I decided it might be safer to shut down Linux to replace the drive and thought the RAID controller would pick up where it left off in rebuilding.</div>

However, Linux did not shut down all the way. I don't know whether it
was waiting for the array to finish rebuilding or something else
happened, but eventually I hit the reset button. The RAID BIOS then
reported that it could not find the array, and I had to rebuild the
array. I also ran a volume check, which found and repaired about
70,000 blocks. Needless to say, I was quite nervous.

Once that was done, Linux refused to mount the XFS partition, I think
because of corruption in the log.

I have an image of the pre-repair filesystem that I took with dd, and I
can try to produce a metadump from it. The filesystem is 1.9 TB with
about 1.2 TB of data in use.
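
In case it is useful to anyone, taking such an image and metadump looks
roughly like the following (the device and output paths here are
placeholders, and the exact dd options are just one reasonable choice):

    # raw image of the damaged filesystem, taken before running xfs_repair
    dd if=/dev/sdX1 of=/backup/xfs-pre-repair.img bs=1M conv=noerror,sync

    # metadata-only dump (no file data); filenames are obfuscated by default
    xfs_metadump /dev/sdX1 /backup/xfs-pre-repair.metadump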

It looks like I was able to recover everything fine after blowing away
the log. I do see a bunch of files recovered in lost+found, but they
all appear to be things like cached web pages.
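
For the archives: zeroing a dirty log before repair is normally done
with xfs_repair's -L option, something like this (the device name is a
placeholder):

    # -L zeroes the corrupt log before repairing; any transactions still
    # in the log are lost, which is why files can end up in lost+found
    xfs_repair -L /dev/sdX1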

I also dumped the log to a file (128 MB).
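
If anyone wants to do the same, the log can be copied out to a file
with xfs_logprint, e.g. (paths here are placeholders):

    # copy the on-disk log to a file without printing it
    xfs_logprint -C /backup/xfs-log.dump /dev/sdX1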

So far it looks like any actual data loss is minimal (thankfully), and
this was a good wake-up call to start doing more frequent backups.

I also upgraded xfsprogs from 3.1.6-2.1.2 to 3.1.8, which did a much
better job at recovery than my previous attempt.

It would be nice if xfs_db would let me continue when the log is dirty
instead of requiring me to mount the filesystem first. It would also be
nice if xfs_logprint could try to identify the filenames of the inodes
involved.
<br><div class="gmail_quote">I understand that there are plans to update XFS to include the UID in all of the on-disk structures. Any idea on when this will happen?<br><br>
-Aaron</div></div>