Corrupted files

Sean Caron scaron at umich.edu
Wed Sep 10 09:49:32 CDT 2014


I don't want to bloviate too much and drag this completely off topic,
especially since the OP's query is resolved, but please allow me just one
anecdote :)

Earlier this year, I had one of our project file servers (450 TB) go down.
It didn't go down because the array spuriously lost a bunch of disks; it
was just your usual sort of Linux kernel panic... you go to the console and
it's either a black screen and unresponsive, or maybe the tail end of a
backtrace, but unresponsive either way. So, OK, issue a quick remote IPMI
reboot of the machine, and it comes back up...

I'm in single-user mode, bringing up each sub-RAID6 in our RAID60 by hand;
no problem. Bring up the top-level RAID0: OK. Then I go to mount the XFS
filesystem... no go. Apparently the log somehow got corrupted in the crash?

So I try to mount read-only, no dice, but I _can_ mount ro,norecovery, and
I see good files there! Thank goodness. I start scavenging to a spare
host...
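
(For anyone facing the same thing, the salvage mount was basically the
sketch below. The device and mount point names here are hypothetical, not
our actual paths.)

    # Read-only mount that skips XFS log replay entirely, since the log
    # itself is what got corrupted; norecovery requires ro.
    mount -t xfs -o ro,norecovery /dev/md0 /mnt/salvage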

A few weeks later, after the scavenge was done, I did a few xfs_repair runs
just for the sake of experimentation. Running both in dry-run mode, I tried
the version that shipped with Ubuntu 12.04 as well as the latest xfs_repair
I could pull from the source tree. I redirected the output of both runs to
files and watched them with 'tail -f'.

When they were done, I diffed the output, and it didn't look like they were
behaving much differently. Both files had thousands or tens of thousands of
lines worth of output in them: bad this, bad that... (I always run in
verbose mode.) Since the filesystem was hosed anyway and I was going to
rebuild it, I decided to let the new xfs_repair run "for real", just to see
what would happen, for kicks. And who knows? Maybe I could recover even
more than I already had...? (So I wasn't just totally wasting time.)
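
(Roughly what those runs looked like, if anyone wants to repeat the
comparison; the binary path and log file names are hypothetical. -n is
xfs_repair's no-modify mode and -v is verbose.)

    # Dry run with the distro xfs_repair; -n means nothing gets written
    xfs_repair -n -v /dev/md0 > repair-distro.log 2>&1

    # Dry run with a freshly built xfs_repair from the xfsprogs tree
    /usr/local/sbin/xfs_repair -n -v /dev/md0 > repair-latest.log 2>&1

    # Watch either log with 'tail -f' while it runs, then compare
    diff repair-distro.log repair-latest.log | less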

I think it took maybe a week for it to run on a 450 TB volume? At least a
week. Maybe I was being a teensy bit hyperbolic in my previous descriptions
of runtime, LOL. After it was done?

... almost everything was obliterated. I had tens of millions of
zero-length files, and tens of millions of bits of anonymous scrambled junk
in lost+found.

So, I chuckled a bit (thankful for my hard-won previous experience), then
reformatted the array and copied back the results of my scavenging. Just by
mounting read-only and copying what I could, I was able to save around 90%
of the data by volume on the array (it was a little more than half full
when it failed... ~290 TB? There was only ~30 TB that I couldn't salvage);
good clean files that passed validation by their respective users. I think
80-90% recovery rates are very commonly achievable just by mounting
ro,norecovery and grabbing what you can with cp -R or rsync, provided there
wasn't a grievous failure of the underlying storage system.
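
(A minimal sketch of that grab-what-you-can pass, assuming the salvage
mount from above and a spare host reachable over SSH; the host and path
names are hypothetical.)

    # rsync reports unreadable files and keeps going, so one bad file
    # doesn't abort the whole copy
    rsync -aHv /mnt/salvage/ sparehost:/srv/scavenge/

    # or, purely local:
    cp -R /mnt/salvage /spare/scavenge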

If I had depended on xfs_repair, or blithely run it as a first line of
response as the documentation might intimate (hey, it's called xfs_repair,
right?), the way people casually run fsck or CHKDSK... I would have been
hosed, big time.

Best,

Sean

On Wed, Sep 10, 2014 at 10:24 AM, Emmanuel Florac <eflorac at intellique.com>
wrote:

> On Tue, 09 Sep 2014 20:31:07 -0500,
> Leslie Rhorer <lrhorer at mygrande.net> wrote:
>
> > More
> > importantly, is there some reason 3.1.7 would make things worse while
> > 3.2.1 would not?  If not, then I can always try 3.1.7 and then try
> > 3.2.1 if that does not help.
>
> I don't know about these particular versions; however, in the past
> I've confirmed that a later version of xfs_repair performed way better
> (salvaged more files from lost+found, in particular).
>
> At some point in the distant past, some versions of xfs_repair were
> buggy and would happily throw away TBs of perfectly sane data... I had
> this very problem once, on Christmas Eve 2005 IIRC :/
>
> --
> ------------------------------------------------------------------------
> Emmanuel Florac     |   Direction technique
>                     |   Intellique
>                     |   <eflorac at intellique.com>
>                     |   +33 1 78 94 84 02
> ------------------------------------------------------------------------
>