
Re: Corrupted files

To: Leslie Rhorer <lrhorer@xxxxxxxxxxxx>, Sean Caron <scaron@xxxxxxxxx>
Subject: Re: Corrupted files
From: Sean Caron <scaron@xxxxxxxxx>
Date: Tue, 9 Sep 2014 12:03:56 -0400
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CAA43vkXwHF9RHW-cbTZ91_vF6wiQ6o_+TQDL3=7kD9P4tErCNQ@xxxxxxxxxxxxxx>
References: <540F1B01.3020700@xxxxxxxxxxxx> <CAA43vkXwHF9RHW-cbTZ91_vF6wiQ6o_+TQDL3=7kD9P4tErCNQ@xxxxxxxxxxxxxx>
OK, let me retract just a tiny fraction of what I said originally; thinking about it further, there was _one_ time I was able to use xfs_repair to successfully recover a "lightly bruised" XFS and return it to service. But in that case, the fault was very minor and I always check first with:

xfs_repair [-L] -n -v <filesystem>

and give the output a good looking over before proceeding further.

If it won't run without zeroing the log, you can take that as a sign that things are getting dire. I wouldn't bother to run xfs_repair "for real" if the trial output looked even slightly non-trivial, in cases of underlying array failure or massive filesystem corruption, and I'd never run it without mounting and scavenging first (unless I had a very recent full backup). Barring rare cases, xfs_repair is bad juju.
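
For what it's worth, the trial run can be wrapped so the output is kept for review. This is only a rough sketch: the function name, the log path and the XFS_REPAIR override are made up for illustration, and the device name is a placeholder. It relies on the documented behaviour that xfs_repair -n exits with status 1 when it detects corruption and 0 when it does not.

```shell
#!/bin/sh
# Dry-run check: run xfs_repair in no-modify mode and keep the output.
# XFS_REPAIR and the default log path are illustrative assumptions, not
# anything from the original message. The filesystem must be unmounted.
XFS_REPAIR="${XFS_REPAIR:-xfs_repair}"

dry_run_check() {
    dev="$1"
    log="${2:-/tmp/xfs_repair_dryrun.log}"
    # -n: no-modify mode (nothing is written to the device); -v: verbose
    if "$XFS_REPAIR" -n -v "$dev" >"$log" 2>&1; then
        echo "no corruption reported; log saved to $log"
        return 0
    else
        status=$?
        echo "xfs_repair -n exited with status $status; review $log first"
        return "$status"
    fi
}

# Example (placeholder device): dry_run_check /dev/md0
```

The point of saving the log is to read it in full before deciding anything; the exit status alone says nothing about how bad the damage is.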



On Tue, Sep 9, 2014 at 11:50 AM, Sean Caron <scaron@xxxxxxxxx> wrote:
Hi Leslie,

If you have a full backup, I would STRONGLY recommend just wiping your old filesystem and restoring your backups on top of a totally fresh XFS, rather than repairing the original filesystem and then filling in the blanks with backups using a file-diff tool like rsync.

You will probably hear various opinions here about xfs_repair; my personal opinion is that xfs_repair is a program made available for the unwary to further scramble their data and make a hash of the filesystem... In my first-hand experience managing ~7 PB of XFS storage and growing, I have NEVER found xfs_repair (yes, even the "newest version") to do anything positive. It's basically a data scrambler.

At this point, you will never achieve anything near what I'd consider a production-grade, trustworthy data repository. Any further runs of xfs_repair will either do nothing, or make the situation worse. Fortunately you followed best practice and kept backups so you don't really need xfs_repair anyway, right?



P.S. No backups? Still don't even think about running xfs_repair. ESPECIALLY don't think about running xfs_repair. Try mounting ro; if that doesn't work, mount ro with norecovery (which skips log replay) and scavenge what you can. Write off the rest. That's the cost of doing business without backups. Running xfs_repair (especially as a first-line step) will only make it worse, and especially on big filesystems, the run time can extend to weeks... Don't keep your users down any longer than you need to, running a program that won't really help you. Just scavenge it, reformat and turn it back around.
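
As a rough sketch of that mount sequence: try a plain read-only mount first, then fall back to read-only with the XFS norecovery option, which skips log replay. The function name and the MOUNT override are hypothetical, and the device and mount point are placeholders.

```shell
#!/bin/sh
# Read-only mount with a fallback that skips log recovery.
# MOUNT is an illustrative override so the logic can be tried without
# touching a real device; it is an assumption, not part of the original.
MOUNT="${MOUNT:-mount}"

mount_for_scavenge() {
    dev="$1"
    mnt="$2"
    if "$MOUNT" -t xfs -o ro "$dev" "$mnt"; then
        echo "mounted read-only"
    elif "$MOUNT" -t xfs -o ro,norecovery "$dev" "$mnt"; then
        # norecovery leaves the log unreplayed, so recent changes may be
        # missing or inconsistent; copy off what is readable.
        echo "mounted read-only without log recovery"
    else
        echo "could not mount $dev" >&2
        return 1
    fi
}

# Example (placeholders): mount_for_scavenge /dev/md0 /mnt/scavenge
```

Once something is mounted, copy off whatever reads cleanly to other storage before reformatting.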

On Tue, Sep 9, 2014 at 11:21 AM, Leslie Rhorer <lrhorer@xxxxxxxxxxxx> wrote:


    I have an issue with my primary RAID array. I have 13T of data on the array, and I suffered a major array failure. I was able to rebuild the array, but some data was lost. Of course I have backups, so after running xfs_repair, I ran an rsync job to recover the lost data. Most of it was recovered, but there are several files that cannot be read, deleted, or overwritten. I have tried running xfs_repair several times, but any attempt to access these files continuously reports "cannot stat XXXXXXXX: Structure needs cleaning". I don't need to try to recover the data directly, as it does reside on the backup, but I need to clear the file structure so I can write the files back to the filesystem. How do I proceed?

xfs mailing list
