xfs_repair force_geometry

Michael L. Semon mlsemon35 at gmail.com
Tue May 14 12:54:53 CDT 2013


On 05/14/2013 08:35 AM, Stan Hoeppner wrote:
> On 5/14/2013 3:56 AM, Benedikt Schmidt wrote:
> ...
>> I see, I should have mentioned this earlier. I already tried xfs_repair
>> and it failed to find the second superblock. Because I am still able to
>> mount the original disk and most parts of it I guessed that xfs_repair
>> is confused by the different disk geometries. What I have also already
>> tried out was, naturally, to copy the whole stuff with for example cp or
>> xfs_copy, but both failed because of filesystem errors. The only program
>> which didn't fail to copy the data was dd_rescue, which can handle the
>> errors. That is why I used it, as it was my only option (as far as I can see).
>
> You are able to mount the XFS on the original disk which means the
> superblocks are apparently intact and the log section isn't corrupt.
> But when you attempt to copy files from that XFS to another
> disk/filesystem you get what you describe as filesystem errors.  How far
> did the cp/xfs_copy progress before you received the filesystem errors?
>   What is the result of running xfs_repair -n on the original filesystem?
>
> The point of these questions is to reveal whether the original disk
> simply has media surface errors toward the end of the disk where you
> wrote those few most recent files, *or* if the problem with the disk is
> electrical or mechanical.
>
> The fact that cp/xfs_copy fail, yet ddrescue completes by retrying
> (though possibly while ignoring some sectors due to retry limit of 1),
> would tend to suggest the problem is electrical or mechanical, not
> platter surface defects.  From what you've described so far it sounds
> like the more load you put on the drive the more errors it throws.  This
> is typical when the internal power supply circuit on a drive is failing.
>
> While the drive is idle, I would suggest you use xfs_db on the original
> XFS to locate the positions of those few files that are not backed up.
> Unmount the XFS and use dd with skip/seek to copy only these files to
> another location.  Do one file at a time to put as little load on the
> drive as possible.  Give it some resting time between dd operations.  If
> this works it eliminates the need to expand your RAID5 or attempt more
> full partition copies to the new 2TB drive.  If this doesn't work, it
> also eliminates the need for either of these steps, as it will
> demonstrate it's simply not possible to recover the data.
>
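
For what it's worth, the skip/seek arithmetic for that per-file dd
approach might look roughly like the sketch below.  It is only an
illustration under assumptions not stated in this thread: that each
extent of a file is already known as (file offset, disk start, length)
in 512-byte units relative to the start of the partition, which is the
unit `xfs_bmap -v` reports on a mounted filesystem; extents taken from
xfs_db would need converting first.  The device name /dev/sdd1, the
output name, and the extent numbers are made up.  The script only
prints the dd commands; it does not run them.

```python
import shlex

SECTOR = 512  # bytes; assumed unit for all extent values below


def dd_commands(device, output, extents):
    """Build one dd command line per extent.

    extents: iterable of (file_offset, disk_start, length) tuples,
    all in 512-byte sectors, with disk_start counted from the start
    of `device`.
    """
    cmds = []
    for file_offset, disk_start, length in extents:
        if length <= 0:
            raise ValueError("extent length must be positive")
        argv = [
            "dd",
            f"if={device}",
            f"of={output}",
            f"bs={SECTOR}",
            f"skip={disk_start}",   # where to start reading on the source
            f"seek={file_offset}",  # where to start writing in the copy
            f"count={length}",
            # notrunc: keep earlier extents already written to the copy
            # noerror,sync: continue past read errors, padding with zeros
            "conv=notrunc,noerror,sync",
        ]
        cmds.append(shlex.join(argv))
    return cmds


if __name__ == "__main__":
    # Hypothetical two-extent file, for illustration only.
    example = [(0, 96, 8), (8, 4096, 16)]
    for cmd in dd_commands("/dev/sdd1", "recovered.bin", example):
        print(cmd)
```

Since the last extent is rounded up to whole blocks, the copy may end
up longer than the original file and might need truncating to the
real file size afterwards.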

I've been hesitant to suggest using smartmontools to aid in this
quest.  In the event of surface errors, `smartctl -a /dev/sdd` may or
may not show the exact error locations.  The read error rate numbers
might be helpful, too.  However, smartctl has extra features that might
cause the drive to remap sectors that could otherwise have been read
one last time.  `smartctl --test=long /dev/sdd` should be a no-no at
this point.  At any rate, I wouldn't want that SMART initialization
clunk noise to be the drive's last dying gasp.  Thoughts?

Michael


