
Re: xfs_repair force_geometry

To: stan@xxxxxxxxxxxxxxxxx
Subject: Re: xfs_repair force_geometry
From: "Michael L. Semon" <mlsemon35@xxxxxxxxx>
Date: Tue, 14 May 2013 13:54:53 -0400
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <51922FAC.4000101@xxxxxxxxxxxxxxxxx>
References: <5190DB7F.2050505@xxxxxx> <519165F2.80902@xxxxxxxxxxx> <5191C772.4020607@xxxxxx> <5191ECCD.2070806@xxxxxxxxx> <5191FC34.10105@xxxxxx> <51922FAC.4000101@xxxxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130328 Thunderbird/17.0.5
On 05/14/2013 08:35 AM, Stan Hoeppner wrote:
> On 5/14/2013 3:56 AM, Benedikt Schmidt wrote:
>> I see, I should have mentioned this earlier. I already tried xfs_repair
>> and it failed to find the second superblock. Because I am still able to
>> mount the original disk and most parts of it I guessed that xfs_repair
>> is confused by the different disk geometries. What I have also already
>> tried out was, naturally, to copy the whole stuff with for example cp or
>> xfs_copy, but both failed because of filesystem errors. The only program
>> which didn't fail to copy the data was dd_rescue, which can handle the
>> errors. That is why I used it, as it was my only option (as far as I can see).
>
> You are able to mount the XFS on the original disk which means the
> superblocks are apparently intact and the log section isn't corrupt.
> But when you attempt to copy files from that XFS to another
> disk/filesystem you get what you describe as filesystem errors.  How far
> did the cp/xfs_copy progress before you received the filesystem errors?
> What is the result of running xfs_repair -n on the original filesystem?
>
> The point of these questions is to reveal whether the original disk
> simply has media surface errors toward the end of the disk where you
> wrote those few most recent files, *or* if the problem with the disk is
> electrical or mechanical.
>
> The fact that cp/xfs_copy fail, yet ddrescue completes by retrying
> (though possibly while ignoring some sectors due to retry limit of 1),
> would tend to suggest the problem is electrical or mechanical, not
> platter surface defects.  From what you've described so far it sounds
> like the more load you put on the drive the more errors it throws.  This
> is typical when the internal power supply circuit on a drive is failing.
>
> While the drive is idle, I would suggest you use xfs_db on the original
> XFS to locate the positions of those few files that are not backed up.
> Unmount the XFS and use dd with skip/seek to copy only these files to
> another location.  Do one file at a time to put as little load on the
> drive as possible.  Give it some resting time between dd operations.  If
> this works it eliminates the need to expand your RAID5 or attempt more
> full partition copies to the new 2TB drive.  If this doesn't work, it
> also eliminates the need for either of these steps, as it will
> demonstrate it's simply not possible to recover the data.
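
For what it's worth, the dd step described above might look roughly like the sketch below. The `copy_extent` helper, the scratch file, and every number in it are made up for illustration, and the xfs_db commands in the comments are from memory and should be checked against the man page before touching the real disk. The demo runs against a temporary file, not a device.

```shell
#!/bin/sh
# Sketch only: copy a single extent with dd. All names and numbers are
# hypothetical; nothing here touches a real disk.
#
# On the real (unmounted) filesystem, the extent map would come from
# xfs_db in read-only mode, something like (INUM and FSB are placeholders):
#   xfs_db -r -c "inode INUM" -c "bmap" /dev/sdX1
#   xfs_db -r -c "convert fsblock FSB daddr" /dev/sdX1
# A daddr counts 512-byte sectors, so bs=512 with skip=<daddr> would apply.

# copy_extent SRC DST BS START COUNT
# Reads COUNT blocks of BS bytes from SRC, starting at block START.
copy_extent() {
    # conv=noerror,sync keeps going after a read error and pads the
    # failed block with zeros so later data stays at the right offset.
    dd if="$1" of="$2" bs="$3" skip="$4" count="$5" \
       conv=noerror,sync 2>/dev/null
}

# Demo on a scratch file: pull out the third 4-byte block.
img=$(mktemp)
out=$(mktemp)
printf 'AAAABBBBCCCCDDDD' > "$img"
copy_extent "$img" "$out" 4 2 1
cat "$out"    # prints CCCC
echo
rm -f "$img" "$out"
```

A file with several extents would need one such call per extent, with `seek=` on the output side to place each piece, plus a `sleep` between calls to give the drive its resting time. The last block will usually run past the end of the file, so the copy would have to be trimmed to the size recorded in the inode.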

I've been hesitant to suggest smartmontools for this. If the problem is surface errors, `smartctl -a /dev/sdd` may or may not show the exact error locations, and the read error rate numbers might be helpful, too. However, smartctl has extra features that might cause SMART to remap sectors that could otherwise have been read one last time, so `smartctl --test=long /dev/sdd` should be a no-no at this point. At any rate, I wouldn't want that SMART initialization clunk noise to be the drive's last dying gasp. Thoughts?
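
If a look at SMART is judged worth the risk, one cautious option might be to query only the attribute table and pick out the handful of counters that relate to bad sectors. The `smart_summary` filter below is a hypothetical helper, and it assumes the usual ten-column layout of `smartctl -A` output with the raw value in the last of those columns; whether even this query is safe on a drive in this state is exactly the open question.

```shell
#!/bin/sh
# Sketch only: reduce `smartctl -A` output to a few sector-related
# attributes. smart_summary is a made-up name, and the column layout
# (ID in field 1, name in field 2, raw value in field 10) is assumed.
smart_summary() {
    # IDs: 1 = raw read error rate, 5 = reallocated sectors,
    # 197 = pending sectors, 198 = offline uncorrectable.
    awk '$1 == 1 || $1 == 5 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Intended use, as root, with no self-test started:
#   smartctl -A /dev/sdd | smart_summary
```

Rising reallocated or pending counts would point toward surface damage, while clean counters alongside the copy failures would fit the electrical or mechanical theory a little better. Neither reading is conclusive.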
