Re: Harddrive error and XFS corruption

To: Marcus Hast <hast@xxxxxxxxx>
Subject: Re: Harddrive error and XFS corruption
From: Steve Lord <lord@xxxxxxx>
Date: 13 Nov 2001 08:54:30 -0600
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <20011113143336109.AAA296.51@e414.mhk.lu.se>
References: <20011113143336109.AAA296.51@e414.mhk.lu.se>
Sender: owner-linux-xfs@xxxxxxxxxxx
Unfortunately, without some form of backup, there is not really a lot
that can be done with this filesystem. You could try replacing the
failed drive and running xfs_repair, but I doubt it will end up with
anything that mounts.

The only way to protect against this type of thing is to keep redundant
copies of your data, either as backups in dump format or through some
form of RAID mirroring or parity protection.
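
As a rough sketch of how the kernel log could be watched for the kind of
errors quoted below, the following Python counts UncorrectableError
reports per drive. The function name count_uncorrectable and the regular
expression are illustrative assumptions based only on the log lines in
this thread; other drivers and kernel versions format their messages
differently.

```python
import re
from collections import Counter

# Matches IDE error lines of the form quoted in this thread, e.g.
#   hdg: dma_intr: error=0x40 { UncorrectableError }, LBAsect=134840697,
# The pattern is an assumption derived from those samples only.
ERROR_LINE = re.compile(
    r"\b(?P<dev>hd[a-z]): (?P<source>\w+): "
    r"error=0x[0-9a-fA-F]+ \{ (?P<flags>[^}]*)\}"
)


def count_uncorrectable(lines):
    """Return a Counter mapping device name to UncorrectableError count."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match and "UncorrectableError" in match.group("flags").split():
            counts[match.group("dev")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "hdg: dma_intr: status=0x51 { DriveReady SeekComplete Error }",
        "hdg: dma_intr: error=0x40 { UncorrectableError }, LBAsect=134840697,",
        "hdg: read_intr: error=0x40 { UncorrectableError }, LBAsect=134840697,",
    ]
    for dev, n in count_uncorrectable(sample).items():
        print(f"{dev}: {n} uncorrectable error(s)")
```

A count that keeps growing for one drive is a hint to copy the data off
it and replace it; it does not substitute for backups.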

Steve


On Tue, 2001-11-13 at 08:34, Marcus Hast wrote:
> Hi all,
> I have 3 disks in an LVM volume with XFS on it. After a recent power failure
> it no longer came up. At first it would try to do a recovery and get a lot of:
> 
> hdg: dma_intr: status=0x51 { DriveReady SeekComplete Error } 
> hdg: dma_intr: error=0x40 { UncorrectableError }, LBAsect=134840697, sector=134840696
> end_request: I/O error, dev 22:01 (hdg), sector 134840696
> 
> As I go through the log now however I see some new errors:
> 
> hdg: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error } 
> hdg: read_intr: error=0x40 { UncorrectableError }, LBAsect=134840697, sector=134840696
> end_request: I/O error, dev 22:01 (hdg), sector 134840696
> 
> I take it that this means it has gotten worse. (A read_intr error instead of
> dma_intr, which I have seen is quite common.) This is on an LVM volume with
> 220G of data.
> 
> So I have a few questions:
> Is there any way of getting the data on the other disks back? From what I've
> seen of the logs it's hdg that's bad.
> 
> Is there any way of getting warned about this before it happens? I did get a
> lot of dma_intr errors first, but it seemed to me then that a lot of other
> people were getting them and safely (?) ignoring them. (From the kernel and
> LVM lists.)
> 
> Is there any way I can be "proactive" in avoiding this? By storing metadata
> redundantly, for instance? (I assume that in this particular case it's those
> parts of the drive which have gone, which is why I'm left with an unmountable
> and unrecoverable system.)
> 
> Would a check with, for instance, Bonnie catch a problem like this before it
> gets bad?
> 
> I've seen this in a couple of places now; perhaps it would be a good idea to
> put it in the FAQ or some documents?
> 
> Marcus Hast, Lund, Sweden, Earth.
> Living long and prosperous.
-- 

Steve Lord                                      voice: +1-651-683-3511
Principal Engineer, Filesystem Software         email: lord@xxxxxxx

