On Sat, Jul 17, 2004 at 04:05:16PM +0200, Jan Banan wrote:
> I suppose the best strategy is to get a new disk of the same size
> and then try to copy the whole damaged disk with "dd" to the new
> disk and then try to startup the raid again and after that run
> xfs_repair.
that sounds like a good solution if most of the damaged disk is
readable (i assumed it was completely dead)
> What arguments to "dd" would fit best in this case? I think I've
> read that "dd" will normally abort when it can't read from a damaged
> disk and the disk is quite big, 250 GB (Maxtor).
'conv=noerror' i guess, see the dd man page
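to make the idea concrete, here is a rough python sketch of what a copy that does not abort on read errors looks like: read block by block, and when a read fails write zeros in its place so every later block keeps its offset (with dd, that padding is what adding 'sync' to 'conv=noerror' gives you). the function name, the 4096-byte block size and the file-object interface are my own assumptions for illustration, this is not how dd itself is implemented.

```python
def rescue_copy(src, dst, total_size, block_size=4096):
    """Copy total_size bytes from src to dst, zero-filling unreadable blocks.

    src and dst are seekable binary file objects (hypothetical interface).
    Returns the byte offsets of the blocks that could not be read.
    """
    bad_offsets = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            # Unreadable block: remember where it was and substitute nothing,
            # the padding below turns it into zeros.
            bad_offsets.append(offset)
            data = b""
        # Pad failed or short reads so the copy keeps the same layout.
        data = data.ljust(want, b"\0")
        dst.seek(offset)
        dst.write(data)
        offset += want
    return bad_offsets
```

the returned list is the record of bad blocks you want to keep for later, when working out which files were hit.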
> Since it is a 4 disk linear raid I hope most of the files are not
> spread over blocks on different disks since I suppose XFS (1.2.0)
> tries to store the files on blocks close to each other(?).
the file blocks will *usually* be close together and usually within
the same allocation group (ag)
various access patterns can change this though (like writing to a
very full fs)
> Does anyone know what has normally happened to a disk when you suddenly
> can not read from some parts of the disk? I get these kind of
> errors:
> Jul 15 21:18:58 d kernel: hdh: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=243818407, high=14, low=8937383,
> sector=243818336
disk media error. if there are only a few of these (and there aren't
many relocated sectors) i would stomp over them, i.e. write to them,
in the hope that the disk will remap them. i've done this myself with
good results and helped various other people do this
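if you want to collect the failing sectors from the logs, a small python sketch like the one below pulls the numbers out of a line such as the one quoted above. the function name is made up, and the 512-byte sector size used in the comment is an assumption. for this particular message the 'high' and 'low' values recombine into LBAsect when high is shifted left by 24 bits, i have only checked that against the quoted line.

```python
import re

def parse_ide_error(line):
    """Extract LBAsect, high, low and sector from an IDE dma_intr error line."""
    pairs = re.findall(r"(LBAsect|high|low|sector)=(\d+)", line)
    return {name: int(value) for name, value in pairs}

# The quoted kernel message, joined back onto one line.
LOG = ("Jul 15 21:18:58 d kernel: hdh: dma_intr: error=0x40 "
       "{ UncorrectableError }, LBAsect=243818407, high=14, "
       "low=8937383, sector=243818336")

info = parse_ide_error(LOG)
# For this message, (high << 24) | low gives back LBAsect.
recombined = (info["high"] << 24) | info["low"]
# Byte offset of the failing sector, assuming 512-byte sectors.
byte_offset = info["LBAsect"] * 512
```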
> Can I do something to make it better? The disk is only one year old
> but maybe the temperature has been a little bit too high in the
> computer box.
smartctl -a /dev/<disk>
will tell you how many relocated sectors there are and various other
details. like i said, if the relocated sector count is low and you
don't have *that* many bad sectors on the disk (badblocks will tell you
this) i would write over the bad blocks (keeping a record of which
blocks were bad), hope the disk relocates those sectors sanely and
then run xfs_repair to see how well that does. if you know which
sectors (well, blocks) were bad you can work out which files (well,
parts of files) were damaged
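the arithmetic for going from bad sectors to filesystem blocks is simple enough to sketch in python. everything here is an assumption you would need to check against your own setup: 512-byte sectors, a 4096-byte fs block size, the partition start sector, and the function names. on a linear raid you would also have to add the sizes of the preceding member disks, which this sketch leaves out, and mapping a block to a file is a separate step not shown here.

```python
SECTOR_SIZE = 512  # bytes per disk sector (assumed)

def sector_to_fs_block(lba, partition_start=0, fs_block_size=4096):
    """Map an absolute disk sector to a block number inside the partition."""
    if lba < partition_start:
        raise ValueError("sector lies before the start of the partition")
    byte_offset = (lba - partition_start) * SECTOR_SIZE
    return byte_offset // fs_block_size

def bad_fs_blocks(bad_sectors, partition_start=0, fs_block_size=4096):
    """Collapse a list of bad sectors into the sorted set of affected blocks."""
    blocks = {
        sector_to_fs_block(lba, partition_start, fs_block_size)
        for lba in bad_sectors
    }
    return sorted(blocks)
```

several bad sectors often fall into the same block, so the list of blocks to investigate is usually shorter than the list of sectors.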
maybe i should write something up on this?
--cw