The SMART utilities suggestion is a good idea. You can also run badblocks in
read-only mode:
# badblocks -sv -b 512 -c 128 /dev/hda
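You can try the non-destructive read-write mode (-n), but if the problem is in
the hardware rather than the disk media (controller, cable, memory), it will
make things much worse: it reads each block, the data gets corrupted on the
way, and the corrupted copy is then written back.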
You could do a destructive read-write test on another disk to check for a
hardware problem first, then do the non-destructive read-write on the real
drive.
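
A rough sketch of that order of operations, not something I have run on your
setup. It assumes /dev/hdc is a spare disk whose contents can be thrown away
(a made-up name, substitute your own) and /dev/hda is the real drive,
unmounted. The run helper, the variable names and the list file are only for
illustration. Nothing is executed unless RUN=1 is set, since -w wipes the
disk it is pointed at:

```shell
#!/bin/sh
# Assumed names, adjust before use.
SCRATCH=${SCRATCH:-/dev/hdc}          # hypothetical spare disk, WILL BE ERASED
REAL=${REAL:-/dev/hda}                # the suspect drive, must be unmounted
LIST=${LIST:-./scratch-disk.bad}      # where badblocks -o writes its block list

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

badblocks_sequence() {
    # 1. Destructive write test (-w) on the scratch disk. This exercises the
    #    controller, cable and memory without touching the real data.
    run badblocks -wsv -o "$LIST" "$SCRATCH"

    # A non-empty list means blocks failed on the scratch disk, so the
    # non-media hardware is suspect: stop before touching the real drive.
    if [ -s "$LIST" ]; then
        echo "bad blocks listed in $LIST, not testing $REAL" >&2
        return 1
    fi

    # 2. Non-destructive read-write test (-n) on the real drive.
    run badblocks -nsv "$REAL"
}
```

Calling badblocks_sequence on its own just prints the two commands so you can
check the device names; RUN=1 badblocks_sequence (as root) actually runs them.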
I would definitely try the SMART utilities first. If there is a media defect
on the drive, the drive will know (and tell you via SMART).
Regards,
Jeremy Jackson
----- Original Message -----
From: "Gaspar Bakos" <gbakos@xxxxxxxxxxxxxxx>
To: "Steve Lord" <lord@xxxxxxxx>
Cc: <linux-xfs@xxxxxxxxxxx>
Sent: Thursday, December 04, 2003 1:54 PM
Subject: Re: Kernel panic, SB validate failed
> Hi,
>
> > I'll check the cable, or swap it, and see what
> > happens.
>
> Swapping the cable did not change things, unfortunately.
> The main question now is recovery: how to start gently without losing
> things - if this is possible at all.
> I remember that last time I had to run xfs_repair -L, because the
> filesystems were not mountable, and plain xfs_repair suggested using
> -L. So, I lost a lot of things (not a surprise, but there seemed to be
> no other way to proceed, at least with my knowledge)
>
> Here is the cable situation a bit more detailed:
>
> Originally the failed disk was connected as hda with a cable that indeed
> shows some wear (cable #1). The hdb on the same cable, however, showed no
> problems (also XFS, what else).
>
> After the crash, I replaced the faulty hda with another Linux disk to boot
> from, and the faulty disk took the place of hdd, which has a seemingly new
> cable, and with which the previous hdd worked fine. All the xfs_check
> messages I reported were produced with this setup, i.e. with cable #2.
>
> Then I put the faulty disk back as hda, and changed cable #1 to yet another
> cable (#3), but the same kernel panic happens. It seems that all the disks
> work fine with all the cables I have (as the faulty one also used to).
>
> This is off-topic, not directly XFS-related, although it would be useful
> for filtering out non-XFS problems:
>
> I am wondering whether there is a way to find out if the 'faulty' drive has
> true hw problems. It is recognized at boot (as hdd), the CHS is correct,
> and hdparm -Ii gives a correct response.
> - Is there any point in running "badblocks" on the device?
> - Or is there an equivalent of this for xfs (badblocks was mainly used
> through e2fsck -c)?
>
> Cheers,
> Gaspar
>
>