On Mon, Nov 05, 2001 at 11:21:02AM -0600, Steve Lord wrote:
> What you should actually do at this point sort of depends on
> what the raid folks think your chances are. If all you did was
> mount the fs and run recovery with the bad drive still in
> there then things may not be so bad. I am not sure you can
> flip out one drive and then do another in the middle of raid
> rebuild, that would almost certainly toast the volume. You
> might need to let raid rebuild complete on the current set of
> drives, and then replace the real bad drive with a good one.
Unfortunately, there was an unclean unmount the second time,
too. Here's the full sordid sequence of events:
- hdp giving errors (SectorIdNotFound).
- hdn fails (dma_status=0x00, or something like that); the
  array goes into degraded mode.
- The system hangs on shutdown, and has to be taken down hard.
- I assume that hdp is actually the problem; I replace hdp.
  (IDE is just that way sometimes...)
- When the box comes back up, the array isn't recognized.
- I mark hdp as a failed-drive and run mkraid -f. Now the
  array is recognized.
- I mount the filesystem read-write. The data appears to be
- I raidhotadd hdp to the array. Reconstruction begins, but
  stalls almost immediately. (/proc/mdstat reports 0K done and
  a long, long time to finish.)
- I attempt to unmount the filesystem. It stalls. I attempt
  to reboot; again, it stalls. I wait for a couple of minutes
  before taking the box down hard.
And now, it looks like the probably-good drive may be heading
for a failure itself. :( I'm attempting to clone it before I
try putting it back in.
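
For reference, one common way to clone a drive that is throwing read
errors is dd with conv=noerror,sync. This is only a sketch of that
general technique, not necessarily the tool used here. SRC and DST are
hypothetical placeholders, and the script only prints the command so
nothing is written until the names have been checked.

```shell
#!/bin/sh
# Dry-run sketch: build a dd command for cloning a failing drive.
# SRC and DST are hypothetical placeholders, not names from this thread.
SRC=/dev/hdX    # failing source drive (assumption)
DST=/dev/hdY    # healthy target, at least as large as SRC (assumption)

# noerror: keep going after a read error instead of aborting.
# sync:    pad each short or failed read with zeros so offsets stay aligned.
# A small block size limits how much data is zeroed per bad read.
CLONE_CMD="dd if=$SRC of=$DST bs=4k conv=noerror,sync"

# Print only; run the printed command by hand once SRC and DST are right.
echo "$CLONE_CMD"
```

Swapping SRC and DST would overwrite the drive being rescued, which is
the reason for printing the command first.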
> Wait for the raid experts to respond on this point, do not take
> my word for it!
I think I might take a look at the raid code myself before I go
too far, just for the fun of it. Might be a useless exercise,
but worth a shot. Anyone know of any design docs other than the
code itself, to ease me into it?
> > > > How do I mount the filesystem without writing anything at
> > > > all to the array?
> > > mount -o ro,norecovery
> > >
> > > Even a readonly mount without the norecovery will attempt to run
> > > recovery.
> > So there's no way at all to mount the filesystem without some
> > writing occurring?
> No, if you use the combination of the two options then there should be
> no disk I/O at all.
Sorry; I misread your first reply.
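
For the archives, a minimal sketch of the mount discussed above, with
both options combined. DEV and MNT are hypothetical placeholders (the
actual md device and mount point are not named in this thread), and
the script only prints the command rather than running it.

```shell
#!/bin/sh
# Dry-run sketch: build the read-only, no-recovery mount command.
# DEV and MNT are hypothetical placeholders, not names from this thread.
DEV=/dev/md0
MNT=/mnt/recovery

# ro alone still attempts log recovery, which writes to the device;
# adding norecovery skips that, so the two options go together.
MOUNT_CMD="mount -o ro,norecovery $DEV $MNT"

# Print only; run the printed command by hand once DEV and MNT are right.
echo "$MOUNT_CMD"
```

Since recovery is skipped, the filesystem may look inconsistent if the
log was dirty, so this is best treated as a way to copy data off.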