Re: Oops - XFS mount after replacing wrong RAID5 drive

To: Andrew Klaassen <ak@xxxxxxx>
Subject: Re: Oops - XFS mount after replacing wrong RAID5 drive
From: Steve Lord <lord@xxxxxxx>
Date: 05 Nov 2001 11:21:02 -0600
Cc: linux-xfs@xxxxxxxxxxx, linux-raid@xxxxxxxxxxxxxxx
In-reply-to: <20011105112231.B3864@dkp.com>
References: <20011105103521.A3864@dkp.com> <1004974636.7318.5.camel@jen.americas.sgi.com> <20011105112231.B3864@dkp.com>
Sender: owner-linux-xfs@xxxxxxxxxxx
On Mon, 2001-11-05 at 10:22, Andrew Klaassen wrote:
> On Mon, Nov 05, 2001 at 09:37:16AM -0600,
> Steve Lord wrote:
> 
> > On Mon, 2001-11-05 at 09:35,
> > Andrew Klaassen wrote:
> 
> > > 
> > > (The XFS filesystem did not unmount cleanly after the first
> > > drive failure.  That's why I'm assuming that it replayed its
> > > log when I mounted it after replacing the drive.)
> 
> > It will have replayed its log - and that information is now
> > gone, so it is a little hard to say what state the filesystem
> > is really in now.
> 
> So... mount it ro,norecovery, then run xfs_repair -n?  Or just
> mount it ro,norecovery and try to grab the info we absolutely
> need?

I was assuming you had replaced the wrong drive and remounted the
filesystem, at which point recovery would have run. Once the log
is updated then recovery will not run again - although if you switch
drives again it is very hard to say what will happen - it depends on
which portions of the fs were on the bad drive. Given raid5 this is
not something I could predict.

xfs_repair -n is only useful if you want to see how inconsistent the
filesystem is. Mounting with ro,norecovery and running xfs_repair -n
sounds like a good thing to do if you swap the good drive back in.
You can also use xfs_logprint -t to get an idea of how much will
happen during log recovery, although the output is really pretty
developer-centric.
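
A minimal sketch of the two read-only inspection commands mentioned above,
assuming the array shows up as /dev/md0 (a hypothetical device name) and the
filesystem is not currently mounted. The run and inspect helpers are
illustrative only. By default the script just prints the commands; set
DRY_RUN=0 and run it as root to actually execute them.

```shell
#!/bin/sh
# Hypothetical device name; substitute the real md device.
DEV="${DEV:-/dev/md0}"

# Print the command instead of running it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect() {
    # Transactional view of the log: what recovery would have to replay.
    run xfs_logprint -t "$1"
    # No-modify mode: report inconsistencies without changing the filesystem.
    run xfs_repair -n "$1"
}

inspect "$DEV"
```

Printing first and executing second is a deliberate choice here: with an
array in this state it seems worth reading each command before it touches
the device.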

What you should actually do at this point depends on what the raid
folks think your chances are. If all you did was mount the fs and
run recovery with the bad drive still in there then things may not be
so bad. I am not sure you can pull out one drive and then swap another
in the middle of a raid rebuild; that would almost certainly toast the
volume. You might need to let the raid rebuild complete on the current
set of drives, and then replace the real bad drive with a good one.
Wait for the raid experts to respond on this point, do not take
my word for it!

> 
> How much is it likely to have written while replaying the log? 
> (There's a reasonably good chance that it sync'd before the box
> went down the first time.)

In that case there would have been nothing much in there - after a
period of inactivity xfs writes a record into the log which basically
marks the log as empty. However, another record is written at unmount
time. If this unmount record is not present then xfs will assume a
crash and replay the log - which can amount to doing nothing.

> 
> For the RAID5 people:  How much has to be written to the array
> before the old probably-good drive will be useless?  What will
> happen if I put it back in and it's too far out of sync?
> 
> > > How do I mount the filesystem without writing anything at
> > > all to the array?
> 
> > mount -o ro,norecovery
> > 
> > Even a readonly mount without the norecovery will attempt to run
> > recovery.
> 
> So there's no way at all to mount the filesystem without some
> writing occurring?

There is: if you use the two options together then nothing should be
written to the disk at all.
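
As a rough sketch of that advice, the helper below builds the mount command
only when both ro and norecovery are present in the option list, and refuses
otherwise. The device /dev/md0, the mount point /mnt/rescue, and the helper
names has_opt and safe_mount_cmd are all assumptions made for illustration.
The script only prints the command; run the printed line as root.

```shell
#!/bin/sh
# Hypothetical device and mount point; substitute your own.
DEV="${DEV:-/dev/md0}"
MNT="${MNT:-/mnt/rescue}"
OPTS="${OPTS:-ro,norecovery}"

# True if option $2 appears in the comma-separated list $1.
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the mount command only if both options are present.
# $1 = device, $2 = mount point, $3 = option list
safe_mount_cmd() {
    if has_opt "$3" ro && has_opt "$3" norecovery; then
        echo "mount -t xfs -o $3 $1 $2"
    else
        echo "refusing: need both ro and norecovery" >&2
        return 1
    fi
}

safe_mount_cmd "$DEV" "$MNT" "$OPTS"
```

The check exists because, as noted earlier in the thread, ro on its own
still attempts log recovery.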

Steve

-- 

Steve Lord                                      voice: +1-651-683-3511
Principal Engineer, Filesystem Software         email: lord@xxxxxxx
