On Fri, May 25, 2007 at 03:35:48AM +0200, Pallai Roland wrote:
> On Fri, 2007-05-25 at 10:05 +1000, David Chinner wrote:
> > > >It's a good question too, but I think the md layer could
> > > >save dumb filesystems like XFS if it denied writes after 2 disks
> > > >have failed, and I cannot see a good reason why it doesn't behave
> > > >this way.
> >
> > How is *any* filesystem supposed to know that the underlying block
> > device has gone bad if it is not returning errors?
> It is returning errors, I think. If I try to write to a raid5 array
> with 2 failed disks using dd, I get errors on the missing chunks.
Oh, did you look at your logs and find that XFS had spammed them
about writes that were failing?
> The difference between ext3 and XFS is that ext3 will remount
> read-only on the first write error but XFS won't; XFS fails only
> the current operation, IMHO. The ext3 method isn't perfect, but
> in practice it works well.
XFS will shut down the filesystem if a failed write would cause
metadata corruption. We don't immediately fail the filesystem on
data write errors because on large systems you can get *transient*
I/O errors (e.g. FC path failover) and so retrying failed data
writes is useful for preventing unnecessary shutdowns of the
filesystem.
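To make the difference concrete, here is a toy model in Python of the two
policies as described in this thread. It is only a sketch: the names
(`WriteKind`, `Action`, `xfs_like_policy`, `ext3_like_policy`) are made up
for illustration and are not kernel interfaces, and the ext3 behaviour is
taken from the description quoted above, not from the ext3 source.

```python
from enum import Enum


class WriteKind(Enum):
    DATA = "data"
    METADATA = "metadata"


class Action(Enum):
    CONTINUE = "continue"              # write succeeded, nothing to do
    RETRY_LATER = "retry later"        # keep the data and try the write again
    SHUTDOWN = "shut down filesystem"  # stop before metadata gets corrupted
    REMOUNT_RO = "remount read-only"   # refuse all further writes


def xfs_like_policy(kind, write_failed):
    """Toy model of the XFS behaviour described above (not real XFS code)."""
    if not write_failed:
        return Action.CONTINUE
    if kind is WriteKind.METADATA:
        # A failed metadata write would corrupt the filesystem: shut down.
        return Action.SHUTDOWN
    # Data write errors may be transient (e.g. FC path failover): retry.
    return Action.RETRY_LATER


def ext3_like_policy(kind, write_failed):
    """Toy model of ext3 as characterised in the quoted text (assumption)."""
    if not write_failed:
        return Action.CONTINUE
    # The first write error of any kind makes the filesystem read-only.
    return Action.REMOUNT_RO


if __name__ == "__main__":
    for kind in WriteKind:
        print(kind.value,
              "| xfs-like:", xfs_like_policy(kind, True).value,
              "| ext3-like:", ext3_like_policy(kind, True).value)
```

The only point of the model is that the XFS-like policy distinguishes
between data and metadata failures, while the ext3-like policy treats
every write error the same way.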
Different design criteria, different solutions...
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group