On Mon, Jun 14, 2010 at 04:37:20PM +0200, Andi Kleen wrote:
> > > > function head comment during development. Anyway, if we do get an
> > > > error here, we cannot handle it anyway - it's too late to do
> > > > anything short of a complete shutdown as we've already written the
> > > > transaction to the log.
> > >
> > > Well I guess it should be unconditional BUG_ON then.
> > Don't be silly. A filesystem shutdown is all that is necessary,
> Without BUG_ON it will not end up in kerneloops.org and you will
> never know about it.

We find out about corrupted filesystems all the time from users
sending mail to the list. Even if we did panic by default on
corruption events, kerneloops.org is *useless* for reporting them
because finding out about a corruption is only the very first step
of what is usually a long and involved process that requires user
interaction to gather information necessary to find the cause of the
corruption.

Besides, if we _really_ want the machine to panic on corruption,
then we configure the machine specifically for it via setting the
relevant corruption type bit in /proc/sys/fs/xfs/panic_mask. This is
generally only used when a developer asks a user to set it to get
kernel crash dumps triggered when a corruption event occurs so we
can do remote, offline analysis of the failure.
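
For illustration only, here is a minimal sketch of how such a bit
could be ORed into the mask from a script run as root. The helper name
set_panic_bits is hypothetical, and the bit value used in the usage
note is an assumption; the real per-corruption-type values should be
taken from the XFS documentation for the running kernel.

```python
# Hypothetical helper: OR extra bits into the XFS panic mask sysctl.
# Only the sysctl path comes from the mail above; everything else is
# an illustrative assumption.
PANIC_MASK_PATH = "/proc/sys/fs/xfs/panic_mask"


def set_panic_bits(bits, path=PANIC_MASK_PATH):
    """Read the current mask, OR in `bits`, write it back, return it."""
    with open(path) as f:
        text = f.read().strip()
    current = int(text) if text else 0  # sysctl value is a decimal integer
    new_mask = current | bits           # keep any bits that are already set
    with open(path, "w") as f:
        f.write(f"{new_mask}\n")
    return new_mask
```

A call such as set_panic_bits(0x10) would leave any bits that are
already set untouched; 0x10 is only a placeholder here, not a
statement about which corruption type that bit selects.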
> That's standard Linux kernel development
> practice. Maybe XFS should catch up on that.

I find this really amusing because linux filesystems have, over the
last few years, implemented a simpler version of XFS's way of
dealing with corruption events(*). Perhaps you should catch up
with the state of the art before throwing rocks, Andi....

(*) extN, fat, hpfs, jfs, nilfs2, ntfs, ocfs2 and logfs all have
configurable corruption event behaviour that defaults to remount-ro
and can be configured to panic the machine.