> We find out about corrupted filesystems all the time from users
> sending mail to the list. Even if we did panic by default on
> corruption events, kerneloops.org is *useless* for reporting them
> because finding out about a corruption is only the very first step
> of what is usually a long and involved process that requires user
> interaction to gather information necessary to find the cause of the
The idea behind kerneloops.org
is that any single report can always be a flake
(broken memory, broken hardware, a flipped bit, whatever).
An error becomes important and interesting when there are multiple
occurrences of it in the field.
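To make that concrete, here is a rough sketch of the aggregation idea.
It is not how kerneloops.org is actually implemented; the report format
(machine id plus a signature string), the example signatures, the
per-machine de-duplication and the threshold are all assumptions made up
for illustration.

```python
from collections import Counter


def interesting_signatures(reports, min_reports=2):
    """Return (signature, count) pairs seen on at least min_reports machines.

    `reports` is an iterable of (machine_id, signature) pairs. Both fields
    are hypothetical stand-ins for whatever a real collector would record.
    """
    # Count each signature once per machine, so one flaky box that keeps
    # hitting the same flipped bit does not look like a field-wide problem.
    unique = set(reports)
    counts = Counter(signature for _machine, signature in unique)
    # most_common() sorts by count, highest first.
    return [(sig, n) for sig, n in counts.most_common() if n >= min_reports]


# Made-up reports: one signature seen on three machines, one seen once.
reports = [
    ("box-a", "xfs_example_check+0x10"),
    ("box-b", "xfs_example_check+0x10"),
    ("box-c", "xfs_example_check+0x10"),
    ("box-d", "other_example_path+0x42"),  # seen once: likely a flake
    ("box-a", "xfs_example_check+0x10"),   # repeat from the same machine
]

print(interesting_signatures(reports))
```

The single report from box-d is filtered out, and the signature that
shows up on several machines is the one worth a developer's time.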
> Besides, if we _really_ want the machine to panic on corruption,
BUG_ON does not normally panic.
> then we configure the machine specifically for it via setting the
> relevant corruption type bit in /proc/sys/fs/xfs/panic_mask. This is
> generally only used when a developer asks a user to set it to get
> kernel crash dumps triggered when a corruption event occurs so we
> can do remote, offline analysis of the failure.
Especially when you're talking about desktop-class systems
without ECC memory, that will mean you'll spend at least some
time on errors which are simply bit flips.
> > That's standard Linux kernel development
> > practice. Maybe XFS should catch up on that.
> I find this really amusing because linux filesystems have, over the
This really has nothing to do with file systems; it's general
practice for everything (well, except XFS).
> last few years, implemented a simpler version of XFS's way of
> dealing with corruption events(*). Perhaps you should catch up
> with the state of the art before throwing rocks, Andi....
I suspect you miss quite a lot of valuable information from
your user base by not supporting kerneloops.org. On the other
hand it would likely also save you from spending time on
That said, you don't need BUG_ON to support it (WARN etc. work
too); it's just the easiest way.
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.