On Fri, Jul 27, 2007 at 01:01:15AM +0200, Andi Kleen wrote:
> David Chinner <dgc@xxxxxxx> writes:
> >
> > Nope. To do that, we'd need to implement some type of Reed-Solomon
> > coding and would need to use more bits on disk to store the ECC
> > data. That would have a much bigger impact on log throughput than a
> > table based CRC on a chunk of data that is hot in the CPU cache.
>
> Processing or rewriting cache hot data shouldn't be significantly
> different in cost (assuming the basic CPU usage of the algorithms
> is not too different); just the cache lines need to be already exclusive
> which is likely the case with logs.
*nod*
> > And we'd have to write the code as well. ;)
>
> Modern kernels have R-S functions in lib/reed_solomon. They
> are used in some of the flash file systems. I haven't checked
> how their performance compares to standard CRC though.
Ah, I didn't know that. I'll have a look at it....
Admittedly I didn't look all that hard because:
> > However, I'm not convinced that this sort of error correction is the
> > best thing to do at a high level as all the low level storage
> > already does Reed-Solomon based bit error correction. I'd much
> > prefer to use a different method of redundancy in the filesystem so
> > the error detection and correction schemes at different levels don't
> > have the same weaknesses.
>
> Agreed. On the file system level the best way to handle this is
> likely data duplicated on different blocks.
Yes, something like that. I haven't looked into all the potential
ways of providing redundancy yet - I'm still focussing on making
error detection more effective.
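
For reference, the kind of table-based CRC being discussed can be
sketched roughly as below. This is a generic CRC-32 (reflected
polynomial 0xEDB88320), not the XFS log code; the names
crc32_init_table() and crc32_buf() are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* 256-entry lookup table, built once on first use. */
static uint32_t crc32_table[256];
static int crc32_table_ready;

/* Precompute the CRC of every possible byte value. */
static void crc32_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
		crc32_table[i] = c;
	}
	crc32_table_ready = 1;
}

/* One table lookup, one shift and two XORs per input byte. */
static uint32_t crc32_buf(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	if (!crc32_table_ready)
		crc32_init_table();
	while (len--)
		crc = crc32_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc ^ 0xFFFFFFFFu;
}
```

The per-byte work is small and the table is only 1KB, which is why
the cost is mostly hidden when the data is already hot in cache. Any
single flipped bit in the buffer changes the result.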
> > That means the filesystem needs strong enough CRCs to detect bit
> > errors and sufficient structure validity checking to detect gross
> > errors. XFS already does pretty good structure checking; we don't
>
> The trouble is that it tends to go to too drastic measures (shutdown) if it
> detects any inconsistency.
IMO, that's not drastic - it's the only sane thing to do in the
absence of redundant metadata that you can recover from. Continuing
to operate on a filesystem known to be corrupted risks making it
far, far worse, especially if the corruption is in something like a
free space btree.
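
As a rough sketch of that policy (validate the structure before
touching it, and refuse all further modifications once anything looks
wrong), assuming a made-up block layout; demo_fs, demo_block and
DEMO_MAGIC are hypothetical and bear no relation to the real XFS
structures:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAGIC 0x44454d4fu	/* hypothetical magic number */

struct demo_fs {
	bool shut_down;		/* set once corruption has been seen */
};

struct demo_block {
	uint32_t magic;
	uint32_t nrecs;		/* records currently in the block */
	uint32_t max_recs;	/* capacity of the block */
};

/* Structure validity check: right magic, record count within bounds. */
static bool demo_block_valid(const struct demo_block *b)
{
	return b->magic == DEMO_MAGIC && b->nrecs <= b->max_recs;
}

/* Add one record. Returns 0 on success, -1 if the operation is refused. */
static int demo_modify(struct demo_fs *fs, struct demo_block *b)
{
	if (fs->shut_down)
		return -1;	/* already shut down: modify nothing */
	if (!demo_block_valid(b)) {
		fs->shut_down = true;	/* no redundant copy to fall back on */
		return -1;
	}
	if (b->nrecs == b->max_recs)
		return -1;	/* block is full, not a corruption */
	b->nrecs++;
	return 0;
}
```

Once one bad block has been seen, even modifications to blocks that
look fine are refused, because nothing else derived from the same
on-disk state can be trusted.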
However, solving this is a separable problem - reliable error
correction comes after robust error detection....
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group