Date: Wed, 25 Jul 2007 19:24:45 +1000
From: David Chinner <dgc@xxxxxxx>
To: xfs-dev <xfs-dev@xxxxxxx>
Cc: xfs-oss <xfs@xxxxxxxxxxx>
Subject: RFC: log record CRC validation
Folks,
I've just fixed up the never-used debug log record checksumming
code with an eye to permanently enabling it for production
filesystems.
Firstly, I updated the simple 32 bit wide XOR checksum to use the
crc32c module. This places a new dependency on XFS - it will now
depend on CONFIG_LIBCRC32C. I'm also not sure which is the best
method to use - the little endian or big endian CRC algorithm -
so I just went for the default (crc32c()).
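
To illustrate why a CRC is a stronger check than a word-wide XOR, here
is a minimal standalone userspace sketch. It is not the kernel's
libcrc32c code; the function names xor32(), crc32c_update() and
crc32c_buf() are hypothetical, and the bitwise loop is the plain
reflected CRC32c (polynomial 0x82F63B78) with the conventional
all-ones seed and final inversion, which may differ from how a kernel
caller seeds crc32c().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32 bit wide XOR checksum over whole words (illustrative only). */
static uint32_t xor32(const uint32_t *words, size_t nwords)
{
    uint32_t sum = 0;

    assert(words != NULL || nwords == 0);
    for (size_t i = 0; i < nwords; i++)
        sum ^= words[i];
    return sum;
}

/* Bitwise CRC32c update, reflected polynomial 0x82F63B78.
 * No seeding or inversion is done here; the caller chooses both. */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(buf != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Conventional CRC32c: all-ones seed, inverted result (an assumption,
 * not necessarily what the log code would use). */
static uint32_t crc32c_buf(const void *buf, size_t len)
{
    return crc32c_update(0xFFFFFFFFu, buf, len) ^ 0xFFFFFFFFu;
}
```

Swapping two words in a buffer leaves xor32() unchanged but changes
crc32c_buf(), which is the kind of corruption the XOR checksum cannot
see.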
This then resulted in recovery failing to verify the checksums,
and it turns out that is because xfs_pack_data() gets passed a
padded buffer and size to checksum (padded to 512 bytes), whereas
the unpacking (recovery) only checksummed the unpadded record
length. Hence this code probably never worked reliably if anyone
ever enabled it.
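
The length mismatch can be reproduced outside the kernel. The sketch
below is a userspace model only: BB_SIZE, round_up_bb() and the
pack/unpack helpers are hypothetical names, not the real
xfs_pack_data() or recovery code, and the CRC is a plain bitwise
CRC32c with an all-ones seed. It shows that checksumming the padded
length on write and the unpadded length on recovery disagree, and that
rounding up on both sides agrees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BB_SIZE 512u    /* basic block size the record is padded to */

/* Round a record length up to the next multiple of BB_SIZE. */
static size_t round_up_bb(size_t len)
{
    return (len + BB_SIZE - 1) & ~(size_t)(BB_SIZE - 1);
}

/* Plain bitwise CRC32c (reflected polynomial 0x82F63B78). */
static uint32_t crc32c_buf(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Write side: checksums the padded buffer, as the packing path did. */
static uint32_t pack_checksum(const uint8_t *buf, size_t record_len)
{
    return crc32c_buf(buf, round_up_bb(record_len));
}

/* Recovery side as originally written: unpadded record length only. */
static uint32_t unpack_checksum_unpadded(const uint8_t *buf, size_t record_len)
{
    return crc32c_buf(buf, record_len);
}

/* Recovery side with the same rounding applied as the write side. */
static uint32_t unpack_checksum_padded(const uint8_t *buf, size_t record_len)
{
    return crc32c_buf(buf, round_up_bb(record_len));
}
```

The caller must supply a buffer that is at least round_up_bb(record_len)
bytes long, since both padded variants read the padding as well.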
This does bring up a question - probably for Tim - do we only get
rounded to BBs or do we get rounded to the log stripe unit when
packing the log records before writeout? It seems from a quick test
that it is only BBs, but confirmation would be good....
The next question is the hard one. What do we do when we detect
a log record CRC error? Right now it just warns and sets a flag
in the log. I think it should probably prevent log replay from
replaying past this point (i.e. trim the head back to the last
good log record) but I'm not sure what the best thing to do here is.
Comments?
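
As a starting point for discussion, the "trim the head back to the
last good log record" option could look roughly like the userspace
sketch below. Everything in it is an assumption for illustration:
struct log_rec is a hypothetical in-memory stand-in (not the on-disk
record header), records are assumed to be presented in tail-to-head
order, and replayable_records() is a made-up name. It only models the
policy of stopping at the first record whose CRC does not verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one log record. */
struct log_rec {
    const uint8_t *data;    /* record payload, including any padding */
    size_t len;             /* number of bytes covered by the CRC */
    uint32_t stored_crc;    /* CRC saved when the record was written */
};

/* Plain bitwise CRC32c (reflected polynomial 0x82F63B78). */
static uint32_t crc32c_buf(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Walk the records from tail towards head and return how many verify
 * before the first CRC mismatch. Replay would stop at that count,
 * which is equivalent to trimming the head back to the last good
 * record. */
static size_t replayable_records(const struct log_rec *recs, size_t nrecs)
{
    size_t i;

    assert(recs != NULL || nrecs == 0);
    for (i = 0; i < nrecs; i++) {
        if (crc32c_buf(recs[i].data, recs[i].len) != recs[i].stored_crc)
            break;
    }
    return i;
}
```

Whether everything after the first bad record should really be
discarded, or whether recovery should refuse to mount instead, is
exactly the open question above; the sketch does not settle it.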
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group