Metadata CRC error upon unclean unmount
Mark Tinguely
tinguely at sgi.com
Thu Jun 26 15:03:53 CDT 2014
On 06/25/14 19:28, Dave Chinner wrote:
> On Wed, Jun 25, 2014 at 11:21:44AM +1000, Dave Chinner wrote:
>> On Tue, Jun 24, 2014 at 11:50:52PM +0200, Fanael Linithien wrote:
>> Ok, so the CRC corresponds to the version of the AGI that was logged
>> at lsn = 0x30000017e. That means the version on disk is a partial
>> update without a CRC recalculation. Ok, so how can that happen?
>>
>> Given the lsn mismatch, I suspect log recovery has played a part as
>> it will not update the LSN when replaying changes in the log. It
>> should, however, always be attaching the appropriate verifier to
>> the buffers being recovered so the CRC should be recalculated
>> correctly.
>
> Ok, I have confirmed that this is occurring and behaving correctly.
>
> [ 24.437878] XFS (vdb): Mounting V5 Filesystem
> [ 24.554429] XFS (vdb): Starting recovery (logdev: internal)
> [ 24.623466] XFS (vdb): xfs_agi_write_verify: lsn reset block 0x2
> [ 24.625263] XFS (vdb): xfs_btree_sblock_calc_crc: lsn reset block 0x8
> [ 24.627307] XFS (vdb): xfs_btree_sblock_calc_crc: lsn reset block 0x10
> [ 24.628729] XFS (vdb): xfs_btree_sblock_calc_crc: lsn reset block 0x18
> [ 24.630085] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x20000
> [ 24.631504] XFS (vdb): xfs_da3_node_write_verify: lsn reset block 0x20008
> [ 24.632935] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x20010
> [ 24.634360] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x20018
> [ 24.635622] XFS (vdb): xfs_dir3_free_write_verify: lsn reset block 0x201e0
> [ 24.636656] XFS (vdb): __write_verify: lsn reset block 0x201e8
> [ 24.637510] XFS (vdb): __write_verify: lsn reset block 0x201f0
> [ 24.638365] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x201f8
> [ 24.639378] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x202c0
> [ 24.640397] XFS (vdb): __write_verify: lsn reset block 0x202c8
> [ 24.641260] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x202d0
> [ 24.664330] XFS (vdb): Ending recovery (logdev: internal)
>
> But that also confirms that log recovery is recalculating the CRC
> after replaying changes into that block:
>
> # xfs_db -c "agi 0" -c "p lsn" -c "p crc" /dev/vdb
> lsn = 0xffffffffffffffff
> crc = 0x788c4f63 (correct)
>
> So the common log recovery path for buffers is working as it is
> designed to do.
>
> What I still don't understand yet is how changes after this recovery
> phase are getting to disk without updating the CRC. That implies
> buffers without verifiers being written....
>
> More debug to come...
>
> Cheers,
>
> Dave.
Could an out of order CIL push cause this?
SGI saw sequence 2 (and sometimes 3/4) of the cil push get in front of
cil push sequence 1. Looks like the setting of
log->l_cilp->xc_ctx->commit_lsn in xlog_cil_init_post_recovery() lets
this happen.
--Mark.
More information about the xfs
mailing list