[PATCH 2/2] xfs: check LSN ordering for v5 superblocks during recovery

Mark Tinguely tinguely at sgi.com
Wed Aug 28 15:49:30 CDT 2013


On 08/28/13 06:22, Dave Chinner wrote:
> From: Dave Chinner<dchinner at redhat.com>
>
> Log recovery has some strict ordering requirements which unordered
> or reordered metadata writeback can defeat. This can occur when an
> item is logged in a transaction, written back to disk, and then
> logged in a new transaction before the tail of the log is moved past
> the original modification.
>
> The result of this is that when we read an object off disk for
> recovery purposes, the buffer that we read may not contain the
> object type that recovery is expecting and hence at the end of the
> checkpoint being recovered we have an invalid object in memory.
>
> This isn't usually a problem, as recovery will then replay all the
> other checkpoints and that brings the object back to a valid and
> correct state, but the issue is that while the object is in the
> invalid state it can be flushed to disk. This results in the object
> verifier failing and triggering a corruption shutdown of log
> recovery. This is correct behaviour for the verifiers - the problem
> is that we are not detecting that the object we've read off disk is
> newer than the transaction we are replaying.
>
> All metadata in v5 filesystems has the LSN of its last modification
> stamped in it. This enables log recovery to read that field and
> determine the age of the object on disk correctly. If the LSN of the
> object on disk is older than the transaction being replayed, then we
> replay the modification. If the LSN of the object matches or is more
> recent than the transaction's LSN, then we should avoid overwriting
> the object as that is what leads to the transient corrupt state.
>
> Signed-off-by: Dave Chinner<dchinner at redhat.com>
> ---
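
For readers following along, the replay rule in the commit message can be
sketched in a few lines of C. This is only an illustration, not the code from
the patch: the names lsn_t, make_lsn(), lsn_cmp() and should_replay() are
hypothetical, and the LSN layout (log cycle in the upper 32 bits, block number
in the lower 32) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a log sequence number. ASSUMPTION: the log
 * cycle sits in the upper 32 bits and the block number in the lower 32.
 */
typedef int64_t lsn_t;

static lsn_t make_lsn(uint32_t cycle, uint32_t block)
{
	return (lsn_t)(((uint64_t)cycle << 32) | block);
}

/* Compare cycle first, then block. Returns <0, 0 or >0 like memcmp(). */
static int lsn_cmp(lsn_t a, lsn_t b)
{
	uint32_t cycle_a = (uint32_t)((uint64_t)a >> 32);
	uint32_t cycle_b = (uint32_t)((uint64_t)b >> 32);
	uint32_t block_a = (uint32_t)a;
	uint32_t block_b = (uint32_t)b;

	if (cycle_a != cycle_b)
		return cycle_a < cycle_b ? -1 : 1;
	if (block_a != block_b)
		return block_a < block_b ? -1 : 1;
	return 0;
}

/*
 * Replay only when the object on disk is strictly older than the
 * transaction being recovered. An equal or newer on-disk LSN means the
 * object already contains this change (or a later one), so it is skipped.
 */
static bool should_replay(lsn_t disk_lsn, lsn_t trans_lsn)
{
	return lsn_cmp(disk_lsn, trans_lsn) < 0;
}
```

The strict "older than" test is the point of the sketch: a matching LSN is
treated the same as a newer one, so an object that is already up to date is
never overwritten with older contents.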


> @@ -2488,7 +2595,7 @@ xlog_recover_buffer_pass2(
>   		xlog_recover_do_reg_buffer(mp, item, bp, buf_f);
>   	}
>   	if (error)
> -		return XFS_ERROR(error);
> +		goto out_release;
>

This adds an xfs_buf_relse() call on the buffer in the error path. The
reference was taken in this routine, and the callers do not know about the
buffer and can't release it. That convinced me.
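
The error-path pattern described here can be sketched as follows. This is a
minimal standalone illustration, not the XFS code: struct buf, buf_get(),
buf_release(), replay_into() and recover_buffer() are all hypothetical
stand-ins, and the error value is arbitrary.

```c
/* Hypothetical reference-counted buffer, standing in for the real one. */
struct buf {
	int refcount;
};

static struct buf *buf_get(struct buf *b)
{
	b->refcount++;
	return b;
}

static void buf_release(struct buf *b)
{
	b->refcount--;
}

/* Hypothetical replay step; returns a nonzero error when asked to fail. */
static int replay_into(struct buf *bp, int fail)
{
	(void)bp;
	return fail ? 5 : 0;	/* arbitrary nonzero error for the sketch */
}

/*
 * The reference is taken inside this function, so this function must
 * drop it on every exit path. Callers never see bp and cannot do it.
 */
static int recover_buffer(struct buf *b, int fail, int *queued)
{
	struct buf *bp = buf_get(b);
	int error;

	error = replay_into(bp, fail);
	if (error)
		goto out_release;	/* an early return here would leak bp */

	*queued += 1;		/* stand-in for queuing the delayed write */

out_release:
	buf_release(bp);
	return error;
}
```

With a single exit label, the success and failure paths both pass through the
release, so the reference count is balanced whichever way the function ends.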


>   	/*
>   	 * Perform delayed write on the buffer.  Asynchronous writes will be
> @@ -2517,6 +2624,7 @@ xlog_recover_buffer_pass2(
>   		xfs_buf_delwri_queue(bp, buffer_list);
>   	}
>
> +out_release:
>   	xfs_buf_relse(bp);
>   	return error;

Looks good. It would be nice to get this into Linux 3.12 and possibly back to stable.

Reviewed-by: Mark Tinguely <tinguely at sgi.com>


