On 08/28/13 06:22, Dave Chinner wrote:
From: Dave Chinner<dchinner@xxxxxxxxxx>
Log recovery has some strict ordering requirements which unordered
or reordered metadata writeback can defeat. This can occur when an
item is logged in a transaction, written back to disk, and then
logged in a new transaction before the tail of the log is moved past
the original modification.
The result of this is that when we read an object off disk for
recovery purposes, the buffer that we read may not contain the
object type that recovery is expecting and hence at the end of the
checkpoint being recovered we have an invalid object in memory.
This isn't usually a problem, as recovery will then replay all the
other checkpoints and that brings the object back to a valid and
correct state, but the issue is that while the object is in the
invalid state it can be flushed to disk. This results in the object
verifier failing and triggering a corruption shutdown of log
recover. This is correct behaviour for the verifiers - the problem
is that we are not detecting that the object we've read off disk is
newer than the transaction we are replaying.
All metadata in v5 filesystems has the LSN of it's last modification
stamped in it. This enabled log recover to read that field and
determine the age of the object on disk correctly. If the LSN of the
object on disk is older than the transaction being replayed, then we
replay the modification. If the LSN of the object matches or is more
recent than the transaction's LSN, then we should avoid overwriting
the object as that is what leads to the transient corrupt state.
Signed-off-by: Dave Chinner<dchinner@xxxxxxxxxx>
@@ -2488,7 +2595,7 @@ xlog_recover_buffer_pass2(
xlog_recover_do_reg_buffer(mp, item, bp, buf_f);
- return XFS_ERROR(error);
+ goto out_release;
This adds a xfs_buf_relse() on the buffer in the error path. The
reference was taken in this routine. The callers do not know of the
buffer and can't release it. convinced me.
* Perform delayed write on the buffer. Asynchronous writes will be
@@ -2517,6 +2624,7 @@ xlog_recover_buffer_pass2(
Looks good. Nice to get into Linux 3.12 and possibly back to stable.
Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>