xfs
[Top] [All Lists]

Re: [3.0-stable PATCH 36/36] xfs: only update the last_sync_lsn when a t

To: Mark Tinguely <tinguely@xxxxxxx>
Subject: Re: [3.0-stable PATCH 36/36] xfs: only update the last_sync_lsn when a transaction completes
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 12 Dec 2012 10:59:52 +1100
Cc: stable@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <20121203144312.161697268@xxxxxxx>
References: <20121203144208.143464631@xxxxxxx> <20121203144312.161697268@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Dec 03, 2012 at 05:42:44PM -0600, Mark Tinguely wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Upstream commit: d35e88faa3b0fc2cea35c3b2dca358b5cd09b45f
> 
>     xfs: only update the last_sync_lsn when a transaction completes
>     
>     The log write code stamps each iclog with the current tail LSN in
>     the iclog header so that recovery knows where to find the tail of
>     thelog once it has found the head. Normally this is taken from the
>     first item on the AIL - the log item that corresponds to the oldest
>     active item in the log.
>     
>     The problem is that when the AIL is empty, the tail lsn is dervied
>     from the the l_last_sync_lsn, which is the LSN of the last iclog to
>     be written to the log. In most cases this doesn't happen, because
>     the AIL is rarely empty on an active filesystem. However, when it
>     does, it opens up an interesting case when the transaction being
>     committed to the iclog spans multiple iclogs.
>     
>     That is, the first iclog is stamped with the l_last_sync_lsn, and IO
>     is issued. Then the next iclog is setup, the changes copied into the
>     iclog (takes some time), and then the l_last_sync_lsn is stamped
>     into the header and IO is issued. This is still the same
>     transaction, so the tail lsn of both iclogs must be the same for log
>     recovery to find the entire transaction to be able to replay it.
>     
>     The problem arises in that the iclog buffer IO completion updates
>     the l_last_sync_lsn with it's own LSN. Therefore, If the first iclog
>     completes it's IO before the second iclog is filled and has the tail
>     lsn stamped in it, it will stamp the LSN of the first iclog into
>     it's tail lsn field. If the system fails at this point, log recovery
>     will not see a complete transaction, so the transaction will no be
>     replayed.
>     
>     The fix is simple - the l_last_sync_lsn is updated when a iclog
>     buffer IO completes, and this is incorrect. The l_last_sync_lsn
>     shoul dbe updated when a transaction is completed by a iclog buffer
>     IO. That is, only iclog buffers that have transaction commit
>     callbacks attached to them should update the l_last_sync_lsn. This
>     means that the last_sync_lsn will only move forward when a commit
>     record it written, not in the middle of a large transaction that is
>     rolling through multiple iclog buffers.
>     
>     Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
>     Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
>     Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>     Signed-off-by: Ben Myers <bpm@xxxxxxx>

This needs to go back to all the 3.x.y stable trees, so probably
should be sent separately to this series....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>