[PATCH] xfs: only update the last_sync_lsn when a transaction completes

Mark Tinguely tinguely at sgi.com
Fri Sep 28 12:46:48 CDT 2012


On 09/27/12 23:37, Dave Chinner wrote:
> From: Dave Chinner<dchinner at redhat.com>
>
> The log write code stamps each iclog with the current tail LSN in
> the iclog header so that recovery knows where to find the tail of
> the log once it has found the head. Normally this is taken from the
> first item on the AIL - the log item that corresponds to the oldest
> active item in the log.
>
> The problem is that when the AIL is empty, the tail lsn is derived
> from the l_last_sync_lsn, which is the LSN of the last iclog to
> be written to the log. In most cases this doesn't happen, because
> the AIL is rarely empty on an active filesystem. However, when it
> does, it opens up an interesting case when the transaction being
> committed to the iclog spans multiple iclogs.
>
> That is, the first iclog is stamped with the l_last_sync_lsn, and IO
> is issued. Then the next iclog is set up, the changes copied into the
> iclog (takes some time), and then the l_last_sync_lsn is stamped
> into the header and IO is issued. This is still the same
> transaction, so the tail lsn of both iclogs must be the same for log
> recovery to find the entire transaction to be able to replay it.
>
> The problem arises in that the iclog buffer IO completion updates
> the l_last_sync_lsn with its own LSN. Therefore, if the first iclog
> completes its IO before the second iclog is filled and has the tail
> lsn stamped in it, the second iclog will have the LSN of the first
> iclog stamped into its tail lsn field. If the system fails at this
> point, log recovery will not see a complete transaction, so the
> transaction will not be replayed.
>
> The fix is simple - the l_last_sync_lsn is updated when an iclog
> buffer IO completes, and this is incorrect. The l_last_sync_lsn
> should be updated when a transaction is completed by an iclog buffer
> IO. That is, only iclog buffers that have transaction commit
> callbacks attached to them should update the l_last_sync_lsn. This
> means that the last_sync_lsn will only move forward when a commit
> record is written, not in the middle of a large transaction that is
> rolling through multiple iclog buffers.
>
> Signed-off-by: Dave Chinner<dchinner at redhat.com>
> ---
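
For anyone following along, the ordering described above can be modelled
in a few lines of userspace C. This is only a toy sketch, not the kernel
code: every name here (toy_log, toy_iclog, toy_stamp_tail, toy_io_done,
toy_tails_match, the update_policy enum) is hypothetical, and the LSN
values 100/101/102 are made up for illustration. It assumes the AIL is
empty, so the tail is always taken from last_sync_lsn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Hypothetical stand-ins for the in-core log and an iclog. */
struct toy_log {
	lsn_t last_sync_lsn;	/* models l_last_sync_lsn */
	bool  ail_empty;	/* tail comes from last_sync_lsn when true */
};

struct toy_iclog {
	lsn_t lsn;			/* LSN of this iclog */
	lsn_t tail_lsn;			/* tail LSN stamped into the header */
	bool  has_commit_callbacks;	/* iclog carries a commit record */
};

enum update_policy {
	UPDATE_ON_EVERY_IO,	/* behaviour described as the bug */
	UPDATE_ON_COMMIT_ONLY	/* behaviour described as the fix */
};

/* Stamp the current tail into the iclog header before IO is issued. */
static void toy_stamp_tail(const struct toy_log *log, struct toy_iclog *ic)
{
	assert(log->ail_empty);	/* this model only covers the empty-AIL case */
	ic->tail_lsn = log->last_sync_lsn;
}

/* IO completion: decide whether last_sync_lsn may move forward. */
static void toy_io_done(struct toy_log *log, const struct toy_iclog *ic,
			enum update_policy policy)
{
	if (policy == UPDATE_ON_EVERY_IO || ic->has_commit_callbacks)
		log->last_sync_lsn = ic->lsn;
}

/*
 * One transaction spanning two iclogs, where the first iclog's IO
 * completes before the second is stamped. Returns true if both
 * iclogs end up carrying the same tail LSN.
 */
static bool toy_tails_match(enum update_policy policy)
{
	struct toy_log log = { .last_sync_lsn = 100, .ail_empty = true };
	struct toy_iclog first  = { .lsn = 101, .has_commit_callbacks = false };
	struct toy_iclog second = { .lsn = 102, .has_commit_callbacks = true };

	toy_stamp_tail(&log, &first);		/* first iclog stamped, IO issued */
	toy_io_done(&log, &first, policy);	/* completes while second is filled */
	toy_stamp_tail(&log, &second);		/* second iclog stamped */

	return first.tail_lsn == second.tail_lsn;
}
```

With UPDATE_ON_EVERY_IO the second iclog is stamped with the first
iclog's LSN and the tails differ; with UPDATE_ON_COMMIT_ONLY the first
completion leaves last_sync_lsn alone and both iclogs carry the same
tail.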

Makes a lot of sense. Seems to clean up wrap warnings
and hangs that I started to see in xfstest 273.

Reviewed-by: Mark Tinguely <tinguely at sgi.com>


