[Top] [All Lists]

Re: [PATCH v2 1/2] xfs: conditionally force log on trylock failure of pi

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: [PATCH v2 1/2] xfs: conditionally force log on trylock failure of pinned/stale buf
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 7 Feb 2013 11:57:25 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1360154681-28246-2-git-send-email-bfoster@xxxxxxxxxx>
References: <1360154681-28246-1-git-send-email-bfoster@xxxxxxxxxx> <1360154681-28246-2-git-send-email-bfoster@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Feb 06, 2013 at 07:44:40AM -0500, Brian Foster wrote:
> xfs_force_log() is not safe from all contexts. Add a flag parameter
> to xfs_buf_trylock() to specify when the force is appropriate and
> create a macro to preserve current behavior.

Static inline functions are preferred over macros because they
provide type checking.....

However, here's what happens when I actually have time to think
about the problem at hand - I suggest a completely different fix.

Sorry, Brain, I'm not doing this intentionally to make your life
harder than it needs to be... :/

My logic (goes back to the history of changes that lead to the log
force being in xfs_buf_trylock()) is as follows:

        1. xfs_buf_lock() had the log force added originally because
           _xfs_buf_find was getting stuck on locked, pinned stale
           buffers. This was a result of the rework of the busy
           extent handling to avoid synchronous transactions and/or
           log forces when reusing recently freed blocks. Somewhere
           a log force was needed to get the freeing transaction on
           disk before reuse occurred, and that was done on demand
           in xfs_buf_lock() when such a stale buffer was being
           used after a subsequent allocation.

        2a. xfs_inode_item_push() used to flush inodes to the
           the underlying buffer and would skip the flush if the
           underlying buffer was locked. Hence pinned inodes could
           not be flushed if the underlying buffer was locked (i.e.
           when pinned and stale).

        2b. xfs_buf_item_push() would fail to lock the buffer if it
           raced with pinning and do nothing.

        3. this resulted in neither xfs_inode_item_push() or
           xfs_buf_item_push() telling the xfsaild to issue a log
           force, the AIL would stop doing work as finished it's
           traverse, and hence tranaction reservations stalled until
           something else issued a log force and unpinned the tail
           of the log.

        4. the simple and obvious fix at the time was to have
           xfs_buf_trylock() do the same as xfs_buf_lock() and force
           the log when a buffer in the state that caused the stall
           was detected.

        <time passes>

        5a. delayed write buffers were reworked, completely changing
           how inode flushing and hence xfs_inode_item_push()

        5b. delayed write buffers were reworked, changing how
           xfs_buf_item_push() worked.

        6. several bugs in pinned/stale item handling were fixed,
           and now both xfs_inode_item_push() and
           xfs_buf_item_push() detect pinned inodes and buffers and
           tell the xfsaild to force the log appropriately.  This
           means xfs_buf_trylock() no longer needs to force the log
           to provide a get-out-of-xfsaild-stall-free mechanism.

        7. This log force while holding the ail lock bug is discovered.

So what I'm thinking is that as a result of 5+6, we have no need for
the log force in the xfs_buf_trylock code any more. The only place
we actually care about locking latency any more is in
_xfs_buf_find(), and when the trylock fails there we immediately run
xfs_buf_lock() (unless XBF_TRYLOCK was set) and it will do the log
force for us.

IOWs, I think the log force code in xfs_buf_trylock() is simply
redundant and we should just remove it. With the second patch in
this series, racing on the buffer being pinned will now do the right
thing (i.e. trigger a log force via the xfsaild) and so we end up
both simplifying the code and fixing the bug....


Dave Chinner

<Prev in Thread] Current Thread [Next in Thread>