[Top] [All Lists]

[PATCH] xfs: errors on sync superblock writes leave it locked

To: xfs@xxxxxxxxxxx
Subject: [PATCH] xfs: errors on sync superblock writes leave it locked
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 4 Jan 2011 15:50:09 +1100
From: Dave Chinner <dchinner@xxxxxxxxxx>

If we get an IO error on a synchronous superblock write, we attach a
error release function to it so that when the last reference goes
away the release function is called and the buffer is invalidated
and unlocked. The buffer is left locked until the release function
is called so that other concurrent users of the buffer will be
locked out until the buffer error is fully processed.

Unfortunately, for the superblock buffer the filesyetm itself holds
a reference to the buffer which prevents the reference count from
dropping to zero and the release function being called. As a result,
once an IO error occurs on a sync write, the buffer will never be
unlocked and all future attempts to lock the buffer will hang.

To make matters worse, this problems is not unique to such buffers;
if there is a concurrent _xfs_buf_find() running, the lookup will
grab a reference to the buffer and then wait on the buffer lock,
preventing the reference count from ever falling to zero and hence
unlocking the buffer.

As such, the whole b_relse function implementation is broken because
it cannot rely on the buffer reference count falling to zero to
unlock the errored buffer. The synchronous write error path is the
only path that uses this callback - it is used to ensure that the
synchronous waiter gets the buffer error before the error state is
cleared from the buffer by the release function.

Given that the only sychronous buffer writes now go through
xfs_bwrite() and the error path in question can only occur for a
write of a dirty, logged buffer, we can call the b_relse function
when an error is detected in xfs_bwrite() after calling
xfs_buf_iowait(). The subsequent xfs_buf_relse() call in
xfs_bwrite() will then unlock the buffer and everything should
continue as per normal.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
 fs/xfs/linux-2.6/xfs_buf.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c
index 92f1f2a..1775269 100644
--- a/fs/xfs/linux-2.6/xfs_buf.c
+++ b/fs/xfs/linux-2.6/xfs_buf.c
@@ -908,11 +908,8 @@ xfs_buf_rele(
        ASSERT(atomic_read(&bp->b_hold) > 0);
        if (atomic_dec_and_lock(&bp->b_hold, &pag->pag_buf_lock)) {
-               if (bp->b_relse) {
-                       atomic_inc(&bp->b_hold);
-                       spin_unlock(&pag->pag_buf_lock);
-                       bp->b_relse(bp);
-               } else if (!(bp->b_flags & XBF_STALE) &&
+               ASSERT(!bp->b_relse);
+               if (!(bp->b_flags & XBF_STALE) &&
                           atomic_read(&bp->b_lru_ref)) {
@@ -1112,8 +1109,16 @@ xfs_bwrite(
        error = xfs_buf_iowait(bp);
-       if (error)
+       if (error) {
+               /*
+                * If the error caused a release function to be set, call it
+                * now to clear the error from the buffer as we have already
+                * harvested it.
+                */
                xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
+               if (bp->b_relse)
+                       bp->b_relse(bp);
+       }
        return error;

<Prev in Thread] Current Thread [Next in Thread>