[Top] [All Lists]

Re: XFS handling of synchronous buffers in case of EIO error

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: XFS handling of synchronous buffers in case of EIO error
From: Ajeet Yadav <ajeet.yadav.77@xxxxxxxxx>
Date: Wed, 5 Jan 2011 17:26:31 +0900
Cc: xfs@xxxxxxxxxxx
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=T2e93KKd//8sony6s8fneB9TtMCSew7mHRQq03OhlwQ=; b=xMm+CstLVLe35HiKupbKafrGMku1PeCUn5uSNJ3Qox11RKuZHCEqGXZctTeyuAuwUy z0KrIUNtdKtrFMCrq4EnpwJ7zBRa26EHhchZdzDAh4gb1AlIbeEgtaO68Nzl9Gip5Aku k0jLnegd09gWvk0hBcW4ySzRLdkex69j0JZ1o=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=Dh0+UZtTFmI2M8Jc3Yg9IpOlQdmTWUa63aE/PhTgXC+YJChvKVcFLrGO7UxonBI7lR 4UlyVo0T+OWrUqdzHwnSviyCR7Rw4alVN111zpwh49xhYdKHPuL7R8Un6QksDfKQCB7e oTWjMmQc1clteMDV2iaQGLeLzYGcdrVATJtRA=
In-reply-to: <20110104051947.GI15179@dastard>
References: <AANLkTi=Tmh9m_Rwy-bUZQEzcZ3M+6X9tZxFMO-J2Rvec@xxxxxxxxxxxxxx> <20101230231353.GC15179@dastard> <AANLkTi=OK4uGx5476ro8W47icu685gvQea43rNHozPKS@xxxxxxxxxxxxxx> <20110104051947.GI15179@dastard>
Thanks, I think its better to end this mail by rerering to your patch.

On Tue, Jan 4, 2011 at 2:19 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
On Fri, Dec 31, 2010 at 12:17:12PM +0530, Ajeet Yadav wrote:
> Dear Dave,
> Our Kernel is but XFS is backported from 2.6.34.
> But I have seen similar behaviour in another post related to process ls hang
> in
> *
> http://oss.sgi.com/pipermail/xfs/2010-December/048691.html
> *I have always seen the hang problem comes only if comes when b_relse !=
> NULL, and b_hold > 2
> I have made below workaround it solved the problem in our case because when
> USB is removed we know we get EIO error.
> But I think we need to review xfs_buf_error_relse() and xfs_buf_relse()
> considering  XBF_LOCK flow path.
> @@ -1047,9 +1047,19 @@ xfs_buf_iodone_callbacks(
>                         /* We actually overwrite the existing b-relse
>                            function at times, but we're gonna be shutting
> down
>                            anyway. */
> -                       XFS_BUF_SET_BRELSE_FUNC(bp,xfs_buf_error_relse);
> -                       XFS_BUF_DONE(bp);
> -                       XFS_BUF_FINISH_IOWAIT(bp);
> +                       if (XFS_BUF_GETERROR(bp) == EIO){
> +                               ASSERT(XFS_BUF_TARGET(bp) ==
> mp->m_ddev_targp);
> +                               XFS_BUF_SUPER_STALE(bp);
> +                               trace_xfs_buf_item_iodone(bp, _RET_IP_);
> +                               xfs_buf_do_callbacks(bp, lip);
> +                               XFS_BUF_SET_FSPRIVATE(bp, NULL);
> +                               XFS_BUF_CLR_IODONE_FUNC(bp);
> +                               xfs_biodone(bp);
> +                       } else {
> +
> XFS_BUF_SET_BRELSE_FUNC(bp,xfs_buf_error_relse);
> +                               XFS_BUF_DONE(bp);
> +                               XFS_BUF_FINISH_IOWAIT(bp);
> +                       }
>                 }
>                 return;
>         }

This won't work reliably because it only handles one specific type
of error. We can get more than just EIO back from the lower layers,
and so if the superblock write gets a different error then we'll
still get the same hang.

Effectively what you are doing here is running the
xfs_buf_error_relse() callback directly in line. This will result in
the buffer being unlocked before the error is pulled off the buffer
after xfs_buf_iowait() completes. Essentially that means that some
other thread can reuse the buffer and clear the error before the
waiter has received the error.

I think the correct fix is to call the bp->b_relse function when the
waiter is woken to clear the error and unlock the buffer. I've just
posted a patch to do this for 2.6.38, but it won't trivially backport
to 2.6.34 or 2.6.30 as the synchronous write interfaces into the
buffer cache have been cleaned up and simplified recently. It should
still be relatively easy to handle, though.


Dave Chinner

<Prev in Thread] Current Thread [Next in Thread>