xfs
[Top] [All Lists]

Re: [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_pus

To: Xu Wang <xuw@xxxxxxxxxx>
Subject: Re: [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Fri, 10 Oct 2014 08:07:54 -0400
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <2090211370.42327532.1412926306812.JavaMail.zimbra@xxxxxxxxxx>
References: <929991500.42327290.1412926226461.JavaMail.zimbra@xxxxxxxxxx> <2090211370.42327532.1412926306812.JavaMail.zimbra@xxxxxxxxxx>
User-agent: Mutt/1.5.23 (2014-03-12)
On Fri, Oct 10, 2014 at 03:31:46AM -0400, Xu Wang wrote:
> From 59c6babab7c4ce8708036018143d2acc1477cc7f Mon Sep 17 00:00:00 2001
> From: George Wang <xuw@xxxxxxxxxx>
> Date: Fri, 10 Oct 2014 15:02:14 +0800
> Subject: [PATCH] [PATCH] xfs: lock for removing item from cil->xc_cil in
>  xlog_cil_push
> 
> There is a race condition when xlog_cil_force_lsn and xlog_cil_push for
> cil->xc_cil items. When function xlog_cil_push_now called in
> xlog_cil_force_lsn, and the xlog_cil_push run in workqueue just reaches the 
> point for
> setting cil->xc_current_sequence, the xlog_cil_force_lsn got the lock
> and check the "sequence == cil->xc_current_sequence &&
> !list_empty(&cil->xc_cil)" for restart. The statement "sequence ==
> cil->xc_current_sequence" is true, but the cil->xc_cil is empty
> according to the xlog_cil_push removed the items in cil->xc_cil without
> lock protectionl. So the function xlog_cil_force_lsn will return
> NULLCOMMITLSN, which means the cil log for current sequence will not be
> submitted to disk. And if there is no more operations(for example, when
> unmount fs, this situation happened in
> xfs_unmountfs->xfs_log_sbcount->xfs_trans_commit), the xfs will hang up.
> 
> Signed-off-by: George Wang <xuw@xxxxxxxxxx>
> ---

Isn't this fixed by the following commit?

8af3dcd3 xfs: xlog_cil_force_lsn doesn't always wait correctly

With that patch, xlog_cil_push() adds the ctx to the committing list
before draining the ctx and updating the current sequence number.
xlog_cil_force_lsn() walks the committing list and checks the sequence
number and ctx list, all under the push lock.

That means that xlog_cil_force_lsn() should see the ctx either on the
committing list and wait for it (commit_lsn != NULL), or not on the
committing list at all. If it's not on the committing list yet,
xlog_cil_push() won't have either drained the ctx or updated the
sequence number, as it adds to the committing list first and that
requires the push lock (which xlog_cil_force_lsn() holds and doesn't
release until after the restart check).

Am I missing something? Do you reproduce a problem with the latest tree
that includes the above patch?

Brian

>  fs/xfs/xfs_log_cil.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index f6b79e5..97ba527 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -478,6 +478,8 @@ xlog_cil_push(
>      */
>     lv = NULL;
>     num_iovecs = 0;
> +
> +   spin_lock(&cil->xc_push_lock);
>     while (!list_empty(&cil->xc_cil)) {
>         struct xfs_log_item *item;
> 
> @@ -530,7 +532,6 @@ xlog_cil_push(
>      * against the current sequence in log forces without risking
>      * deferencing a freed context pointer.
>      */
> -   spin_lock(&cil->xc_push_lock);
>     cil->xc_current_sequence = new_ctx->sequence;
>     list_add(&ctx->committing, &cil->xc_committing);
>     spin_unlock(&cil->xc_push_lock);
> -- 
> 1.9.3
> 
> 
> -- 
> George Wang çæ
> 
> Kernel Quantity Engineer
> Red Hat Software (Beijing) Co.,Ltd
> IRC:xuw
> Tel:+86-010-62608041
> Phone:15901231579
> 9/F, Tower C, Raycom
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

<Prev in Thread] Current Thread [Next in Thread>