Re: [RFC PATCH v3 2/2] xfs: fix xfsaild hang due to lost wake ups

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: [RFC PATCH v3 2/2] xfs: fix xfsaild hang due to lost wake ups
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 24 May 2012 10:06:26 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4FBD2306.8090000@xxxxxxxxxx>
References: <1337704714-50235-1-git-send-email-bfoster@xxxxxxxxxx> <1337704714-50235-3-git-send-email-bfoster@xxxxxxxxxx> <20120523005830.GL25351@dastard> <4FBD2306.8090000@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, May 23, 2012 at 01:48:54PM -0400, Brian Foster wrote:
> On 05/22/2012 08:58 PM, Dave Chinner wrote:
> [snip]
> > 
> > Finally, rather than calling wake_up_process() in the
> > xfs_ail_push*() functions, call wake_up(&ailp->xa_idle); There
> > can only be one thread sleeping on that (the xfsaild) so there
> > is no need to use the wake_up_all() variant...
> > 
> > FWIW, you might be able to do this without the idle wait queue
> > and just use wake_up_process() - 
> > 
> 
> Hi Dave,
> 
> I have a working version of your suggested algorithm. It looks
> mostly the same with the exception of a spin_unlock fix. I also
> have the below version that uses a wait_queue and that I plan to
> test overnight tonight:

See my previous mail about using an idle queue.

>       while (!kthread_should_stop()) {
>               if (tout && tout <= 20)
>                       state = TASK_KILLABLE;
>               else
>                       state = TASK_INTERRUPTIBLE;
> 
>               prepare_to_wait(&ailp->xa_idle, &wait, state);
> 
>               spin_lock(&ailp->xa_lock);
>               /* barrier matches the xa_target update in xfs_ail_push() */
>               smp_rmb();
>               if (!xfs_ail_min(ailp) &&
>                   (ailp->xa_target == ailp->xa_target_prev)) {
>                       /* the ail is empty and no change to the push target - idle */
>                       spin_unlock(&ailp->xa_lock);
>                       schedule();
>               } else if (tout) {
>                       spin_unlock(&ailp->xa_lock);
>                       /* more work to do soon */
>                       schedule_timeout(msecs_to_jiffies(tout));
>               } else {
>                       spin_unlock(&ailp->xa_lock);
>               }

Three separate unlocks? That's a recipe for future disasters. How
about:

                if (!xfs_ail_min(ailp) &&
                    (ailp->xa_target == ailp->xa_target_prev)) {
                        /* the ail is empty and no change to the push target - idle */
                        spin_unlock(&ailp->xa_lock);
                        schedule();
                        tout = 0;
                        continue;
                }
                spin_unlock(&ailp->xa_lock);

                if (tout) {
                        /* more work to do soon */
                        schedule_timeout(msecs_to_jiffies(tout));
                }

That way we recheck the idle condition on wakeup from idle before
doing anything (i.e. handle spurious idle wakeups effectively). By
setting tout to zero, we then fall through immediately to
pushing the AIL if it was a real wakeup that moved the target....
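
To make the control flow above easier to check, here is a minimal
userspace sketch of just the decision the loop makes on each pass. It
is not kernel code and not the patch itself: struct ail_model,
enum ail_action and ail_next_action() are hypothetical names invented
for illustration, the "empty" flag stands in for !xfs_ail_min(ailp),
and locking and the actual sleeping are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the xfsaild loop inspects. */
struct ail_model {
	bool		empty;		/* stands in for !xfs_ail_min(ailp) */
	unsigned long	target;		/* stands in for ailp->xa_target */
	unsigned long	target_prev;	/* stands in for ailp->xa_target_prev */
};

/* What the loop would do next; these names are illustrative only. */
enum ail_action {
	AIL_IDLE,		/* schedule(), then recheck via continue */
	AIL_SLEEP_THEN_PUSH,	/* schedule_timeout(tout), then push */
	AIL_PUSH_NOW,		/* tout == 0: push immediately */
};

/*
 * One pass of the decision logic. The idle test comes first, so a
 * wakeup from idle always re-runs it before anything else happens.
 * Going idle clears *tout, so a later real wakeup skips the timed
 * sleep and falls straight through to pushing.
 */
static enum ail_action ail_next_action(const struct ail_model *ail, long *tout)
{
	assert(ail != NULL && tout != NULL);

	if (ail->empty && ail->target == ail->target_prev) {
		*tout = 0;
		return AIL_IDLE;
	}

	if (*tout)
		return AIL_SLEEP_THEN_PUSH;

	return AIL_PUSH_NOW;
}
```

In this model a spurious wakeup (empty AIL, unchanged target) simply
yields AIL_IDLE again, while a wakeup caused by a moved target yields
AIL_PUSH_NOW because the earlier idle pass already zeroed tout.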

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
