Re: [PATCH v3] xfs: re-enable xfsaild idle mode and fix associated races

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH v3] xfs: re-enable xfsaild idle mode and fix associated races
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Tue, 03 Jul 2012 09:13:57 -0400
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20120702235106.GU19223@dastard>
References: <1340880776-45730-1-git-send-email-bfoster@xxxxxxxxxx> <20120702000712.GN19223@dastard> <4FF1A324.7070603@xxxxxxxxxx> <20120702235106.GU19223@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0
On 07/02/2012 07:51 PM, Dave Chinner wrote:
> On Mon, Jul 02, 2012 at 09:33:24AM -0400, Brian Foster wrote:
>> On 07/01/2012 08:07 PM, Dave Chinner wrote:
>>> On Thu, Jun 28, 2012 at 06:52:56AM -0400, Brian Foster wrote:
>>>> xfsaild idle mode logic currently leads to a couple hangs:
>>>>
>>>> 1.) If xfsaild is rescheduled in during an incremental scan
>>>>     (i.e., tout != 0) and the target has been updated since
>>>>     the previous run, we can hit the new target and go into
>>>>     idle mode with a still populated ail.
>>>> 2.) A wake up is only issued when the target is pushed forward.
>>>>     The wake up can race with xfsaild if it is currently in the
>>>>     process of entering idle mode, causing future wake up
>>>>     events to be lost.
>>>>
>>>> These hangs have been reproduced and verified as fixed by
>>>> running xfstests 273 in a loop on a slightly modified upstream
>>>> kernel. The kernel is modified to re-enable idle mode as
>>>> previously implemented (when count == 0) and with a revert of
>>>> commit 670ce93f, which includes performance improvements that
>>>> make this harder to reproduce.
>>>>
>>>> The solution, the algorithm for which has been outlined by
>>>> Dave Chinner, is to modify xfsaild to enter idle mode only when
>>>> the ail is empty and the push target has not been moved forward
>>>> since the last push.
>>>>
>>>> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
>>>
>>> Looks OK to me, and hasn't caused any problems here.
>>>
>>> Final question - did you confirm with powertop that the xfsaild is
>>> no longer causing wakeups a minute or two after you stop writing to
>>> the filesystem? (I haven't yet)
>>>
>>
>> I hadn't tested with powertop, but I had some tracepoints hacked in
>> around the idle/wake cases to verify the thread was actually scheduling
>> out.
> 
> If you've added tracepoints that were useful for
> debugging/verification, then send that as a patch as well. If users
> have trouble then simply asking them for event traces is very easy
> to do and gives us much better insight into what is happening....
> 
> You can't have enough tracepoints when things are going wrong ;)
> 

Ok, duly noted. What I have right now is scattered about a few branches
and not immediately presentable. When I get some time I'll fix them up
and post. If I remember correctly, I had covered: xfsaild end (count,
skip, target, etc.), xfsaild idle, xa_target update (xfs_ail_push()) and
xfsaild wake (which might be extraneous at this point).

Brian

>> FWIW, I just gave powertop a quick test and it appears to work as
>> expected...
>>
>> With current upstream on my rhel6.3 VM, I see the following after
>> running a 'touch /mnt/file;sync' and letting the fs idle for a bit:
>>
>>    0.5% ( 19.9)      xfsaild/vdb1 : xfsaild (process_timeout)
>>
>> and this drops off completely with the patch applied. Thanks for the tip.
> 
> Great, then it is working exactly as expected.
> 
> Cheers,
> 
> Dave.
> 
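
For readers following the thread, the idle condition from the patch description (idle only when the ail is empty and the push target has not moved forward since the last push) can be sketched in plain C. This is a minimal userspace model, not the kernel code: struct ail_model, its fields and ail_can_idle() are hypothetical names made up for illustration, and the locking and wakeup ordering that the real fix depends on are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of the state the push thread looks at.
 * None of these names come from the XFS source.
 */
struct ail_model {
	bool     empty;              /* no items remain in the ail */
	uint64_t target;             /* current push target */
	uint64_t last_pushed_target; /* target the last push ran against */
};

/*
 * Idle only when both conditions from the patch description hold:
 * the ail is empty AND the target has not moved since the last push.
 * Checking either one alone would allow the hangs described in the
 * thread (idling with a populated ail, or missing a target update).
 */
static bool ail_can_idle(const struct ail_model *ail)
{
	return ail->empty && ail->target == ail->last_pushed_target;
}
```

In this model, a target update that lands between two pushes leaves target != last_pushed_target, so the thread does one more pass instead of going idle.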
