[PATCH 09/10] xfs: on-stack delayed write buffer lists
Mark Tinguely
tinguely at sgi.com
Fri Apr 20 13:19:46 CDT 2012
On 03/27/12 11:44, Christoph Hellwig wrote:
> Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
> and write back the buffers per-process instead of by waking up xfsbufd.
>
> This is now easily doable given that we have very few places left that write
> delwri buffers:
>
> - log recovery:
> Only done at mount time, and already forcing out the buffers
> synchronously using xfs_flush_buftarg
>
> - quotacheck:
> Same story.
>
> - dquot reclaim:
> Writes out dirty dquots on the LRU under memory pressure. We might
> want to look into doing more of this via xfsaild, but it's already
> more optimal than the synchronous inode reclaim that writes each
> buffer synchronously.
>
> - xfsaild:
> This is the main beneficiary of the change. By keeping a local list
> of buffers to write we reduce latency of writing out buffers, and
> more importantly we can remove all the delwri list promotions which
> were hitting the buffer cache hard under sustained metadata loads.
>
> The implementation is very straightforward - xfs_buf_delwri_queue now gets
> a new list_head pointer that it adds the delwri buffers to, and all callers
> need to eventually submit the list using xfs_buf_delwri_submit or
> xfs_buf_delwri_submit_nowait. Buffers that already are on a delwri list are
> skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
> list. The biggest change to pass down the buffer list was done to the AIL
> pushing. Now that we operate on buffers the trylock, push and pushbuf log
> item methods are merged into a single push routine, which tries to lock the
> item, and if possible add the buffer that needs writeback to the buffer list.
> This leads to much simpler code than the previous split but requires the
> individual IOP_PUSH instances to unlock and reacquire the AIL around calls
> to blocking routines.
>
> Given that xfsaild now also handles writing out buffers, the conditions for
> log forcing and the sleep times needed some small changes. The most
> important one is that we consider an AIL busy as long as we still have buffers
> to push, and the other one is that we do increment the pushed LSN for
> buffers that are under flushing at this moment, but still count them towards
> the stuck items for restart purposes. Without this we could hammer on stuck
> items without ever forcing the log and not make progress under heavy random
> delete workloads on fast flash storage devices.
>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
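For readers following the quoted description, the caller-owned list pattern can be sketched in userspace C roughly as below. This is only an illustration under stated assumptions: struct buf, struct buf_list, delwri_queue and delwri_submit are hypothetical stand-ins, not the kernel's xfs_buf code, and "writing" a buffer is reduced to setting a flag. It shows the two properties the description relies on: a buffer already queued on one list is skipped when queued again, and the caller that owns the list is responsible for submitting it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not the real struct xfs_buf. */
struct buf {
    bool queued;        /* already sitting on some delwri list */
    bool written;       /* set once the simulated write has completed */
    struct buf *next;   /* link within the caller's list */
};

/* Caller-owned list, typically a local variable on the caller's stack. */
struct buf_list {
    struct buf *head;
    struct buf **tail;
};

static void buf_list_init(struct buf_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
}

/* Queue bp on the caller's list; skip it if it is already queued elsewhere. */
static bool delwri_queue(struct buf *bp, struct buf_list *l)
{
    if (bp->queued)
        return false;
    bp->queued = true;
    bp->next = NULL;
    *l->tail = bp;
    l->tail = &bp->next;
    return true;
}

/* "Write" every queued buffer, empty the list, and return the count. */
static int delwri_submit(struct buf_list *l)
{
    int n = 0;
    struct buf *bp = l->head;

    while (bp) {
        struct buf *next = bp->next;

        assert(bp->queued);
        bp->written = true;   /* simulated synchronous write */
        bp->queued = false;
        bp->next = NULL;
        n++;
        bp = next;
    }
    buf_list_init(l);
    return n;
}
```

A caller would declare a struct buf_list locally, call buf_list_init on it, queue buffers with delwri_queue, and finish with delwri_submit; a list that is queued to but never submitted leaves its buffers marked as queued and unwritten.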
Test 106 runs to completion with patch 06.
Patches 07 and 08 do not compile without patch 09.
Starting with patch 09, I get the following hang on every run of test 106:
PID: 27992 TASK: ffff8808310d00c0 CPU: 2 COMMAND: "mount"
#0 [ffff880834237938] __schedule at ffffffff81417200
#1 [ffff880834237a80] schedule at ffffffff81417574
#2 [ffff880834237a90] schedule_timeout at ffffffff81415805
#3 [ffff880834237b30] wait_for_common at ffffffff81416a67
#4 [ffff880834237bc0] wait_for_completion at ffffffff81416bd8
#5 [ffff880834237bd0] xfs_buf_iowait at ffffffffa04fc5a5 [xfs]
#6 [ffff880834237c00] xfs_buf_delwri_submit at ffffffffa04fe4b9 [xfs]
#7 [ffff880834237c40] xfs_qm_quotacheck at ffffffffa055cb2d [xfs]
#8 [ffff880834237cc0] xfs_qm_mount_quotas at ffffffffa055cdf0 [xfs]
#9 [ffff880834237cf0] xfs_mountfs at ffffffffa054c041 [xfs]
#10 [ffff880834237d40] xfs_fs_fill_super at ffffffffa050ca80 [xfs]
#11 [ffff880834237d70] mount_bdev at ffffffff81150c5c
#12 [ffff880834237de0] xfs_fs_mount at ffffffffa050ac00 [xfs]
#13 [ffff880834237df0] mount_fs at ffffffff811505f8
#14 [ffff880834237e40] vfs_kern_mount at ffffffff8116c070
#15 [ffff880834237e80] do_kern_mount at ffffffff8116c16e
#16 [ffff880834237ec0] do_mount at ffffffff8116d6f0
#17 [ffff880834237f20] sys_mount at ffffffff8116d7f3
#18 [ffff880834237f80] system_call_fastpath at ffffffff814203b9
The workers seem to be idle. For example, xfsaild:
PID: 27676 TASK: ffff880832880240 CPU: 3 COMMAND: "xfsaild/sda7"
#0 [ffff880832933cb0] __schedule at ffffffff81417200
#1 [ffff880832933df8] schedule at ffffffff81417574
#2 [ffff880832933e08] schedule_timeout at ffffffff81415805
#3 [ffff880832933ea8] xfsaild at ffffffffa0555935 [xfs]
#4 [ffff880832933ee8] kthread at ffffffff8105dd6e
#5 [ffff880832933f48] kernel_thread_helper at ffffffff814216a4
The hang is on the third quotacheck.
Should be easy to duplicate this.
--Mark Tinguely.