On Mon, Jul 18, 2011 at 12:00:46PM -0400, Christoph Hellwig wrote:
> On Mon, Jul 18, 2011 at 01:49:47PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > We currently have significant issues with the amount of stack that
> > allocation in XFS uses, especially in the writeback path. We can
> > easily consume 4k of stack between mapping the page, manipulating
> > the bmap btree and allocating blocks from the free list. Not to
> > mention btree block readahead and other functionality that issues IO
> > in the allocation path.
> > As a result, we can no longer fit allocation in the writeback path
> > in the stack space provided on x86_64. To alleviate this problem,
> > introduce an allocation workqueue and move all allocations to a
> > separate context. This can be easily added as an interposing layer
> > into xfs_alloc_vextent(), which takes a single argument structure
> > and does not return until the allocation is complete or has failed.
> I've mentioned before that I really don't like it, but I suspect there's
> not much of a way around it given the small stacks, and the
> significant amount of stack that's already used above and below XFS.
> Can we at least have a sysctl knob or mount option to switch back to
> direct allocator calls so that we can still debug any performance
> or other issues with this one?
Honestly, I'd prefer not to do that because it's a slippery slope.
I've got plenty more "do stuff in the background via workqueues"
patches lined up, so if we start adding knobs/mount options to turn
each of them off "just in case there's an issue", we'll end up with
an unmanageable pile of them.
So far I haven't found any issues at all, and I've been running with
this split allocation stack in -all- my performance testing for the
past 2-3 months. I know that is not conclusive, but if the
benchmarks I've been using to improve XFS performance over the past
18 months don't show regressions, that's fairly indicative that
most workloads won't even notice the change....