[PATCH] xfs: pull up stack_switch check into xfs_bmapi_write
Ben Myers
bpm at sgi.com
Fri Jan 18 16:55:43 CST 2013
Hey,
On Fri, Jan 18, 2013 at 11:15:49AM +1100, Dave Chinner wrote:
> On Thu, Jan 17, 2013 at 05:50:39PM -0600, Ben Myers wrote:
> > Hey Dave,
> >
> > On Fri, Jan 18, 2013 at 10:39:22AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 17, 2013 at 03:41:08PM -0600, Ben Myers wrote:
> > > > Hey Brian,
> > > >
> > > > On Thu, Jan 17, 2013 at 01:11:29PM -0500, Brian Foster wrote:
> > > > > The stack_switch check currently occurs in __xfs_bmapi_allocate,
> > > > > which means the stack switch only occurs when xfs_bmapi_allocate()
> > > > > is called in a loop. Pull the check up before the loop in
> > > > > xfs_bmapi_write() such that the first iteration of the loop has
> > > > > consistent behavior.
> > > > >
> > > > > Signed-off-by: Brian Foster <bfoster at redhat.com>
> > > > > ---
> > > > >
> > > > > I was reading through this code and confused myself over whether the stack
> > > > > switch ever actually occurs. Eric and Ben pointed out on irc (simultaneously,
> > > > > I might add) the surrounding loop that I had missed, but it wasn't clear whether
> > > > > the behavior to enable the stack switch after the first iteration was
> > > > > intentional or not. I'm throwing this out there to either fix the issue or seek
> > > > > out an explanation for the existing behavior. Thanks!
> > > >
> > > > To me this looks to be the correct behavior. It might be better to
> > > > just get rid of the XFS_BMAPI_STACK_SWITCH flag entirely. Nice find.
> > >
> > > Which would take it back to the original logic which always switched
> > > stacks and we know that caused significant metadata performance
> > > degradation in various workloads.
> > >
> > > If we want to remove XFS_BMAPI_STACK_SWITCH, then we either need to
> > > solve either the stack overrun problem (not possible, AFAICT) or the
> > > metadata performance degradation as a result of always pushing
> > > allocation off into workqueues. So, unfortunately, until we have some
> > > other resolution, we stuck with it.... :/
> >
> > Yeah, I went off the rails there. I meant to suggest something more along the
> > lines of getting rid of the stack_switch member of the args structure, since
> > xfs_bmapi_write is the only caller of xfs_bmapi_allocate. But it didn't come
> > out that way... Anyway, what we have is just fine.
>
> Oh, ok, I see what you mean. The code currently has an extra level
> of indirection (XFS_BMAPI_STACK_SWITCH -> args->stack_switch).
Exactly. Thanks. ;)
> That
> is in line with all the other XFS_BMAPI_* flags, though. i.e. they
> are interface flags and not propagated down through the
> struct xfs_bmalloc arguments.
>
> To tell the truth, I've previously considered just passing a flags
> field down rather than the current single bit variables. I've never
> done it though, because (IMO) it doesn't really improve the code
> significantly or reduce the footprint of the structure. Perhaps we
> should look at that again now that the code has been significantly
> factored....
Some of the args style structures seem pretty sizeable, but I think I'm more
concerned about readability than size. It gets to be kind of a muddle in
through there copying all those flags around, and in this case we got bitten.
Regards,
Ben
More information about the xfs
mailing list