xfs
[Top] [All Lists]

Re: [PATCH] xfs: pull up stack_switch check into xfs_bmapi_write

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: pull up stack_switch check into xfs_bmapi_write
From: Ben Myers <bpm@xxxxxxx>
Date: Fri, 18 Jan 2013 16:55:43 -0600
Cc: Brian Foster <bfoster@xxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130118001549.GL2498@dastard>
References: <1358446289-871-1-git-send-email-bfoster@xxxxxxxxxx> <20130117214108.GF27055@xxxxxxx> <20130117233922.GJ2498@dastard> <20130117235039.GY30652@xxxxxxx> <20130118001549.GL2498@dastard>
User-agent: Mutt/1.5.20 (2009-06-14)
Hey,

On Fri, Jan 18, 2013 at 11:15:49AM +1100, Dave Chinner wrote:
> On Thu, Jan 17, 2013 at 05:50:39PM -0600, Ben Myers wrote:
> > Hey Dave,
> > 
> > On Fri, Jan 18, 2013 at 10:39:22AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 17, 2013 at 03:41:08PM -0600, Ben Myers wrote:
> > > > Hey Brian,
> > > > 
> > > > On Thu, Jan 17, 2013 at 01:11:29PM -0500, Brian Foster wrote:
> > > > > The stack_switch check currently occurs in __xfs_bmapi_allocate,
> > > > > which means the stack switch only occurs when xfs_bmapi_allocate()
> > > > > is called in a loop. Pull the check up before the loop in
> > > > > xfs_bmapi_write() such that the first iteration of the loop has
> > > > > consistent behavior.
> > > > > 
> > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > > > > ---
> > > > > 
> > > > > I was reading through this code and confused myself over whether the 
> > > > > stack
> > > > > switch ever actually occurs. Eric and Ben pointed out on irc 
> > > > > (simultaneously,
> > > > > I might add) the surrounding loop that I had missed, but it wasn't 
> > > > > clear whether
> > > > > the behavior to enable the stack switch after the first iteration was
> > > > > intentional or not. I'm throwing this out there to either fix the 
> > > > > issue or seek
> > > > > out an explanation for the existing behavior. Thanks!
> > > > 
> > > > To me this looks to be the correct behavior.  It might be better to
> > > > just get rid of the XFS_BMAPI_STACK_SWITCH flag entirely.  Nice find.
> > > 
> > > Which would take it back to the original logic which always switched
> > > stacks and we know that caused significant metadata performance
> > > degradation in various workloads.
> > > 
> > > If we want to remove XFS_BMAPI_STACK_SWITCH, then we either need to
> > > solve either the stack overrun problem (not possible, AFAICT) or the
> > > metadata performance degradation as a result of always pushing
> > > allocation off into workqueues. So, unfortunately, until we have some
> > > other resolution, we stuck with it.... :/
> > 
> > Yeah, I went off the rails there.  I meant to suggest something more along 
> > the
> > lines of getting rid of the stack_switch member of the args structure, since
> > xfs_bmapi_write is the only caller of xfs_bmapi_allocate.  But it didn't 
> > come
> > out that way...  Anyway, what we have is just fine.
> 
> Oh, ok, I see what you mean. The code currently has an extra level
> of indirection (XFS_BMAPI_STACK_SWITCH -> args->stack_switch).

Exactly.  Thanks.  ;)

> That
> is in line with all the other XFS_BMAPI_* flags, though. i.e. they
> are interface flags and not propagated down through the
> struct xfs_bmalloc arguments.
>
> To tell the truth, I've previously considered just passing a flags
> field down rather than the current single bit variables. I've never
> done it though, because (IMO) it doesn't really improve the code
> significantly or reduce the footprint of the structure. Perhaps we
> should look at that again now that the code has been significantly
> factored....

Some of the args style structures seem pretty sizeable, but I think I'm more
concerned about readability than size.  It gets to be kind of a muddle in
through there copying all those flags around, and in this case we got bitten.

Regards,
        Ben

<Prev in Thread] Current Thread [Next in Thread>