To: Chris Mason <chris.mason@xxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>, John Berthels <john@xxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, Nick Gregory <nick@xxxxxxxxx>, Rob Sanderson <rob@xxxxxxxxx>, xfs@xxxxxxxxxxx, linux-mm@xxxxxxxxx
Subject: Re: PROBLEM + POSS FIX: kernel stack overflow, xfs, many disks, heavy write load, 8k stack, x86-64
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 12 Apr 2010 11:01:27 +1000
In-reply-to: <20100409181108.GG13327@think>
References: <20100407140523.GJ11036@dastard> <4BBCAB57.3000106@xxxxxxxxx> <20100407234341.GK11036@dastard> <20100408030347.GM11036@dastard> <4BBDC92D.8060503@xxxxxxxxx> <4BBDEC9A.9070903@xxxxxxxxx> <20100408233837.GP11036@dastard> <20100409113850.GE13327@think> <4BBF6C51.5030203@xxxxxxxxxxx> <20100409181108.GG13327@think>
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Apr 09, 2010 at 02:11:08PM -0400, Chris Mason wrote:
> On Fri, Apr 09, 2010 at 01:05:05PM -0500, Eric Sandeen wrote:
> > Chris Mason wrote:
> > 
> > > shrink_zone on my box isn't 500 bytes, but let's try the easy stuff
> > > first.  This is against .34, if you have any trouble applying to .32,
> > > just add the word noinline after the word static on the function
> > > definitions.
> > > 
> > > This makes shrink_zone disappear from my check_stack.pl output.
> > > Basically I think the compiler is inlining the shrink_active_list and
> > > shrink_inactive_list code into shrink_zone.
> > > 
> > > -chris
> > > 
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 79c8098..c70593e 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -620,7 +620,7 @@ static enum page_references page_check_references(struct page *page,
> > >  /*
> > >   * shrink_page_list() returns the number of reclaimed pages
> > >   */
> > > -static unsigned long shrink_page_list(struct list_head *page_list,
> > > +static noinline unsigned long shrink_page_list(struct list_head *page_list,
> > 
> > FWIW akpm suggested that I add:
> > 
> > /*
> >  * Rather than using noinline to prevent stack consumption, use
> >  * noinline_for_stack instead.  For documentation reasons.
> >  */
> > #define noinline_for_stack noinline
> > 
> > so maybe for a formal submission that'd be good to use.
> 
> Oh yeah, I forgot about that one.  If the patch actually helps we can
> switch it.

Well, given that the largest stack overflow reported was about 800
bytes, I don't think it's enough. All the fat has been trimmed from
XFS long ago, and there isn't that much in the generic code paths
to trim. And if we consider that this isn't including a significant
storage subsystem (i.e. NFS on top and stacked DM+MD+FC below), then
trimming a few hundred bytes is not enough to prevent an 8k stack
being blown sky high.
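
For anyone following along, here is a minimal userspace sketch of what
the noinline/noinline_for_stack annotation quoted above is meant to do.
It is not kernel code: the two macros are stand-ins written with the
GCC/Clang attribute syntax, and count_nonzero() and scan_two() are
hypothetical names made up for illustration. The idea is only that a
helper with a large local buffer keeps that buffer in its own frame
instead of having it folded into the caller's frame by inlining.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel macros (assumes GCC or Clang). */
#define noinline           __attribute__((noinline))
#define noinline_for_stack noinline

/*
 * Hypothetical helper with a large on-stack scratch buffer.  Marked
 * noinline_for_stack so the 512 bytes are only on the stack while this
 * function is actually running, not for the caller's whole lifetime.
 */
static noinline_for_stack size_t count_nonzero(const unsigned char *src,
					       size_t len)
{
	unsigned char scratch[512];
	size_t i, n = 0;

	assert(src != NULL);

	/* Clamp to the buffer size; this sketch ignores longer inputs. */
	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memcpy(scratch, src, len);

	for (i = 0; i < len; i++)
		if (scratch[i])
			n++;
	return n;
}

/*
 * Hypothetical caller.  Without the annotation the compiler is free to
 * inline the helper here, which can grow this function's frame by the
 * size of the helper's locals.
 */
size_t scan_two(const unsigned char *a, size_t alen,
		const unsigned char *b, size_t blen)
{
	return count_nonzero(a, alen) + count_nonzero(b, blen);
}
```

Note that this only moves stack usage out of the caller's frame. If the
helper itself sits on the deepest call chain, its frame is still there,
which is why this kind of change saves a few hundred bytes at best.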

That is why I was saying I'm not sure what the best way to solve the
problem is - I've got a couple of ideas for fixing the problem in
XFS once and for all, but I'm not sure yet whether they will fly,
and I haven't written any code....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
