On Thu, Oct 24, 2013 at 01:48:03AM -0700, Christoph Hellwig wrote:
> On Thu, Oct 24, 2013 at 02:25:10PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > Page cache allocation doesn't always go through ->begin_write and
> > hence we don't always get the opportunity to set the allocation
> > context to GFP_NOFS. Failing to do this means we open up the direct
> > reclaim stack to recurse into the filesystem and consume a
> > significant amount of stack.
> >
> > On RHEL6.4 kernels we are seeing ra_submit() and
> > generic_file_splice_read() from an nfsd context recursing into the
> > filesystem via the inode cache shrinker and evicting inodes. This is
> > causing truncation to be run (e.g. EOF block freeing) and causing
> > bmap btree block merges and free space btree block splits to occur.
> > These btree manipulations are occurring with the call chain already
> > 30 functions deep and hence there is not enough stack space to
> > complete such operations.
> It seems like we really should fix this in the VFS as it could affect
> all non-trivial filesystems.
Sure, if you want to. But doing that shouldn't prevent this fix from
being committed in the meantime, especially as other filesystems
already use this method for avoiding these problems.