On Mon, Jun 18, 2012 at 12:25:37PM -0600, Andreas Dilger wrote:
> On 2012-06-18, at 6:08 AM, Christoph Hellwig wrote:
> > May saw the release of Linux 3.4, including a decent-sized XFS update.
> > Remarkable XFS features in Linux 3.4 include moving over all metadata
> > updates to use transactions, the addition of a work queue for the
> > low-level allocator code to avoid stack overflows due to extreme stack
> > use in the Linux VM/VFS call chain,
> This is essentially a workaround for too-small stacks in the kernel,
> which we've had to do at times as well, by doing work in a separate
> thread (with a new stack) and waiting for the results? This is a
> generic problem that any reasonably complex filesystem will have when
> running under memory pressure on a complex storage stack (e.g. LVM +
> iSCSI), and the workaround causes unnecessary context switching.
> Any thoughts on a better way to handle this? Or will the 4kB stack
> limit stay, leaving us to hack around it with repeated kmalloc on
> callpaths for any struct over a few tens of bytes, memory pools all
> over the place, and "forking" over to other threads to continue the
> stack consumption for another 4kB?
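
For anyone following along outside the kernel, the pattern described above
(hand the deep call chain to a thread that has a fresh stack, then block
until it finishes) can be sketched in userspace C with POSIX threads. This
is only an analogy, not the XFS allocator code: the names deep_call,
consume_stack, worker and run_on_fresh_stack are hypothetical, and the
256-byte frame is an arbitrary stand-in for a stack-hungry call path.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical record shared between the caller and the worker thread. */
struct deep_call {
    int depth;    /* input: how deep the worker should recurse */
    long result;  /* output: filled in by the worker */
};

/* Stand-in for a stack-hungry call chain: every frame carries a buffer. */
static long consume_stack(int depth)
{
    volatile char frame[256];

    frame[0] = (char)depth;
    if (depth == 0)
        return 0;
    /* The trailing term is always 0; it keeps the buffer live per frame. */
    return consume_stack(depth - 1) + 1 + (frame[0] - (char)depth);
}

/* Thread entry point: runs the deep call chain on the new thread's stack. */
static void *worker(void *arg)
{
    struct deep_call *call = arg;

    call->result = consume_stack(call->depth);
    return NULL;
}

/*
 * Run consume_stack(depth) on a new thread whose stack is stack_bytes
 * large, and block until it is done. Returns the result, or -1 if the
 * thread could not be set up, started or joined.
 */
long run_on_fresh_stack(int depth, size_t stack_bytes)
{
    struct deep_call call = { .depth = depth, .result = -1 };
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(&tid, &attr, worker, &call);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return -1;

    /* The "waiting for the results" step: the caller sleeps here. */
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return call.result;
}
```

Calling run_on_fresh_stack(1000, 1 << 20) recurses 1000 frames deep on a
1 MiB stack that belongs to the worker, so none of that depth lands on the
caller's stack. The price is the one named above: a thread hand-off and a
blocking wait on every such call.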
FWIW, I think your characterization of the problem as a 'workaround for
too-small stacks in the kernel' is about right. I don't think any of the XFS
folk were very happy about having to do this, but in the near term it doesn't
seem that we have a good alternative. I'm glad to see that there are others
with the same pain, so maybe we can build some support for upping the stack