On Thu, Mar 31, 2011 at 03:18:55PM +0100, simon@xxxxxxxxxxxxxxxxx wrote:
> x86_64
> Storage is fibrechannel attached and the filesystem is hosted on an
> LVM block device that concatenates four partitions, so the block access
> is going via a stack of LVM, multipath and Q-logic drivers.
> Network is Intel 10G ethernet (ixgbe driver)
> Kernel is 2.6.32 with Debian patches. (both kernels)
Yes, this looks very much like a stack overflow caused by direct
reclaim entering a filesystem (XFS in this case) from a context that
already uses a lot of stack, with a deep storage stack underneath.
The real fix would be to disable direct reclaim, which the VM
maintainers refuse to do. We finally gave in and added a hack inside
XFS, similar to what the other modern filesystems do, to prevent this.
Try backporting commits:
"xfs: skip writeback from reclaim context"
and
"xfs: allow writeback from kswapd"
from current mainline to avoid these kinds of issues.
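
For illustration only, here is a rough userspace sketch of the kind of
check those two commits add to the XFS writepage path, as far as I
recall them: refuse writeback when called from direct reclaim, but still
allow it from kswapd, which runs on its own shallow stack. The flag
values and the helper name below are made up for the example; the real
code tests current->flags inside the kernel and redirties the page
instead of writing it.

```c
#include <stdbool.h>

/*
 * Illustrative stand-ins for the kernel's per-task flags.
 * The numeric values are assumptions, not the kernel's definitions.
 */
#define PF_MEMALLOC 0x1u	/* task is in memory reclaim */
#define PF_KSWAPD   0x2u	/* task is the kswapd daemon */

/*
 * Hypothetical helper: returns true when ->writepage should refuse to
 * write the page (and redirty it instead). That is the case for direct
 * reclaim (PF_MEMALLOC set, PF_KSWAPD clear), but not for kswapd or for
 * ordinary writeback.
 */
bool skip_writeback(unsigned int task_flags)
{
	return (task_flags & (PF_MEMALLOC | PF_KSWAPD)) == PF_MEMALLOC;
}
```

With that check, skip_writeback(PF_MEMALLOC) is true, while both
skip_writeback(0) and skip_writeback(PF_MEMALLOC | PF_KSWAPD) are false.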