We recently upgraded the OS on one of our servers, and since then we
have been experiencing frequent stalls of the XFS filesystem on it.
Other filesystems on the machine seem to still respond fine while XFS
hangs. The stalls sometimes last for around 30 minutes, during which all
attempts to access that filesystem hang completely - after that, the
filesystem suddenly responds instantly again, as if there had never been
any problem. The dmesg is full of these messages while it stalls:
XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250)
These also occur from time to time without the filesystem stalling (or
at least not noticeably) - the messages appear about once every two
hours, the stalls about once a day.
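To put firmer numbers on those frequencies, the warnings could be tallied from saved dmesg output. The following is only a minimal sketch: it assumes the default "[seconds.microseconds]" timestamp prefix that dmesg prints (seconds since boot), and the function name count_deadlock_warnings and the per-hour bucketing are illustrative choices, not part of any XFS tooling.

```python
import re
from collections import Counter

# Matches the warning quoted above. The leading "[seconds.micros]" timestamp
# is optional; it is assumed to be the default dmesg format (time since boot).
PATTERN = re.compile(
    r"^(?:\[\s*(?P<ts>\d+\.\d+)\]\s*)?"
    r".*XFS: possible memory allocation deadlock in (?P<func>\w+) "
    r"\(mode:(?P<mode>0x[0-9a-fA-F]+)\)"
)


def count_deadlock_warnings(lines):
    """Count matching warnings per hour since boot.

    Lines without a recognizable timestamp at the start of the line
    are counted under the key None.
    """
    per_hour = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # unrelated kernel message
        ts = match.group("ts")
        hour = int(float(ts) // 3600) if ts is not None else None
        per_hour[hour] += 1
    return per_hour
```

Feeding it the lines of a saved dmesg dump (for example an open file object) returns a Counter keyed by hour since boot, which can then be compared against the times at which stalls were observed.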
Google did point me to some reports of these messages occurring at the
end of 2013, but the kernels in question should all have had the fixes
proposed back then - although one message back then suggested there were
more places where this problem could occur that were not fixed yet.
Kernels used were:
- Ubuntu 3.13.0-44 - shows stalls, although according to
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1382333 it has the fix
- Ubuntu 3.16.0-31 - shows stalls
- Ubuntu 3.2.0-various - no stalls in more than 1 year
We can actually still boot the machine with the 3.2.0 kernel, and it
will run absolutely fine, but as that kernel will not be supported
forever, I do not consider that a permanent solution.
The machine should not be low on memory, the disk array is far from its
limits, and the I/O load is mostly reads with very few writes, as
this is a public FTP server.
I have tried to collect some information, available at
Michael Meier, Zentrale Systeme
Regionales Rechenzentrum Erlangen
Martensstrasse 1, 91058 Erlangen, Germany
Tel.: +49 9131 85-28973, Fax: +49 9131 302941