Bugzilla – Bug 922
kernel hangs in xlog_grant_log_space
Last modified: 2012-11-15 16:57:10 CST
* Overview When running a large number of file operations, occassionally an XFS filesystem will cause the kernel to hang. This can be reproduced easily using a set of scripts that perform file operations and the XFS partition's logsize is set to a size of 576b. * Steps to Reproduce 1) mkfs.xfs -b size=1024 -l size=576b <dev path> 2) mount the volume 3) copy check-files, create-files, and copy-files to partition (in the attached archive) 4) run ./create-files 5) run ./copy-files 6) wait about 2-4 hours, the dots will stop printing and check dmesg * Actual Results An xfs kernel task hangs, and a backtrace occurs. Output has been placed in this bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498 In addition are logs that have been requested in this email thread: http://oss.sgi.com/archives/xfs/2012-04/msg00951.html * Expected Results This should run for a very long time, and not cause hangs. * Build Date & Platform This has been tested on the following kernels which all exhibit the same failures: - 3.2.0-24 (Ubuntu Precise) - 3.4.0-rc4 - 3.0.29 - 3.1.10 - 3.2.15 - 3.3.2 * Additional Information This is the related Ubuntu bug with some additional information: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498 This is an email thread describing the issue: http://oss.sgi.com/archives/xfs/2012-04/msg00951.html Not sure if the following bug is related or not: http://oss.sgi.com/bugzilla/show_bug.cgi?id=854 In addition there is an older thread that could also be related: http://oss.sgi.com/archives/xfs/2011-11/msg00401.html
Created attachment 304 [details] Script to create files for reproducer.
Created attachment 305 [details] Script to copy files and cause hang.
Is the customer using this crazy small log size, or was that done to make this reproduceable?
(In reply to comment #3) > Is the customer using this crazy small log size, or was that done to make this > reproduceable? This was done to reproduce the issue. Changing the log size to the minimum seemed to produce the backtrace as the original failure. The description here should have part of the backtrace here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498 The syslog here should be the backtrace using the reproducer: https://launchpadlibrarian.net/102951054/syslog This has also been reproduced on 3.4-rc4, 3.2.0-24-server (Ubuntu precise), and a few other versions in between. This thread has some good info on the case so far: http://oss.sgi.com/archives/xfs/2012-04/msg00953.html
Just tested this with the xfs tree commit fb59581404ab7ec5075299065c22cb211a9262a9 on Nov 12th 2012, and I can still reproduce this issue.