https://bugzilla.kernel.org/show_bug.cgi?id=65321
Bug ID: 65321
Summary: XFS mount hangs with XFS_WANT_CORRUPTED_GOTO; impairs
system functionality
Product: File System
Version: 2.5
Kernel Version: 3.12.1
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: XFS
Assignee: xfs-masters@xxxxxxxxxxx
Reporter: christian@xxxxxxxxxx
Regression: No
I've got a corrupted XFS file system on a USB hard disk, courtesy of a TV with
USB storage capability. This particular TV model is known to corrupt file
systems from time to time. The problem has persisted for a long time and occurs
regularly. Usually, I can repair the file system by attaching the USB disk to
one of my Gentoo boxes, mounting it, running xfs_repair just for good
measure, and deleting some metafiles. Those three steps have solved the various
problems I have encountered so far.
This time they didn't solve anything. Instead, three issues occurred before I
could carry out those steps:
1) I cannot mount the corrupted file system; the mount command hangs (nor
can I run "xfs_repair", which suggests mounting the partition to replay the
journal :)
2) the only solution seems to be 'xfs_repair -L' (but I'd like to keep
the data intact)
3) my USB subsystem hung completely; the kernel shows various issues, though it
is more or less usable apart from USB
Issue 3) is definitely caused by the PAX features being enabled. With a kernel
without PAX/Grsecurity, the USB subsystem shows no problems. I will report that
to the guys at grsecurity.net. Not your concern.
Issue 1), on the other hand, might be in your domain. Issue 2) is probably the
domain of the manufacturer of the TV, but maybe you have an idea that
avoids using 'xfs_repair -L'.
I encounter the following symptoms:
a) the "sync" command stalls; so do "xfs_repair" and another "mount" applied to
the corrupt partition; stalling means that I cannot even kill them with
"kill -9"
b) I cannot reboot cleanly anymore; the console hangs right after the "powering
down now" message; the system is still running and I can log in at another
console, but ultimately I have to flip the power switch to regain full
functionality (I think that is because of a stalling "sync")
c) suspend-to-ram is not possible; the sleep indicator LED keeps blinking
(again, maybe because of hanging at sync)
d) however, I can still mount and unmount a remote sshfs, or any other file
system, after the problem occurs
I have similar symptoms on two systems based on Gentoo (one x86, one x86-64);
I will attach the dmesg output of both, where the last line in each file is the
output of "uname -a". Both systems have a kernel with PAX enabled. But when
investigating the problem I relied on a non-hardened (vanilla) kernel, in which
the USB subsystem has no problems.
With the Gentoo kernel 3.4.67, the mount command does not hang but instead
gives the following error message:
> mount: mount /dev/sdb1 on /mnt/standard failed: Die Struktur muss bereinigt
> werden
In English the message reads roughly: "Structure needs cleaning". I will attach
that dmesg as well, along with the kernel's configuration. This behaviour is
more like what I'd expect, though I had hoped the disk might be salvageable
without dropping the journal. The hanging 'mount' seems to have existed at
least since 3.10.1; I tried that kernel, too.
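As far as I can tell, the German text above is the localized message for the
errno EUCLEAN, which the XFS code uses for "file system corrupted"
(EFSCORRUPTED is defined as EUCLEAN in the kernel sources). A minimal Python
sketch to look up the untranslated name and message on a Linux box; the exact
wording printed depends on the C library and locale, so treat the output as
indicative only:

```python
import errno
import os

# EUCLEAN is only defined on Linux; XFS returns it (as EFSCORRUPTED)
# when it detects on-disk corruption.
code = errno.EUCLEAN

# Symbolic name of the error code, independent of the locale.
name = errno.errorcode[code]

# Human-readable message as provided by the C library.
message = os.strerror(code)

print(code, name, message)
```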
With a vanilla 3.12.1 kernel directly from kernel.org, the problem
persists (mount command hangs). Again, I will attach that dmesg output and the
kernel's configuration.
Here is some information I gathered so far:
xfs_repair Version 3.1.10
I will attach a trace report as requested in
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
and the 'dmesg' output of 'echo w > /proc/sysrq-trigger'.
Regarding symptom a), I will attach the output of "ps aux | grep
\(mount\|xfs\)" showing that both processes are in uninterruptible sleep.
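For anyone who wants to check for the same condition without grepping "ps"
output, here is a rough Python sketch that lists all processes in state D
(uninterruptible sleep) by reading /proc/<pid>/status. The function names
parse_field and blocked_processes are made up for this example, and it assumes
a Linux /proc layout:

```python
from pathlib import Path


def parse_field(status_text, field):
    """Return the value of a 'Field:<tab>value' line, or None if absent."""
    prefix = field + ":"
    for line in status_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


def blocked_processes(proc_root="/proc"):
    """Return (pid, name) for every process whose state is D."""
    found = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            text = (entry / "status").read_text(errors="replace")
        except OSError:
            continue  # process exited while we were scanning
        state = parse_field(text, "State")  # e.g. "D (disk sleep)"
        if state and state.split()[0] == "D":
            found.append((int(entry.name), parse_field(text, "Name")))
    return sorted(found)


if __name__ == "__main__":
    for pid, name in blocked_processes():
        print(pid, name)
```

On the affected systems this should list the hanging "mount", "sync" and
"xfs_repair" processes; processes that are only briefly waiting for I/O can
show up as well.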
Sadly, because of the erratic nature of the TV, I cannot provide any steps to
reproduce the issue. The hard disk is not mine, but I will try to keep it
around to perform tests on the file system if requested.