On 02/21/14 01:47, Bruno Prémont wrote:
A virtual server of mine stopped working properly yesterday because one
partition became corrupted (or corruption has been stumbled over).
After restarting the system, any attempt to mount that partition (without
-o norecovery,ro) results in the following trace (transcribed):
XFS (sda5): Mounting Filesystem
XFS (sda5): Starting recovery (logdev: internal)
XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1602 of file
CPU: 0 PID: 606 Comm: mount Not tainted 3.13.0-hetzner #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
000000000002eb84 ffff88001dc53ab8 ffffffff813ca339 ffff88001dc53ad8
ffffffff81156d4a ffffffff8116d926 00000000000002a8 ffff88001dc53b68
ffffffff8116b8dd ffff88001dd7ccc0 0000000000000000 0000000000000001
[<ffffffff8116d926>] ? xfs_free_extent+0xd6/0x120
[<ffffffff810d5fa4>] ? kmem_cache_alloc+0xa4/0xb0
[<ffffffff81074709>] ? wake_up_bit+0x29/0x40
[<ffffffff811958ee>] ? xfs_iunlock+0x6e/0x90
[<ffffffff811e337c>] ? ida_get_new_above+0x21c/0x290
[<ffffffff81166de0>] ? xfs_parseargs+0xc10/0xc10
[<ffffffff810b3b43>] ? strndup_user+0x53/0x70
XFS (sda5): Failed to recover EFIs
XFS (sda5): log mount finish failed
Curious: which version of Linux hit this problem?
After that the mount process remains in D state and any attempt to
run xfs_repair on that filesystem blocks (a reboot is needed to do anything).
Is that expected, or should the mount fail completely, return a
proper error to mount, and leave the system in a state as if the mount had
never been attempted (except for the log messages)?
The xfs_ail_push_all_sync() call is hanging because the EFI was not, and
will not be, removed. There is a patch for this problem, but it is waiting
on a similar issue in xlog_cil_push(), the fix for which would change the
recovery patch.
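
To confirm which tasks are stuck in uninterruptible sleep (the D state
mentioned above), something like the following rough sketch could be used.
It only reads /proc/<pid>/stat; the helper names parse_stat and
uninterruptible_tasks are made up for illustration and are not part of any
XFS tooling.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    start = line.index("(")
    end = line.rindex(")")
    pid = int(line[:start].strip())
    comm = line[start + 1:end]
    state = line[end + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            tasks.append((pid, comm))
    return tasks


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

On the affected system this would be expected to list the hung mount
process, and any xfs_repair blocked behind it, until the machine is rebooted.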
As for the cause of this, I guess it's some left-over of an "unclean"
live migration of the KVM guest this system is running on, some
time ago. After the live migration some processes started dying weird
deaths. Rebooting the system worked fine at the time, though.
The only major load on that system (not so heavy, about 10-20 IO-ops
per second on average, mostly writes) is updating RRD files and
running a slave MySQL (InnoDB) database.
I recovered the filesystem with xfs_repair -L /dev/sda5, though the
remaining InnoDB state is rather broken.
xfs_repair reported only claimed free space issues (I didn't save its