XFS umount with IO errors seems to lead to memory corruption
Dave Chinner
david at fromorbit.com
Tue Dec 10 18:40:39 CST 2013
[ Sorry, Alex, I missed your last email. Thanks for pinging me to
remind me to look at it. ]
On Tue, Dec 10, 2013 at 09:36:11AM +0200, Alex Lyakas wrote:
> Hi Dave,
> any insight on this issue? At least on the simpler reproduction with
> "error" DeviceMapper?
Yes, it does point to the problem.
> -----Original Message----- From: Alex Lyakas
> Sent: 24 November, 2013 12:27 PM
> To: Dave Chinner ; xfs at oss.sgi.com
> Cc: linux-xfs at vger.kernel.org
> Subject: Re: XFS umount with IO errors seems to lead to memory corruption
>
> Hi Dave,
> thank you for your comments.
>
> The test that I am doing is unmounting the XFS, while its underlying
> block device returns intermittent IO errors. The block device in
> question is a custom DeviceMapper target. It returns -ECANCELED in
> this case. Should I return some other errno instead?
> The same exact test works alright with ext4. Its unmount finishes,
> system seems to continue functioning normally and kmemleak is also
> happy.
>
> When doing a simpler reproduction with "error" Device-Mapper, umount
> gets stuck and never returns, while kernel keeps printing:
> XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
It's trying to write the superblock - it's an async, background
metadata write, and it's failing.
	/*
	 * If the write was asynchronous then no one will be looking for the
	 * error. Clear the error state and write the buffer out again.
	 *
	 * XXX: This helps against transient write errors, but we need to find
	 * a way to shut the filesystem down if the writes keep failing.
	 *
	 * In practice we'll shut the filesystem down soon as non-transient
	 * errors tend to affect the whole device and a failing log write
	 * will make us give up. But we really ought to do better here.
	 */
	if (XFS_BUF_ISASYNC(bp)) {
		ASSERT(bp->b_iodone != NULL);
		trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
		xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag */
		if (!XFS_BUF_ISSTALE(bp)) {
			bp->b_flags |= XBF_WRITE | XBF_ASYNC | XBF_DONE;
			xfs_buf_iorequest(bp);
		} else {
			xfs_buf_relse(bp);
		}
		return;
	}
There's the problem code - it just keeps resubmitting the failed IO,
so the buffer is never unlocked and the write never completes.
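To make the failure mode concrete, here is a rough userspace toy model of
that resubmit loop. It is not kernel code and not a proposed fix: every
name in it (toy_buf, toy_async_write, the max_retries cap) is hypothetical
and exists only for illustration. The device callback always fails, much
like the "error" Device-Mapper target. With a retry cap the loop gives up
and drops the buffer lock; without one, which is what the snippet above
amounts to, the loop never exits and the lock is never released.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: none of these names are real XFS interfaces. */
struct toy_buf {
	bool locked;      /* stands in for the buffer lock */
	int  error;       /* last I/O error, 0 if none */
	int  submissions; /* how many times the write was issued */
};

/* A device that fails every write, similar to the dm "error" target. */
static int toy_dev_write_always_fails(void)
{
	return -EIO;
}

/* A healthy device, for comparison. */
static int toy_dev_write_ok(void)
{
	return 0;
}

/*
 * Issue an async write and handle its completion. On failure the error
 * is cleared and the write is resubmitted, at most max_retries times
 * (hypothetical cap). The buffer is unlocked only once the write
 * succeeds or the retries are exhausted.
 * Returns 0 on success, or the last negative errno after giving up.
 */
static int toy_async_write(struct toy_buf *bp, int (*dev_write)(void),
			   int max_retries)
{
	assert(bp->locked); /* caller must hold the buffer lock */

	for (;;) {
		bp->submissions++;
		bp->error = dev_write();
		if (bp->error == 0)
			break;          /* write completed */
		if (bp->submissions > max_retries)
			break;          /* give up, keep the error */
		bp->error = 0;          /* clear the error and resubmit */
	}

	bp->locked = false;             /* completion drops the lock */
	return bp->error;
}
```

Calling toy_async_write() with toy_dev_write_always_fails and a cap of 3
issues the write four times, returns -EIO and leaves the buffer unlocked.
Anything waiting on that lock can then make progress instead of hanging.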
> this never returns and /proc shows:
> root@vc-00-00-1075-dev:~# cat /proc/2684/stack
> [<ffffffffa033ac6a>] xfs_ail_push_all_sync+0x9a/0xd0 [xfs]
> [<ffffffffa0330123>] xfs_unmountfs+0x63/0x160 [xfs]
> [<ffffffffa02ee265>] xfs_fs_put_super+0x25/0x60 [xfs]
> [<ffffffff8118fd12>] generic_shutdown_super+0x62/0xf0
> [<ffffffff8118fdd0>] kill_block_super+0x30/0x80
> [<ffffffff811903dc>] deactivate_locked_super+0x3c/0x90
> [<ffffffff81190d7e>] deactivate_super+0x4e/0x70
> [<ffffffff811ad086>] mntput_no_expire+0x106/0x160
> [<ffffffff811ae760>] sys_umount+0xa0/0xe0
> [<ffffffff816ab919>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
That's waiting for the superblock to be marked clean.
> And after some time, hung task warning shows:
> INFO: task kworker/2:1:39 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/2:1 D ffffffff8180cf00 0 39 2 0x00000000
> ffff88007c54db38 0000000000000046 000000027d003700 ffff88007fd03fc0
> ffff88007c54dfd8 ffff88007c54dfd8 ffff88007c54dfd8 0000000000013e40
> ffff88007c9e9710 ffff88007c4bdc40 00000000000000b8 7fffffffffffffff
> Call Trace:
> [<ffffffff816a1b99>] schedule+0x29/0x70
> [<ffffffff816a02d5>] schedule_timeout+0x1e5/0x250
> [<ffffffffa02f3987>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
> [<ffffffff816798e6>] ? kmemleak_alloc+0x26/0x50
> [<ffffffff816a0f1b>] __down_common+0xa0/0xf0
> [<ffffffffa032f37c>] ? xfs_getsb+0x3c/0x70 [xfs]
> [<ffffffff816a0fde>] __down+0x1d/0x1f
> [<ffffffff81084591>] down+0x41/0x50
> [<ffffffffa02dcd44>] xfs_buf_lock+0x44/0x110 [xfs]
> [<ffffffffa032f37c>] xfs_getsb+0x3c/0x70 [xfs]
> [<ffffffffa033b4bc>] xfs_trans_getsb+0x4c/0x140 [xfs]
> [<ffffffffa032f06e>] xfs_mod_sb+0x4e/0xc0 [xfs]
> [<ffffffffa02e3b24>] xfs_fs_log_dummy+0x54/0x90 [xfs]
> [<ffffffffa0335bf8>] xfs_log_worker+0x48/0x50 [xfs]
> [<ffffffff81077a11>] process_one_work+0x141/0x4a0
> [<ffffffff810789e8>] worker_thread+0x168/0x410
> [<ffffffff81078880>] ? manage_workers+0x120/0x120
> [<ffffffff8107df10>] kthread+0xc0/0xd0
> [<ffffffff813a3ea4>] ? acpi_get_child+0x47/0x4d
> [<ffffffff813a3fb7>] ? acpi_platform_notify.part.0+0xbb/0xda
> [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
> [<ffffffff816ab86c>] ret_from_fork+0x7c/0xb0
> [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
And that's blocked on the superblock buffer because it hasn't been
unlocked due to the failing write not completing.
I'll have a think about how to fix it.
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com