On Fri, 2010-11-26 at 23:39 +1100, Dave Chinner wrote:
> Christoph, Alex,
>
> Just so you know, I just had a forced shutdown deadlock like so:
I'll try to take a look at this the coming week. Thanks for
reporting it.
-Alex
> [ 276.038251] Filesystem "vdb": xfs_trans_ail_delete_bulk: attempting to
> delete a log item that is not in the AIL
> [ 276.039467] xfs_force_shutdown(vdb,0x8) called from line 562 of file
> fs/xfs/xfs_trans_ail.c. Return address = 0xffffffff814a0d6a
> [ 276.041085] Filesystem "vdb": xfs_inactive: xfs_trans_commit() returned
> error 5
> [ 276.042848] xfs_imap_to_bp: xfs_trans_read_buf()returned an error 5 on
> vdb. Returning error.
> [ 276.047176] xfs_imap_to_bp: xfs_trans_read_buf()returned an error 5 on
> vdb. Returning error.
> [ 338.176185] SysRq : Show Blocked State
> [ 338.176785] task PC stack pid father
> [ 338.177608] kworker/5:1 D 0000000000000001 0 390 2 0x00000000
> [ 338.178581] ffff88011b2898d0 0000000000000046 ffff88011b289830
> ffffffff810ae1f8
> [ 338.179662] 00000000001d2ac0 ffff88011b0ac9c0 ffff88011b0acd28
> ffff88011b289fd8
> [ 338.180024] ffff88011b0acd30 00000000001d2ac0 ffff88011b288010
> 00000000001d2ac0
> [ 338.180024] Call Trace:
> [ 338.180024] [<ffffffff810ae1f8>] ? sched_clock_cpu+0xb8/0x110
> [ 338.180024] [<ffffffff816ba34e>] ? do_raw_spin_unlock+0x5e/0xb0
> [ 338.180024] [<ffffffff81491d42>] _xfs_log_force+0x142/0x2b0
> [ 338.180024] [<ffffffff8107e7b0>] ? default_wake_function+0x0/0x20
> [ 338.180024] [<ffffffff816ba34e>] ? do_raw_spin_unlock+0x5e/0xb0
> [ 338.180024] [<ffffffff81492050>] xfs_log_force_umount+0x1a0/0x2d0
> [ 338.180024] [<ffffffff814a887b>] xfs_do_force_shutdown+0x6b/0x1a0
> [ 338.180024] [<ffffffff814a0d6a>] ? xfs_trans_ail_delete_bulk+0x13a/0x170
> [ 338.180024] [<ffffffff814a0d6a>] xfs_trans_ail_delete_bulk+0x13a/0x170
> [ 338.180024] [<ffffffff8147b1de>] xfs_efi_release+0x8e/0xa0
> [ 338.180024] [<ffffffff81b19d1b>] ? _raw_spin_unlock+0x2b/0x40
> [ 338.180024] [<ffffffff8147b216>] xfs_efd_item_committed+0x26/0x40
> [ 338.180024] [<ffffffff8149e9d8>] xfs_trans_committed_bulk+0x78/0x210
> [ 338.180024] [<ffffffff8148e57e>] ? xlog_state_do_callback+0x17e/0x3d0
> [ 338.180024] [<ffffffff810637c9>] ? kvm_clock_read+0x19/0x20
> [ 338.180024] [<ffffffff81041339>] ? sched_clock+0x9/0x10
> [ 338.180024] [<ffffffff810ae0d5>] ? sched_clock_local+0x25/0x90
> [ 338.180024] [<ffffffff810ae1f8>] ? sched_clock_cpu+0xb8/0x110
> [ 338.180024] [<ffffffff810b9c3d>] ? trace_hardirqs_off+0xd/0x10
> [ 338.180024] [<ffffffff810ae2bf>] ? local_clock+0x6f/0x80
> [ 338.180024] [<ffffffff8148e57e>] ? xlog_state_do_callback+0x17e/0x3d0
> [ 338.180024] [<ffffffff814928a2>] xlog_cil_committed+0x32/0xe0
> [ 338.180024] [<ffffffff8148e595>] xlog_state_do_callback+0x195/0x3d0
> [ 338.180024] [<ffffffff8148e8cd>] xlog_state_done_syncing+0xfd/0x130
> [ 338.180024] [<ffffffff8148f2ca>] xlog_iodone+0xba/0x150
> [ 338.180024] [<ffffffff814ae416>] xfs_buf_iodone_work+0x26/0x70
> [ 338.180024] [<ffffffff8109fa3d>] process_one_work+0x1ad/0x520
> [ 338.180024] [<ffffffff8109f9cf>] ? process_one_work+0x13f/0x520
> [ 338.180024] [<ffffffff814ae3f0>] ? xfs_buf_iodone_work+0x0/0x70
> [ 338.180024] [<ffffffff810a25d2>] worker_thread+0x172/0x400
> [ 338.180024] [<ffffffff810a2460>] ? worker_thread+0x0/0x400
> [ 338.180024] [<ffffffff810a6f66>] kthread+0xa6/0xb0
> [ 338.180024] [<ffffffff8103ae64>] kernel_thread_helper+0x4/0x10
> [ 338.180024] [<ffffffff81b1a110>] ? restore_args+0x0/0x30
> [ 338.180024] [<ffffffff810a6ec0>] ? kthread+0x0/0xb0
> [ 338.180024] [<ffffffff8103ae60>] ? kernel_thread_helper+0x0/0x10
>
> Yes, I know the trigger was a change I was testing, but the deadlock
> is caused by the fact we've tried to force the log iduring shutdown
> from inside the log IO completion context. It's gone to sleep
> waiting on the processing only it can complete. I haven't looked
> into this in any detail yet, just wanted to to make sure you guys
> know about it...
>
> Cheers,
>
> Dave.
>
|