xfs
[Top] [All Lists]

Re: [bug] deadlock in forced shutdown

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [bug] deadlock in forced shutdown
From: Alex Elder <aelder@xxxxxxx>
Date: Sat, 27 Nov 2010 09:59:47 -0600
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20101126123928.GI12187@dastard>
References: <20101126123928.GI12187@dastard>
Reply-to: aelder@xxxxxxx
On Fri, 2010-11-26 at 23:39 +1100, Dave Chinner wrote:
> Christoph, Alex,
> 
> Just so you know, I just had a forced shutdown deadlock like so:



I'll try to take a look at this the coming week.  Thanks for
reporting it.

                                        -Alex

> [  276.038251] Filesystem "vdb": xfs_trans_ail_delete_bulk: attempting to 
> delete a log item that is not in the AIL
> [  276.039467] xfs_force_shutdown(vdb,0x8) called from line 562 of file 
> fs/xfs/xfs_trans_ail.c.  Return address = 0xffffffff814a0d6a
> [  276.041085] Filesystem "vdb": xfs_inactive: xfs_trans_commit() returned 
> error 5
> [  276.042848] xfs_imap_to_bp: xfs_trans_read_buf()returned an error 5 on 
> vdb.  Returning error.
> [  276.047176] xfs_imap_to_bp: xfs_trans_read_buf()returned an error 5 on 
> vdb.  Returning error.
> [  338.176185] SysRq : Show Blocked State
> [  338.176785]   task                        PC stack   pid father
> [  338.177608] kworker/5:1   D 0000000000000001     0   390      2 0x00000000
> [  338.178581]  ffff88011b2898d0 0000000000000046 ffff88011b289830 
> ffffffff810ae1f8
> [  338.179662]  00000000001d2ac0 ffff88011b0ac9c0 ffff88011b0acd28 
> ffff88011b289fd8
> [  338.180024]  ffff88011b0acd30 00000000001d2ac0 ffff88011b288010 
> 00000000001d2ac0
> [  338.180024] Call Trace:
> [  338.180024]  [<ffffffff810ae1f8>] ? sched_clock_cpu+0xb8/0x110
> [  338.180024]  [<ffffffff816ba34e>] ? do_raw_spin_unlock+0x5e/0xb0
> [  338.180024]  [<ffffffff81491d42>] _xfs_log_force+0x142/0x2b0
> [  338.180024]  [<ffffffff8107e7b0>] ? default_wake_function+0x0/0x20
> [  338.180024]  [<ffffffff816ba34e>] ? do_raw_spin_unlock+0x5e/0xb0
> [  338.180024]  [<ffffffff81492050>] xfs_log_force_umount+0x1a0/0x2d0
> [  338.180024]  [<ffffffff814a887b>] xfs_do_force_shutdown+0x6b/0x1a0
> [  338.180024]  [<ffffffff814a0d6a>] ? xfs_trans_ail_delete_bulk+0x13a/0x170
> [  338.180024]  [<ffffffff814a0d6a>] xfs_trans_ail_delete_bulk+0x13a/0x170
> [  338.180024]  [<ffffffff8147b1de>] xfs_efi_release+0x8e/0xa0
> [  338.180024]  [<ffffffff81b19d1b>] ? _raw_spin_unlock+0x2b/0x40
> [  338.180024]  [<ffffffff8147b216>] xfs_efd_item_committed+0x26/0x40
> [  338.180024]  [<ffffffff8149e9d8>] xfs_trans_committed_bulk+0x78/0x210
> [  338.180024]  [<ffffffff8148e57e>] ? xlog_state_do_callback+0x17e/0x3d0
> [  338.180024]  [<ffffffff810637c9>] ? kvm_clock_read+0x19/0x20
> [  338.180024]  [<ffffffff81041339>] ? sched_clock+0x9/0x10
> [  338.180024]  [<ffffffff810ae0d5>] ? sched_clock_local+0x25/0x90
> [  338.180024]  [<ffffffff810ae1f8>] ? sched_clock_cpu+0xb8/0x110
> [  338.180024]  [<ffffffff810b9c3d>] ? trace_hardirqs_off+0xd/0x10
> [  338.180024]  [<ffffffff810ae2bf>] ? local_clock+0x6f/0x80
> [  338.180024]  [<ffffffff8148e57e>] ? xlog_state_do_callback+0x17e/0x3d0
> [  338.180024]  [<ffffffff814928a2>] xlog_cil_committed+0x32/0xe0
> [  338.180024]  [<ffffffff8148e595>] xlog_state_do_callback+0x195/0x3d0
> [  338.180024]  [<ffffffff8148e8cd>] xlog_state_done_syncing+0xfd/0x130
> [  338.180024]  [<ffffffff8148f2ca>] xlog_iodone+0xba/0x150
> [  338.180024]  [<ffffffff814ae416>] xfs_buf_iodone_work+0x26/0x70
> [  338.180024]  [<ffffffff8109fa3d>] process_one_work+0x1ad/0x520
> [  338.180024]  [<ffffffff8109f9cf>] ? process_one_work+0x13f/0x520
> [  338.180024]  [<ffffffff814ae3f0>] ? xfs_buf_iodone_work+0x0/0x70
> [  338.180024]  [<ffffffff810a25d2>] worker_thread+0x172/0x400
> [  338.180024]  [<ffffffff810a2460>] ? worker_thread+0x0/0x400
> [  338.180024]  [<ffffffff810a6f66>] kthread+0xa6/0xb0
> [  338.180024]  [<ffffffff8103ae64>] kernel_thread_helper+0x4/0x10
> [  338.180024]  [<ffffffff81b1a110>] ? restore_args+0x0/0x30
> [  338.180024]  [<ffffffff810a6ec0>] ? kthread+0x0/0xb0
> [  338.180024]  [<ffffffff8103ae60>] ? kernel_thread_helper+0x0/0x10
> 
> Yes, I know the trigger was a change I was testing, but the deadlock
> is caused by the fact we've tried to force the log iduring shutdown
> from inside the log IO completion context. It's gone to sleep
> waiting on the processing only it can complete. I haven't looked
> into this in any detail yet, just wanted to to make sure you guys
> know about it...
> 
> Cheers,
> 
> Dave.
> 



<Prev in Thread] Current Thread [Next in Thread>