Bug 805 - Lockup when copying large file to XFS partition
Status: RESOLVED FIXED
Product: XFS
Component: XFS kernel code
Version: Current
Platform: PC Linux
Importance: P2 normal
Target Milestone: ---
Reported: 2008-12-20 04:17 CST by
Modified: 2009-01-06 06:53 CST


Attachments
find_get_pages debug patch (443 bytes, patch)
2009-01-02 11:22 CST, Christoph Hellwig
Patch used against 2.6.27.10 (563 bytes, patch)
2009-01-02 15:59 CST, Peter Klotz
Correct patch (1.48 KB, patch)
2009-01-03 07:22 CST, Christoph Hellwig
The results of my tests using the correct patch (20.69 KB, text/plain)
2009-01-03 14:46 CST, Peter Klotz
Patch by Nick Piggin who suspects a missing compiler barrier as cause (1.55 KB, patch)
2009-01-05 08:11 CST, Peter Klotz
"Cleaner" patch proposed by Linus Torvalds (318 bytes, patch)
2009-01-05 16:49 CST, Peter Klotz




Description From 2008-12-20 04:17:15 CST
A command like the following

sudo dd if=/dev/zero of=/media/xfsdisk/zero bs=100M count=2000

produces many backtraces like this one in /var/log/messages:

[ 6278.172006] Pid: 7362, comm: dd Not tainted 2.6.27.10-custom #1
[ 6278.172006] RIP: 0010:[<ffffffff802a2a6c>]  [<ffffffff802a2a6c>] find_get_pages+0x6c/0x110
[ 6278.172006] RSP: 0018:ffff8800b38a73e8  EFLAGS: 00000246
[ 6278.172006] RAX: ffff88013f4ee7f8 RBX: ffff8800b38a7428 RCX: 0000000000000000
[ 6278.172006] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffe20001b34100
[ 6278.172006] RBP: ffff8800badd3390 R08: ffffe20001b34108 R09: 000000000000000e
[ 6278.172006] R10: 0000000000000035 R11: 0000000001516336 R12: ffffffff8030ea3e
[ 6278.172006] R13: ffff8800b38a7368 R14: 01ffffff8037c627 R15: ffff8800b38a7378
[ 6278.172006] FS:  00007f1909df76e0(0000) GS:ffffffff806b1a80(0000) knlGS:0000000000000000
[ 6278.172006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6278.172006] CR2: 00007f31252f4000 CR3: 0000000103428000 CR4: 00000000000026e0
[ 6278.172006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6278.172006] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 6278.172006] 
[ 6278.172006] Call Trace:
[ 6278.172006]  [<ffffffff802a2a43>] ? find_get_pages+0x43/0x110
[ 6278.172006]  [<ffffffff802ad644>] ? pagevec_lookup+0x24/0x30
[ 6278.172006]  [<ffffffffa082301d>] ? xfs_cluster_write+0xad/0x180 [xfs]
[ 6278.172006]  [<ffffffffa0823588>] ? xfs_page_state_convert+0x498/0x760 [xfs]
[ 6278.172006]  [<ffffffffa08239b1>] ? xfs_vm_writepage+0x71/0x120 [xfs]
[ 6278.172006]  [<ffffffff802b0446>] ? shrink_page_list+0x586/0x790
[ 6278.172006]  [<ffffffff802b07f2>] ? shrink_inactive_list+0x1a2/0x4b0
[ 6278.172006]  [<ffffffff804e6276>] ? _spin_lock_irq+0x16/0x20
[ 6278.172006]  [<ffffffff802ab4b4>] ? get_dirty_limits+0x14/0x2b0
[ 6278.172006]  [<ffffffff802b0b7b>] ? shrink_zone+0x7b/0x160
[ 6278.172006]  [<ffffffff802b0d65>] ? do_try_to_free_pages+0x105/0x400
[ 6278.172006]  [<ffffffff802b1157>] ? try_to_free_pages+0x67/0x70
[ 6278.172006]  [<ffffffff802afce0>] ? isolate_pages_global+0x0/0x50
[ 6278.172006]  [<ffffffff802a95d7>] ? __alloc_pages_internal+0x237/0x510
[ 6278.172006]  [<ffffffff802cc32d>] ? alloc_pages_current+0xad/0x110
[ 6278.172006]  [<ffffffff802a2fd7>] ? __page_cache_alloc+0x67/0x80
[ 6278.172006]  [<ffffffff802a3bd3>] ? __grab_cache_page+0x63/0xb0
[ 6278.172006]  [<ffffffff8030c659>] ? block_write_begin+0x89/0xf0
[ 6278.172006]  [<ffffffffa082244a>] ? xfs_vm_write_begin+0x2a/0x30 [xfs]
[ 6278.172006]  [<ffffffffa0822040>] ? xfs_get_blocks+0x0/0x20 [xfs]
[ 6278.172006]  [<ffffffff802a3d68>] ? generic_file_buffered_write+0x148/0x6a0
[ 6278.172006]  [<ffffffff8023ef6e>] ? try_to_wake_up+0x11e/0x2e0
[ 6278.172006]  [<ffffffff804e637e>] ? _spin_lock+0xe/0x20
[ 6278.172006]  [<ffffffffa08099e9>] ? xfs_log_move_tail+0x139/0x190 [xfs]
[ 6278.172006]  [<ffffffffa082b353>] ? xfs_write+0x6b3/0x9b0 [xfs]
[ 6278.172006]  [<ffffffff8023e852>] ? check_preempt_wakeup+0x1a2/0x1f0
[ 6278.172006]  [<ffffffff802368d6>] ? __dequeue_entity+0x36/0x90
[ 6278.172006]  [<ffffffffa0826d08>] ? xfs_file_aio_write+0x58/0x60 [xfs]
[ 6278.172006]  [<ffffffff802dfcf9>] ? do_sync_write+0xf9/0x140
[ 6278.172006]  [<ffffffff804e3c39>] ? thread_return+0x3d/0x654
[ 6278.172006]  [<ffffffff8025ee00>] ? autoremove_wake_function+0x0/0x40
[ 6278.172006]  [<ffffffff803987e0>] ? __clear_user+0x40/0x70
[ 6278.172006]  [<ffffffff803987c1>] ? __clear_user+0x21/0x70
[ 6278.172006]  [<ffffffff80315536>] ? inotify_inode_queue_event+0x16/0x100
[ 6278.172006]  [<ffffffff80357cc9>] ? cap_file_permission+0x9/0x10
[ 6278.172006]  [<ffffffff80356cb6>] ? security_file_permission+0x16/0x20
[ 6278.172006]  [<ffffffff802e03bb>] ? vfs_write+0xcb/0x190
[ 6278.172006]  [<ffffffff802e0575>] ? sys_write+0x55/0x90
[ 6278.172006]  [<ffffffff8020c7aa>] ? system_call_fastpath+0x16/0x1b
[ 6278.172006] 
[ 6343.668002] BUG: soft lockup - CPU#0 stuck for 61s! [dd:7362]


The kernel is a vanilla 2.6.27.10 x86_64 build. The filesystem was created with
mkfs.xfs 2.9.8.
------- Comment #1 From 2009-01-01 17:13:31 CST -------
Would you be able to test a patch to narrow down what's causing this inside
find_get_pages?
------- Comment #2 From 2009-01-02 01:55:58 CST -------
I'll do my best.
------- Comment #3 From 2009-01-02 11:22:44 CST -------
Created an attachment (id=254) [details]
find_get_pages debug patch

Thanks a lot! Please report what messages you get with the attached patch.
------- Comment #4 From 2009-01-02 15:59:34 CST -------
Created an attachment (id=255) [details]
Patch used against 2.6.27.10

Your patch only seems to add a curly brace and an empty line.

Could you please take a look at my modified version and tell me if that was
your intention?
------- Comment #5 From 2009-01-02 16:06:22 CST -------
The result when applying your modified patch against 2.6.27.10 and executing dd:

Jan  2 21:47:13 asus kernel: [  186.435385] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
Jan  2 21:47:13 asus kernel: [  186.438825] SGI XFS Quota Management subsystem
Jan  2 21:47:13 asus kernel: [  186.458480] XFS mounting filesystem sdb
Jan  2 21:53:07 asus kernel: [  540.157548] unable to deref page
Jan  2 21:53:07 asus kernel: [  540.157561] unable to deref page
Jan  2 21:55:44 asus kernel: [  697.933008] Modules linked in: xfs i915 ...
Jan  2 21:55:44 asus kernel: [  697.933008] CPU 1:
Jan  2 21:55:44 asus kernel: [  697.933008] Modules linked in: xfs i915 ...
Jan  2 21:55:44 asus kernel: [  697.933008] Pid: 7352, comm: dd Not tainted 2.6.27.10-xfs #1
Jan  2 21:55:44 asus kernel: [  697.933008] RIP: 0010:[<ffffffff802a2a51>]  [<ffffffff802a2a51>] find_get_pages+0x71/0x130
Jan  2 21:55:44 asus kernel: [  697.933008] RSP: ...


The backtrace that follows matches the one originally posted.
------- Comment #6 From 2009-01-03 07:22:17 CST -------
Created an attachment (id=256) [details]
Correct patch

This is the patch I intended to attach; I messed up quilt edit once again.

Can you see if you get more messages with that one?
------- Comment #7 From 2009-01-03 14:46:59 CST -------
Created an attachment (id=257) [details]
The results of my tests using the correct patch

The additional output makes the problem disappear. This looks like a race
condition that becomes unlikely once the extra output changes the timing.

I tried with 3 files of different sizes (200GB, 300GB and 900GB). Of the 4
possible paths, only 2 appear in the output ("unable to deref page" and
"page_cache_get failed").
------- Comment #8 From 2009-01-03 16:59:15 CST -------
Removing just "page_cache_get failed" from the patch makes the problem reappear.

[  419.854040] unable to deref page
[  419.854054] unable to deref page
[  419.854057] unable to deref page
[  419.854060] unable to deref page
[  419.854063] unable to deref page
[  419.854066] unable to deref page
[ 1042.605002] BUG: soft lockup - CPU#1 stuck for 61s! [dd:7613]
...
[ 1042.605006] RIP: 0010:[<ffffffff802a2a68>]  [<ffffffff802a2a68>] find_get_pages+0x88/0x170
[ 1042.605006] RSP: ...


So the 2 other paths do not contribute to the problem; they are never executed.

The page_cache_get_speculative() branch is definitely taken when the problem
occurs, since otherwise removing the printk() could not have any effect.
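
For illustration only, here is a minimal user-space sketch of tallying which retry path is taken with plain counters instead of printing from the hot path. It is not the debug patch attached to this bug. The enum names, note_path(), paths_seen() and report_paths() are hypothetical; only the two message strings in the comments come from this report, and the other two paths are unnamed here. The counters are not atomic, so this is a single-threaded illustration, and whether counters would leave the timing undisturbed in this case is untested.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical labels for the four retry paths. Only the first two
 * correspond to messages quoted in this report. */
enum retry_path {
    PATH_DEREF_FAILED,   /* "unable to deref page" */
    PATH_GET_FAILED,     /* "page_cache_get failed" */
    PATH_UNNAMED_3,      /* not named in the report */
    PATH_UNNAMED_4,      /* not named in the report */
    PATH_COUNT
};

/* One hit counter per path; zero-initialized static storage. */
static unsigned long path_hits[PATH_COUNT];

/* Record that a path was taken. Cheap: one increment, no I/O. */
static void note_path(enum retry_path p)
{
    assert((int)p >= 0 && (int)p < PATH_COUNT);
    path_hits[p]++;
}

/* Number of distinct paths that were hit at least once. */
static int paths_seen(void)
{
    int seen = 0;
    for (int i = 0; i < PATH_COUNT; i++)
        if (path_hits[i] != 0)
            seen++;
    return seen;
}

/* Dump the tallies once, after the test run, rather than per event. */
static void report_paths(FILE *out)
{
    for (int i = 0; i < PATH_COUNT; i++)
        fprintf(out, "path %d: %lu hits\n", i, path_hits[i]);
}
```

Calling note_path() at each failure site and report_paths(stderr) once at the end would show which of the four paths actually ran.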
------- Comment #9 From 2009-01-05 08:11:20 CST -------
Created an attachment (id=258) [details]
Patch by Nick Piggin who suspects a missing compiler barrier as cause

See this lkml thread:

http://lkml.org/lkml/2009/1/5/16

Tests are currently running and look promising...
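
To make the "missing compiler barrier" idea concrete, here is a minimal user-space sketch of a lookup-and-retry loop that forces a fresh load on every pass. It is not the attached patch and not the kernel code: struct fake_page, load_once(), compiler_barrier(), get_ref_unless_zero() and lookup_with_retry() are all hypothetical names, the reference count is not atomic, and the macros rely on GCC/Clang extensions. The point is only that without a volatile access or a barrier the compiler may legally keep a loaded value in a register, so a retry loop can spin on stale data.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier: the compiler may not keep memory values cached
 * in registers across this point. It emits no CPU instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Force a real load from memory each time this is evaluated. */
#define load_once(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for a page with a reference count. */
struct fake_page {
    int refcount;
};

/* Take a reference unless the count is already zero.
 * Returns 1 on success, 0 if the caller should retry. */
static int get_ref_unless_zero(struct fake_page *p)
{
    if (load_once(p->refcount) == 0) {
        /* Failure path: make sure the caller re-reads memory. */
        compiler_barrier();
        return 0;
    }
    p->refcount++;  /* not atomic: single-threaded illustration only */
    return 1;
}

/* Look up the page in *slot, retrying a bounded number of times.
 * Returns the page with a reference held, or NULL. */
static struct fake_page *lookup_with_retry(struct fake_page **slot,
                                           int max_tries)
{
    assert(slot != NULL);
    for (int i = 0; i < max_tries; i++) {
        /* Reloaded on every pass, never hoisted out of the loop. */
        struct fake_page *p = load_once(*slot);
        if (p == NULL)
            return NULL;
        if (get_ref_unless_zero(p))
            return p;
        /* Lost the race with a concurrent free: try again. */
    }
    return NULL;
}
```

The bounded max_tries is a choice made for this sketch so that it always terminates.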
------- Comment #10 From 2009-01-05 10:23:27 CST -------
All tests completed successfully. No lockups occurred.

I used a patched 2.6.27.10 x86_64 kernel.
------- Comment #11 From 2009-01-05 16:49:31 CST -------
Created an attachment (id=259) [details]
"Cleaner" patch proposed by Linus Torvalds

Can be found in the lkml thread previously mentioned:

http://lkml.org/lkml/2009/1/5/307

Tests are running...