
2.6.39-rc3, 2.6.39-rc4: XFS lockup - regression since 2.6.38

To: xfs-masters@xxxxxxxxxxx, xfs@xxxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Alex Elder <aelder@xxxxxxx>, Dave Chinner <dchinner@xxxxxxxxxx>
Subject: 2.6.39-rc3, 2.6.39-rc4: XFS lockup - regression since 2.6.38
From: Bruno Prémont <bonbons@xxxxxxxxxxxxxxxxx>
Date: Sat, 23 Apr 2011 22:44:03 +0200
Cc: linux-kernel@xxxxxxxxxxxxxxx

Hi,

Running 2.6.39-rc3+ and now again 2.6.39-rc4+ (I have not tested -rc1
or -rc2), I've hit a "dying machine" where processes writing to disk end
up in D state.
I have no logs from the occurrence with -rc3+, as they never reached the
disk. For -rc4+ I have the following (the sysrq+t output was too big; what
I have of it is missing about a dozen kernel tasks. If needed, please ask):

The -rc4 kernel is at commit 584f79046780e10cb24367a691f8c28398a00e84
(plus one patch of mine to stop the disk on reboot).
Full dmesg is available if needed; the kernel config is attached (only
selected options). If there is something I should do at the next
occurrence, please tell me. Unfortunately I have no trigger for it, and
it does not happen very often.

Thanks,
Bruno

[    0.000000] Linux version 2.6.39-rc4-00120-g73b5b55 (kbuild@neptune) (gcc version 4.4.5 (Gentoo 4.4.5 p1.2, pie-0.4.5) ) #12 Thu Apr 21 19:28:45 CEST 2011


[32040.120055] INFO: task flush-8:0:1665 blocked for more than 120 seconds.
[32040.120068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32040.120077] flush-8:0       D 00000000  4908  1665      2 0x00000000
[32040.120099]  f55efb5c 00000046 00000000 00000000 00000000 00000001 e0382924 00000000
[32040.120118]  f55efb0c f55efb5c 00000004 f629ba70 572f01a2 00001cfe f629ba70 ffffffc0
[32040.120135]  f55efc68 f55efb30 f889d7f8 f55efb20 00000000 f55efc68 e0382900 f55efc94
[32040.120153] Call Trace:
[32040.120220]  [<f889d7f8>] ? xfs_bmap_search_multi_extents+0x88/0xe0 [xfs]
[32040.120239]  [<c109ce1d>] ? kmem_cache_alloc+0x2d/0x110
[32040.120294]  [<f88c88ca>] ? xlog_space_left+0x2a/0xc0 [xfs]
[32040.120346]  [<f88c85cb>] xlog_wait+0x4b/0x70 [xfs]
[32040.120359]  [<c102ca00>] ? try_to_wake_up+0xc0/0xc0
[32040.120411]  [<f88c948b>] xlog_grant_log_space+0x8b/0x240 [xfs]
[32040.120464]  [<f88c936e>] ? xlog_grant_push_ail+0xbe/0xf0 [xfs]
[32040.120516]  [<f88c99db>] xfs_log_reserve+0xab/0xb0 [xfs]
[32040.120571]  [<f88d6dc8>] xfs_trans_reserve+0x78/0x1f0 [xfs]
[32040.120625]  [<f88c560a>] xfs_iomap_write_allocate+0x27a/0x3a0 [xfs]
[32040.120680]  [<f88df192>] xfs_map_blocks+0x1d2/0x210 [xfs]
[32040.120733]  [<f88dfbde>] xfs_vm_writepage+0x2be/0x4e0 [xfs]
[32040.120749]  [<c107ab4b>] __writepage+0xb/0x30
[32040.120760]  [<c107bb06>] write_cache_pages+0x156/0x350
[32040.120771]  [<c107ab40>] ? set_page_dirty+0x60/0x60
[32040.120784]  [<c107bd3b>] generic_writepages+0x3b/0x60
[32040.120836]  [<f88def91>] xfs_vm_writepages+0x21/0x30 [xfs]
[32040.120847]  [<c107bd77>] do_writepages+0x17/0x30
[32040.120859]  [<c10c1667>] writeback_single_inode+0x77/0x1d0
[32040.120869]  [<c10c1c36>] writeback_sb_inodes+0x96/0x130
[32040.120880]  [<c10c21ae>] writeback_inodes_wb+0x6e/0xf0
[32040.120890]  [<c10c2442>] wb_writeback+0x212/0x260
[32040.120901]  [<c10c2655>] wb_do_writeback+0x1c5/0x1d0
[32040.120912]  [<c10c26d1>] bdi_writeback_thread+0x71/0x130
[32040.120923]  [<c10c2660>] ? wb_do_writeback+0x1d0/0x1d0
[32040.120936]  [<c10458c4>] kthread+0x74/0x80
[32040.120946]  [<c1045850>] ? kthreadd+0xc0/0xc0
[32040.120959]  [<c13332f6>] kernel_thread_helper+0x6/0xd
[32040.120989] INFO: task kworker/0:2:6126 blocked for more than 120 seconds.
[32040.120996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32040.121003] kworker/0:2     D e05a3dfc  6400  6126      2 0x00000000
[32040.121024]  e05a3e6c 00000046 00002925 e05a3dfc c104b7f1 00000296 1b84e819 e05a3e04
[32040.121042]  e05a3e1c e05a3e6c 00000004 f4502ec0 8c876f48 00001cfb f4502ec0 2d2b9b93
[32040.121059]  0000189c dd283f80 00000000 0016e360 e05a3e50 c10086c7 e05a3ee0 0009201a
[32040.121077] Call Trace:
[32040.121089]  [<c104b7f1>] ? sched_clock_tick+0x51/0x80
[32040.121105]  [<c10086c7>] ? time_cpufreq_notifier+0x57/0x120
[32040.121117]  [<c109ce1d>] ? kmem_cache_alloc+0x2d/0x110
[32040.121169]  [<f88c88ca>] ? xlog_space_left+0x2a/0xc0 [xfs]
[32040.121220]  [<f88c85cb>] xlog_wait+0x4b/0x70 [xfs]
[32040.121231]  [<c102ca00>] ? try_to_wake_up+0xc0/0xc0
[32040.121283]  [<f88c948b>] xlog_grant_log_space+0x8b/0x240 [xfs]
[32040.121335]  [<f88c936e>] ? xlog_grant_push_ail+0xbe/0xf0 [xfs]
[32040.121402]  [<f88c99db>] xfs_log_reserve+0xab/0xb0 [xfs]
[32040.121455]  [<f88d6dc8>] xfs_trans_reserve+0x78/0x1f0 [xfs]
[32040.121506]  [<f88bafe1>] xfs_fs_log_dummy+0x41/0x90 [xfs]
[32040.121559]  [<f88e9f12>] xfs_sync_worker+0x62/0x70 [xfs]
[32040.121570]  [<c104075c>] process_one_work+0xfc/0x320
[32040.121621]  [<f88e9eb0>] ? xfs_syncd_init+0xb0/0xb0 [xfs]
[32040.121633]  [<c1042206>] worker_thread+0x106/0x330
[32040.121643]  [<c1042100>] ? manage_workers+0x340/0x340
[32040.121654]  [<c10458c4>] kthread+0x74/0x80
[32040.121664]  [<c1045850>] ? kthreadd+0xc0/0xc0
[32040.121675]  [<c13332f6>] kernel_thread_helper+0x6/0xd
[32040.121685] INFO: task tar:9048 blocked for more than 120 seconds.
[32040.121692] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32040.121699] tar             D e4907dac  5540  9048   9039 0x00000000
[32040.121720]  e4907db4 00200082 00000000 e4907dac f88c88ca f4a58d24 00000002 00000000
[32040.121737]  e4907d64 e4907db4 00000004 e0ee2310 e349aa9d 00001cf9 e0ee2310 e4907d94
[32040.121755]  008d71a4 00000000 00001fbd 00947400 f606d880 0000402e 00000000 e4907db4
[32040.121772] Call Trace:
[32040.121823]  [<f88c88ca>] ? xlog_space_left+0x2a/0xc0 [xfs]
[32040.121876]  [<f88c936e>] ? xlog_grant_push_ail+0xbe/0xf0 [xfs]
[32040.121929]  [<f88c94f2>] xlog_grant_log_space+0xf2/0x240 [xfs]
[32040.121941]  [<c102ca00>] ? try_to_wake_up+0xc0/0xc0
[32040.121992]  [<f88c99db>] xfs_log_reserve+0xab/0xb0 [xfs]
[32040.122047]  [<f88d6dc8>] xfs_trans_reserve+0x78/0x1f0 [xfs]
[32040.122100]  [<f88dae3d>] xfs_create+0x14d/0x520 [xfs]
[32040.122152]  [<f88e69da>] xfs_vn_mknod+0x9a/0x180 [xfs]
[32040.122204]  [<f88e6ad8>] xfs_vn_mkdir+0x18/0x20 [xfs]
[32040.122216]  [<c10adf5e>] vfs_mkdir+0x6e/0xa0
[32040.122266]  [<f88e6ac0>] ? xfs_vn_mknod+0x180/0x180 [xfs]
[32040.122278]  [<c10b0928>] sys_mkdirat+0xc8/0xe0
[32040.122292]  [<c108e46f>] ? remove_vma+0x3f/0x60
[32040.122303]  [<c108f23c>] ? do_munmap+0x20c/0x2f0
[32040.122314]  [<c10b0960>] sys_mkdir+0x20/0x30
[32040.122325]  [<c1332dd0>] sysenter_do_call+0x12/0x26
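
All three blocked tasks above go through xfs_trans_reserve, xfs_log_reserve
and xlog_grant_log_space, which suggests they are all waiting for log space.
For comparing traces from a future occurrence, something like the following
Python sketch might help. It is not part of the original report: the helper
names summarize_hung_tasks and common_frames are made up for illustration,
the regular expressions only assume the hung-task format shown above, and
SAMPLE is a trimmed excerpt of those traces.

```python
import re
from collections import Counter

# Hung-task header, e.g.
# "[32040.121685] INFO: task tar:9048 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# Stack frame, e.g. " [<f88c85cb>] xlog_wait+0x4b/0x70 [xfs]".
# Group 1 is "? " for frames the unwinder marks as unreliable.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([A-Za-z_][\w.]*)\+0x")

# Trimmed excerpt of the traces in this report (assumption: same format).
SAMPLE = """\
[32040.120055] INFO: task flush-8:0:1665 blocked for more than 120 seconds.
[32040.120294]  [<f88c88ca>] ? xlog_space_left+0x2a/0xc0 [xfs]
[32040.120346]  [<f88c85cb>] xlog_wait+0x4b/0x70 [xfs]
[32040.120411]  [<f88c948b>] xlog_grant_log_space+0x8b/0x240 [xfs]
[32040.120516]  [<f88c99db>] xfs_log_reserve+0xab/0xb0 [xfs]
[32040.120571]  [<f88d6dc8>] xfs_trans_reserve+0x78/0x1f0 [xfs]
[32040.120625]  [<f88c560a>] xfs_iomap_write_allocate+0x27a/0x3a0 [xfs]
[32040.120989] INFO: task kworker/0:2:6126 blocked for more than 120 seconds.
[32040.121220]  [<f88c85cb>] xlog_wait+0x4b/0x70 [xfs]
[32040.121283]  [<f88c948b>] xlog_grant_log_space+0x8b/0x240 [xfs]
[32040.121402]  [<f88c99db>] xfs_log_reserve+0xab/0xb0 [xfs]
[32040.121455]  [<f88d6dc8>] xfs_trans_reserve+0x78/0x1f0 [xfs]
[32040.121506]  [<f88bafe1>] xfs_fs_log_dummy+0x41/0x90 [xfs]
[32040.121685] INFO: task tar:9048 blocked for more than 120 seconds.
[32040.121929]  [<f88c94f2>] xlog_grant_log_space+0xf2/0x240 [xfs]
[32040.121992]  [<f88c99db>] xfs_log_reserve+0xab/0xb0 [xfs]
[32040.122047]  [<f88d6dc8>] xfs_trans_reserve+0x78/0x1f0 [xfs]
[32040.122100]  [<f88dae3d>] xfs_create+0x14d/0x520 [xfs]
"""


def summarize_hung_tasks(log_text):
    """Map (task name, pid) to its reliable frame symbols, innermost first."""
    tasks = {}
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = (header.group(1), int(header.group(2)))
            tasks[current] = []
            continue
        frame = FRAME_RE.search(line)
        # Skip "?" frames: they may be stale stack contents.
        if frame and current is not None and not frame.group(1):
            tasks[current].append(frame.group(2))
    return tasks


def common_frames(tasks):
    """Return the sorted symbols present in every task's reliable frames."""
    counts = Counter(sym for frames in tasks.values() for sym in set(frames))
    return sorted(sym for sym, n in counts.items() if n == len(tasks))


if __name__ == "__main__":
    tasks = summarize_hung_tasks(SAMPLE)
    for (name, pid), frames in tasks.items():
        print(f"{name} (pid {pid}): blocked in {frames[0]}")
    print("shared by all tasks:", ", ".join(common_frames(tasks)))
```

To use it on a real log, pass the saved dmesg text to summarize_hung_tasks()
instead of SAMPLE. Frames marked with "?" are deliberately ignored, so the
shared set only contains functions the unwinder reported as reliable.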

Attachment: config
Description: Binary data
