XFS hangs and freezes with LSI 9265-8i controller on high i/o

Matthew Whittaker-Williams matthew at xsnews.nl
Mon Jun 11 16:37:23 CDT 2012


Dear Developers,

We are running into some problems with xfs and the LSI 9265-8i Controller.

http://www.lsi.com/products/storagecomponents/Pages/MegaRAIDSAS9265-8i.aspx

When running high I/O on a RAID 6 array with this controller, XFS freezes 
up and we get the following errors:

Linux sd69 3.4.1-custom #4 SMP Mon Jun 11 09:35:31 CEST 2012 x86_64 GNU/Linux

[   62.911481] XFS (sda): Mounting Filesystem
[   63.212456] XFS (sda): Starting recovery (logdev: internal)
[   64.016420] XFS (sda): Ending recovery (logdev: internal)
[   64.020549] XFS (sdb): Mounting Filesystem
[   64.371207] XFS (sdb): Starting recovery (logdev: internal)
[   65.265051] XFS (sdb): Ending recovery (logdev: internal)
[ 6110.298886] INFO: task kworker/0:0:11244 blocked for more than 120 seconds.
[ 6110.298942] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6110.299000] kworker/0:0     D ffff8805ecf52880     0 11244      2 0x00000000
[ 6110.299044]  ffff8805ecf52880 0000000000000046 0000000000000000 ffffffff81613020
[ 6110.299107]  00000000000132c0 ffff880582d65fd8 00000000000132c0 ffff880582d65fd8
[ 6110.299170]  00000000000132c0 ffff8805ecf52880 00000000000132c0 ffff880582d64010
[ 6110.299233] Call Trace:
[ 6110.299266]  [<ffffffff8134d55a>] ? schedule_timeout+0x2d/0xd7
[ 6110.299305]  [<ffffffff810f62f5>] ? kmem_cache_alloc+0x2a/0xee
[ 6110.299358]  [<ffffffffa02cbff4>] ? kmem_zone_alloc+0x58/0x9e [xfs]
[ 6110.299395]  [<ffffffff8134de6b>] ? __down_common+0x93/0xe4
[ 6110.299443]  [<ffffffffa03062b0>] ? xfs_getsb+0x2f/0x5c [xfs]
[ 6110.299480]  [<ffffffff81057994>] ? down+0x27/0x37
[ 6110.299520]  [<ffffffffa02b81e7>] ? xfs_buf_lock+0x65/0xb2 [xfs]
[ 6110.299568]  [<ffffffffa03062b0>] ? xfs_getsb+0x2f/0x5c [xfs]
[ 6110.299613]  [<ffffffffa0312e3b>] ? xfs_trans_getsb+0xa5/0xf5 [xfs]
[ 6110.299663]  [<ffffffffa0306c9a>] ? xfs_mod_sb+0x43/0x10f [xfs]
[ 6110.299710]  [<ffffffffa02c70f6>] ? xfs_flush_inodes+0x23/0x23 [xfs]
[ 6110.299755]  [<ffffffffa02bcd06>] ? xfs_fs_log_dummy+0x61/0x75 [xfs]
[ 6110.299802]  [<ffffffffa0311978>] ? xfs_ail_min_lsn+0xd/0x2e [xfs]
[ 6110.299849]  [<ffffffffa02c7133>] ? xfs_sync_worker+0x3d/0x60 [xfs]
[ 6110.299888]  [<ffffffff812703b6>] ? powersave_bias_target+0x14b/0x14b
[ 6110.299924]  [<ffffffff8104fa39>] ? process_one_work+0x1cd/0x2eb
[ 6110.299960]  [<ffffffff8104fc85>] ? worker_thread+0x12e/0x249
[ 6110.299993]  [<ffffffff8104fb57>] ? process_one_work+0x2eb/0x2eb
[ 6110.300029]  [<ffffffff8104fb57>] ? process_one_work+0x2eb/0x2eb
[ 6110.300064]  [<ffffffff8105356e>] ? kthread+0x81/0x89
[ 6110.300098]  [<ffffffff813569a4>] ? kernel_thread_helper+0x4/0x10
[ 6110.300133]  [<ffffffff810534ed>] ? kthread_freezable_should_stop+0x53/0x53
[ 6110.300171]  [<ffffffff813569a0>] ? gs_change+0x13/0x13
[ 7547.340316] INFO: task kworker/0:0:11244 blocked for more than 120 seconds.
[ 7547.340359] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7547.340414] kworker/0:0     D ffff8805ecf52880     0 11244      2 0x00000000
[ 7547.340458]  ffff8805ecf52880 0000000000000046 0000000000000000 ffffffff81613020
[ 7547.340522]  00000000000132c0 ffff880582d65fd8 00000000000132c0 ffff880582d65fd8
[ 7547.340585]  00000000000132c0 ffff8805ecf52880 00000000000132c0 ffff880582d64010
[ 7547.340648] Call Trace:
[ 7547.340680]  [<ffffffff8134d55a>] ? schedule_timeout+0x2d/0xd7
[ 7547.340719]  [<ffffffff810f62f5>] ? kmem_cache_alloc+0x2a/0xee
[ 7547.340772]  [<ffffffffa02cbff4>] ? kmem_zone_alloc+0x58/0x9e [xfs]
[ 7547.340809]  [<ffffffff8134de6b>] ? __down_common+0x93/0xe4
[ 7547.340858]  [<ffffffffa03062b0>] ? xfs_getsb+0x2f/0x5c [xfs]
[ 7547.340895]  [<ffffffff81057994>] ? down+0x27/0x37
[ 7547.340934]  [<ffffffffa02b81e7>] ? xfs_buf_lock+0x65/0xb2 [xfs]
[ 7547.340983]  [<ffffffffa03062b0>] ? xfs_getsb+0x2f/0x5c [xfs]
[ 7547.341028]  [<ffffffffa0312e3b>] ? xfs_trans_getsb+0xa5/0xf5 [xfs]
[ 7547.341078]  [<ffffffffa0306c9a>] ? xfs_mod_sb+0x43/0x10f [xfs]
[ 7547.341126]  [<ffffffffa02c70f6>] ? xfs_flush_inodes+0x23/0x23 [xfs]
[ 7547.341170]  [<ffffffffa02bcd06>] ? xfs_fs_log_dummy+0x61/0x75 [xfs]
[ 7547.341217]  [<ffffffffa0311978>] ? xfs_ail_min_lsn+0xd/0x2e [xfs]
[ 7547.346755]  [<ffffffffa02c7133>] ? xfs_sync_worker+0x3d/0x60 [xfs]
[ 7547.346794]  [<ffffffff812703b6>] ? powersave_bias_target+0x14b/0x14b
[ 7547.346832]  [<ffffffff8104fa39>] ? process_one_work+0x1cd/0x2eb
[ 7547.346870]  [<ffffffff8104fc85>] ? worker_thread+0x12e/0x249
[ 7547.346905]  [<ffffffff8104fb57>] ? process_one_work+0x2eb/0x2eb
[ 7547.346940]  [<ffffffff8104fb57>] ? process_one_work+0x2eb/0x2eb
[ 7547.346976]  [<ffffffff8105356e>] ? kthread+0x81/0x89
[ 7547.347012]  [<ffffffff813569a4>] ? kernel_thread_helper+0x4/0x10
[ 7547.347048]  [<ffffffff810534ed>] ? kthread_freezable_should_stop+0x53/0x53
[ 7547.347085]  [<ffffffff813569a0>] ? gs_change+0x13/0x13
[ 9463.398196] INFO: task kworker/0:0:11244 blocked for more than 120 seconds.
[ 9463.398270] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9463.398325] kworker/0:0     D ffff8805ecf52880     0 11244      2 0x00000000
[ 9463.398369]  ffff8805ecf52880 0000000000000046 0000000000000000 ffffffff81613020
[ 9463.398433]  00000000000132c0 ffff880582d65fd8 00000000000132c0 ffff880582d65fd8
[ 9463.398496]  00000000000132c0 ffff8805ecf52880 00000000000132c0 ffff880582d64010
[ 9463.398559] Call Trace:
[ 9463.398592]  [<ffffffff8134d55a>] ? schedule_timeout+0x2d/0xd7
[ 9463.398630]  [<ffffffff810f62f5>] ? kmem_cache_alloc+0x2a/0xee
[ 9463.398683]  [<ffffffffa02cbff4>] ? kmem_zone_alloc+0x58/0x9e [xfs]
[ 9463.398720]  [<ffffffff8134de6b>] ? __down_common+0x93/0xe4
[ 9463.398768]  [<ffffffffa03062b0>] ? xfs_getsb+0x2f/0x5c [xfs]
[ 9463.398804]  [<ffffffff81057994>] ? down+0x27/0x37
[ 9463.398843]  [<ffffffffa02b81e7>] ? xfs_buf_lock+0x65/0xb2 [xfs]
[ 9463.398892]  [<ffffffffa03062b0>] ? xfs_getsb+0x2f/0x5c [xfs]
[ 9463.398937]  [<ffffffffa0312e3b>] ? xfs_trans_getsb+0xa5/0xf5 [xfs]
[ 9463.398987]  [<ffffffffa0306c9a>] ? xfs_mod_sb+0x43/0x10f [xfs]
[ 9463.399034]  [<ffffffffa02c70f6>] ? xfs_flush_inodes+0x23/0x23 [xfs]
[ 9463.399079]  [<ffffffffa02bcd06>] ? xfs_fs_log_dummy+0x61/0x75 [xfs]
[ 9463.399126]  [<ffffffffa0311978>] ? xfs_ail_min_lsn+0xd/0x2e [xfs]
[ 9463.399174]  [<ffffffffa02c7133>] ? xfs_sync_worker+0x3d/0x60 [xfs]
[ 9463.399211]  [<ffffffff8104fa39>] ? process_one_work+0x1cd/0x2eb
[ 9463.399246]  [<ffffffff8104fc85>] ? worker_thread+0x12e/0x249
[ 9463.399281]  [<ffffffff8104fb57>] ? process_one_work+0x2eb/0x2eb
[ 9463.399316]  [<ffffffff8104fb57>] ? process_one_work+0x2eb/0x2eb
[ 9463.399351]  [<ffffffff8105356e>] ? kthread+0x81/0x89
[ 9463.399385]  [<ffffffff813569a4>] ? kernel_thread_helper+0x4/0x10
[ 9463.399422]  [<ffffffff810534ed>] ? kthread_freezable_should_stop+0x53/0x53
[ 9463.399459]  [<ffffffff813569a0>] ? gs_change+0x13/0x13
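
The three reports above show the same task (kworker/0:0, pid 11244) blocked 
in the same XFS call path at three different times. For anyone comparing 
such reports across a longer dmesg, the following is a minimal Python 
sketch, not an official tool. The function name parse_hung_tasks and the 
regular expressions are assumptions made for illustration; they are written 
against the message format seen in this log only. The sketch also rejoins 
lines that a mail client has hard-wrapped.

```python
import re

# Start of a dmesg line: "[ 6110.298886]"
TS_RE = re.compile(r"\[\s*\d+\.\d+\]")

# Hung-task header as it appears in the log above, e.g.
# "[ 6110.298886] INFO: task kworker/0:0:11244 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+)\s+seconds"
)

# One stack frame, e.g. "[<ffffffffa02b81e7>] ? xfs_buf_lock+0x65/0xb2 [xfs]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in dmesg-style text."""
    # Rejoin hard-wrapped lines: a line without a "[timestamp]" prefix is
    # treated as the continuation of the previous line.
    lines = []
    for raw in log_text.splitlines():
        if not raw.strip():
            continue
        if TS_RE.match(raw) or not lines:
            lines.append(raw.rstrip())
        else:
            lines[-1] += " " + raw.strip()

    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "time": float(header.group("ts")),
                "task": header.group("comm"),
                "pid": int(header.group("pid")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Frames are listed innermost first, as printed by the kernel.
            current["frames"].append(frame.group("sym"))
    return reports
```

Calling parse_hung_tasks() on the text of a saved dmesg and comparing the 
"frames" lists of the returned reports shows whether each report is stuck 
in the same place, which here would be the xfs_buf_lock / xfs_getsb path.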


We tried the following Linux kernels, but the same errors occur on each:

3.0.33
3.2.14
3.2.18
3.3.7
3.4.1

We also tried the 2.6.35-13 kernel, but this kernel is unable to see 
the disks even when the modules are loaded.

See attachments for system information.

Note that we tried several controllers of the same model, but the errors persist.

Could you have a look into this issue?

If you need any more information I am happy to provide it.

Thanks

Kind regards,

Matthew Whittaker-Williams




Attachments:
dmesg.txt: <http://oss.sgi.com/pipermail/xfs/attachments/20120611/2c4defb4/attachment-0005.txt>
lsmod.txt: <http://oss.sgi.com/pipermail/xfs/attachments/20120611/2c4defb4/attachment-0006.txt>
lspci.txt: <http://oss.sgi.com/pipermail/xfs/attachments/20120611/2c4defb4/attachment-0007.txt>
mount.txt: <http://oss.sgi.com/pipermail/xfs/attachments/20120611/2c4defb4/attachment-0008.txt>
sysctl.txt: <http://oss.sgi.com/pipermail/xfs/attachments/20120611/2c4defb4/attachment-0009.txt>

