On Thu, Jan 16, 2014 at 11:51:52AM -0800, Ivan Novick wrote:
> I am running a server with heavy workload on a XFS mount:
> /dev/mapper/v2-d1 on /d/d1 type xfs (rw,nodev,noatime,inode64,allocsize=16m)
> 2.6.32-424.el6.x86_64 #1 SMP Mon Oct 14 20:11:50 EDT 2013 x86_64 x86_64
> x86_64 GNU/Linux
> We get errors in log indicating processes are blocked for more than 120
> seconds.
> 1) Is this expected during heavy workload?
> 2) What would be the impact on the processes? Are they basically hung in
> userspace waiting for IO?
> 3) Is there anything we tune here?
> Below is the output.
> Ivan Novick
> INFO: task flush-253:1:6882 blocked for more than 120 seconds.
> Tainted: P --------------- 2.6.32-424.el6.x86_64 #1
Proprietary kernel module taint on a RHEL/CentOS kernel, so there's
no guarantee anyone will be able to debug this here. If it's RHEL
you are using, please report it through your support channels...
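
For reference, the "P" in the Tainted line corresponds to bit 0 of the
value in /proc/sys/kernel/tainted. A minimal sketch for decoding that
value follows. The flag table is only a subset of the upstream taint
bits (some, such as "O", postdate 2.6.32, and RHEL kernels may define
extra ones), and the helper names decode_taint and read_taint are made
up for illustration:

```python
# Subset of upstream kernel taint bits. This is an illustrative table,
# not a complete one; vendor kernels (e.g. RHEL) may add their own flags.
TAINT_FLAGS = {
    0: ("P", "proprietary module loaded"),
    1: ("F", "module force-loaded"),
    7: ("D", "kernel oops or BUG occurred"),
    9: ("W", "kernel warning issued"),
    12: ("O", "out-of-tree module loaded"),
}


def decode_taint(mask):
    """Return the flag letters for every known bit set in mask."""
    return [TAINT_FLAGS[bit][0] for bit in sorted(TAINT_FLAGS) if mask & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the taint bitmask that the running kernel exposes."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    try:
        mask = read_taint()
    except OSError:
        mask = 0  # not on Linux, or /proc is unavailable
    print(mask, decode_taint(mask))
```

A non-zero value with bit 0 set is what produces the "P" shown in the
report above.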
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> flush-253:1 D 0000000000000000 0 6882 2 0x00000000
> ffff881011fa5690 0000000000000046 0000000000000000 ffff881013cba040
> 0000000000000001 ffff88012a447d40 ffff881013cba040 ffff881016eb4830
> ffff881013cba5f8 ffff881011fa5fd8 000000000000fbc8 ffff881013cba5f8
> Call Trace:
> [<ffffffff815289d3>] io_schedule+0x73/0xc0
> [<ffffffff81267d18>] get_request_wait+0x108/0x1d0
> [<ffffffff8109b4a0>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff812618ce>] ? elv_merge+0x17e/0x1c0
> [<ffffffff81267e79>] blk_queue_bio+0x99/0x620
> [<ffffffff81266f00>] generic_make_request+0x240/0x5a0
Waiting for IO completion. Looks like you've either severely
overloaded your storage subsystem, or it's died and isn't responding