On Thu, Dec 16, 2010 at 10:38:47AM -0500, Christoph Hellwig wrote:
> On Mon, Dec 13, 2010 at 03:32:19PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > The xfsaild often tries to rest to wait for congestion to pass or for
> > IO to complete, but is regularly woken in tail-pushing situations.
> > In severe cases, the xfsaild is getting woken tens of thousands of
> > times a second. Reduce the number of needless wakeups by only waking
> > the xfsaild if the new target is larger than the old one. Further
> > make short sleeps uninterruptible as they occur when the xfsaild has
> > decided it needs to back off to allow some IO to complete and being
> > woken early is counter-productive.
> This patch causes softlockup warnings in xfsaild for various testcases
> on my 32-bit x86 VM, but the testcases continue otherwise normally.
> Example below:
> [ 361.692515] INFO: task xfsaild/vdb5:8705 blocked for more than 120 seconds.
> [ 361.697272] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 361.703929] xfsaild/vdb5 D 00000000 0 8705 2 0x00000000
> [ 361.708148] f4933f10 00000046 f4b37464 00000000 00000000 f4b37100 f4b37100 00000046
> [ 361.711501] f4933eb4 00000046 f4b37100 c0936092 f4b37264 f4b37268 00000000 c0d52d00
> [ 361.714786] c0d52d08 c0e96c00 f5735d38 f4933ec0 f4b37100 f6946c00 f4933eec c0160553
> [ 361.718120] Call Trace:
> [ 361.721856] [<c0936092>] ? _raw_spin_unlock_irq+0x22/0x30
> [ 361.723439] [<c0160553>] ? finish_task_switch+0x73/0x100
> [ 361.725056] [<c0160517>] ? finish_task_switch+0x37/0x100
> [ 361.726592] [<c09334b3>] ? schedule+0x263/0x9d0
> [ 361.727932] [<c0198f4b>] ? trace_hardirqs_off+0xb/0x10
> [ 361.729548] [<c0933f05>] schedule_timeout+0x185/0x250
> [ 361.731258] [<c09360d5>] ? _raw_spin_unlock_irqrestore+0x35/0x60
> [ 361.733037] [<c019c68b>] ? trace_hardirqs_on+0xb/0x10
> [ 361.734513] [<c04ed504>] xfsaild+0x54/0xc0
> [ 361.735786] [<c04ed4b0>] ? xfsaild+0x0/0xc0
> [ 361.737171] [<c0187634>] kthread+0x74/0x80
> [ 361.738446] [<c01875c0>] ? kthread+0x0/0x80
> [ 361.739987] [<c013507a>] kernel_thread_helper+0x6/0x1c
> [ 361.741589] no locks held by xfsaild/vdb5/8705.
So what this is saying is that a 20ms uninterruptible sleep is lasting for more
than 120s? Doesn't that imply some kind of scheduler starvation, not
an actual XFS problem?
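
For reference, the wakeup filtering described in the quoted commit
message ("only waking the xfsaild if the new target is larger than the
old one") can be sketched in userspace C roughly as below. This is a
minimal illustration, not the patch itself: struct ail_sketch and
ail_push_sketch() are hypothetical names, the LSN is treated as a plain
64-bit integer, and the actual thread wakeup is replaced by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AIL push state; not the real XFS structure. */
struct ail_sketch {
	uint64_t target;	/* highest LSN the push thread was asked to reach */
	unsigned int wakeups;	/* how many times the thread would have been woken */
};

/*
 * Request a push up to threshold_lsn.  Returns true only when the target
 * moved forward, which is the only case where a wakeup is useful.
 * Assumption: LSNs compare as plain integers in this sketch.
 */
static bool ail_push_sketch(struct ail_sketch *ail, uint64_t threshold_lsn)
{
	assert(ail != NULL);

	/* Target did not advance: the thread already has this work queued. */
	if (threshold_lsn <= ail->target)
		return false;

	ail->target = threshold_lsn;
	ail->wakeups++;		/* the kernel code would wake the thread here */
	return true;
}
```

With this filter, repeated push requests for the same or an older
target do not count as wakeups; only a request that advances the target
does.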