Re: Extreme I/O latency

To: Fredrik Tolf <fredrik@xxxxxxxxxxxxx>
Subject: Re: Extreme I/O latency
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 2 Oct 2012 12:20:41 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <alpine.DEB.2.02.1210020338580.3390@xxxxxxxxxxxxxxxxxxx>
References: <alpine.DEB.2.02.1210020338580.3390@xxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Oct 02, 2012 at 03:53:08AM +0200, Fredrik Tolf wrote:
> Dear list,
> I'm having some problems with a Linux system using XFS filesystems,
> on top of LVM, on top of mdraid, and I'm lacking ideas for how to
> proceed with debugging it. The problem manifests itself in that
> certain, simple I/O operations sometimes take extremely long to
> complete -- not seldomly up to 20-30 seconds!

What is a "simple IO operation"?

> I used to have lesser problems of a similar kind previously, but
> this extremeness only started showing up since I upgraded the system
> from Debian Lenny (using Linux 2.6.26) to Squeeze (using 2.6.32).
> I've since upgraded to 3.2.0, and now to 3.5.4, and they all exhibit
> the same problem.
> The process having the worst problems with it usually sees them when
> it calls upon Berkeley DB, the stack traces in which seem to tell
> me that it's trying to do mmap'ed I/O in its region files, so I can
> only assume that the stop happens when it's pulling in pages from
> disk. I can't say I know for sure, but I'm getting the feeling that
> it happens when some other process calls fdatasync() or somesuch
> operation. I get this feeling because the problems very often seem
> to happen exactly when I fetch a MySQL-backed webpage from the
> system's HTTP server (at which point mysqld syncs its data to disk
> after some session table update or the like).

So it is causing random 4k write IO?

> Does anyone have any clue as to what might cause symptoms like
> these, or, if not, how I can debug the issue further? Admittedly,
> it's not as if I can be sure that the problem belongs with XFS
> proper rather than LVM or mdraid, but I have to begin somewhere. At
> least XFS is the direct interface that my programs call before
> getting stuck. :)

More information is needed about your setup and about what is
happening during the hangs:


Also: ftrace or latencytop might point you at where the latency
is occurring. Then we might have some idea of what is causing it.
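
Before reaching for ftrace, it can help to put rough numbers on the
stalls. The sketch below is not from this thread; the function name
probe_sync_latency and its parameters are made up for illustration. It
assumes a Linux system where Python's os.fdatasync() is available, and
it only times small write-plus-fdatasync cycles in a scratch file. It
shows when stalls happen and how long they last, not where in the
stack the time goes, so tracing is still needed for that.

```python
import os
import tempfile
import time


def probe_sync_latency(directory, rounds=20, block_size=4096):
    """Hypothetical helper: time `rounds` write+fdatasync cycles.

    Creates a scratch file in `directory`, appends one block per round,
    forces it to disk with fdatasync(), and returns the per-round
    latencies in seconds. The scratch file is removed afterwards.
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"\0" * block_size
        for _ in range(rounds):
            start = time.monotonic()
            os.write(fd, block)
            os.fdatasync(fd)  # assumption: Linux; not available on every platform
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Run this from a directory on the affected filesystem.
    samples = probe_sync_latency(".")
    print("max %.3fs, mean %.3fs" % (max(samples), sum(samples) / len(samples)))
```

Running it in a loop while fetching the web page that seems to trigger
the problem would show whether the maximum jumps at the same moment.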


Dave Chinner