Re: Bad performance with XFS + 2.6.38 / 2.6.39

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: Bad performance with XFS + 2.6.38 / 2.6.39
From: Yann Dupont <Yann.Dupont@xxxxxxxxxxxxxx>
Date: Tue, 03 Jan 2012 09:20:05 +0100
Cc: stan@xxxxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <20120102203543.GP23662@dastard>
References: <CACaf2aYTsxOBXEJEbQu7gwAminBc3R2usDHvypJW0AqOfnz0Pg@xxxxxxxxxxxxxx> <20111212010053.GM14273@dastard> <CACaf2ab-YjXAFm767MmRU5iuOmvkqQW3ZTfQewD5SGvF-opgYQ@xxxxxxxxxxxxxx> <4EF1A224.2070508@xxxxxxxxxxxxxx> <4EF1F6DD.8020603@xxxxxxxxxxxxxxxxx> <4EF21DD2.3060004@xxxxxxxxxxxxxx> <20111221222623.GF23662@dastard> <4EF2F702.4050902@xxxxxxxxxxxxxx> <4EF30E5D.7060608@xxxxxxxxxxxxxx> <4F0181A2.5010505@xxxxxxxxxxxxxx> <20120102203543.GP23662@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20111124 Thunderbird/8.0
On 02/01/2012 21:35, Dave Chinner wrote:
On Mon, Jan 02, 2012 at 11:06:26AM +0100, Yann Dupont wrote:

Hello, happy new year everybody,

Did anyone have time to examine the 2 blktraces? (And, by chance,
see the root cause of the increased load?)

I've had a bit of a look, but most people have been on holidays.

yep, of course, I was too :)

As it is, I can't see any material difference between the traces.
Both reads and writes are taking the same amount of time to service,
so I don't think there's any problem here.


I do recall that some years ago we changed one of the ways we

Do you recall exactly what "some years ago" means? Is this the post-2.6.26 era?

slept in XFS which meant those blocked IOs contributed to load
average (as they are supposed to). That meant that more IO
contributed to the load average (it might have been read related),
so load averages were then higher for exactly the same workloads.


load average: 0.64, 0.15, 0.09

(start 40 concurrent directory traversals w/ unlinks)

(wait a bit)

load average: 39.96, 23.75, 10.06

Yup, that is spot on - 40 processes doing blocking IO.....

So absent any measurable performance problem, I don't think the
change in load average is something to be concerned about.
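As a rough illustration of why 40 processes blocked on IO pull the load average towards 40: Linux load averages are exponentially damped moving averages of the number of runnable plus uninterruptibly sleeping tasks, sampled roughly every 5 seconds. The sketch below is a simplified floating-point model of that calculation, not the kernel's fixed-point code. The function name `simulate_load` and the 300-second duration are assumptions chosen for illustration, and it assumes the task count stays at a constant 40, so it is not expected to reproduce the exact figures quoted above.

```python
import math

# Assumed sampling period; the kernel samples roughly every 5 seconds.
SAMPLE_INTERVAL = 5.0


def simulate_load(initial, active_tasks, seconds, window):
    """Model one load-average figure as an exponentially damped average.

    initial      -- load average before the workload starts
    active_tasks -- constant number of runnable + uninterruptible tasks
    seconds      -- how long the workload has been running
    window       -- averaging window in seconds (60, 300 or 900)
    """
    decay = math.exp(-SAMPLE_INTERVAL / window)
    load = initial
    for _ in range(int(seconds / SAMPLE_INTERVAL)):
        # Each sample keeps a decayed share of the old value and mixes in
        # the current task count.
        load = load * decay + active_tasks * (1.0 - decay)
    return load


if __name__ == "__main__":
    # Starting values taken from the idle load average quoted in the thread.
    for window, start in ((60, 0.64), (300, 0.15), (900, 0.09)):
        value = simulate_load(start, active_tasks=40, seconds=300, window=window)
        print(f"{window // 60:>2}-min load after 300 s: {value:.2f}")
```

In this model the 1-minute figure approaches the task count quickly while the 5- and 15-minute figures lag behind, which is the same general shape as the numbers above. The CPU can be mostly idle the whole time, since tasks waiting on disk count just as much as tasks that are running.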

You're probably right: I have a graph in Cacti showing load average and detailed load usage (System/User/Nice/Wait, etc.). The load average is much higher now with 3.1.6, but the detailed load seems no different than before.

And for the moment, in real-world usage (that is, storing mail in folders and serving IMAP), the server seems no slower than before.

I'll keep an eye on it during high load.

Thanks for your answer,
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : - Mail/Jabber : Yann.Dupont@xxxxxxxxxxxxxx

