Bad performance with XFS + 2.6.38 / 2.6.39
Yann Dupont
Yann.Dupont at univ-nantes.fr
Wed Jan 4 04:54:43 CST 2012
On 02/01/2012 17:08, Peter Grandi wrote:
> [ ... ]
>
>>> On two particular server, with recent kernels, I experience a
>>> much higher load than expected, but it's very hard to tell
>>> what's wrong. The system seems more in I/O wait. Older
>>> kernels (2.6.32.xx and 2.6.26.xx) gives better results.
> [ ... ]
>> When I go back to older kernels, the load go down. With newer
>> kernel, all is working well too, but load (as reported by
>> uptime) is higher.
> [ ... ]
>>> birnie:~/TRACE# uptime
>>> 11:48:34 up 17:18, 3 users, load average: 0.04, 0.18, 0.23
>
>>> penderyn:~/TRACE# uptime
>>> 11:48:30 up 23 min, 3 users, load average: 4.03, 3.82, 3.21
> [ ... ]
>
> But 'uptime' reports the load average, which is (roughly)
> processes actually running on the CPU. If the load average is
More or less. I generally have 5000+ processes on those servers. The
load generally reflects a mix of CPU usage (which is unchanged, as the
dovecot setup is unchanged) and I/O wait. So, naively, I'd say that if the
load average is higher than usual, it's because I/O wait is higher.
As the kernel had big changes, it could be XFS, but it could be DM or the
I/O scheduler as well.
But that doesn't seem to be the case.
> higher, that usually means that the file system is running
> better, not worse.
If delivery is I/O bound, yes, but that's not the case in this particular
setup.
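For reference, on Linux the load average counts tasks that are runnable
plus tasks in uninterruptible sleep (state D, usually waiting on I/O), so
one way to see which part dominates is to compare the two counts. Below is
a minimal sketch in Python, assuming a Linux /proc filesystem; the function
names are hypothetical and it is only an illustration, not a tool anyone
in this thread used.

```python
import glob


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    one, five, fifteen = (float(x) for x in fields[:3])
    # Fourth field is "runnable/total" scheduling entities.
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {"1min": one, "5min": five, "15min": fifteen,
            "runnable": runnable, "total": total}


def state_from_stat(text):
    """Return the one-letter task state from /proc/<pid>/stat contents."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST closing parenthesis.
    return text[text.rindex(")") + 1:].split()[0]


def count_states():
    """Count tasks per state by scanning /proc/<pid>/stat."""
    counts = {}
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                state = state_from_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        counts[state] = counts.get(state, 0) + 1
    return counts


if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        print(parse_loadavg(f.read()))
    counts = count_states()
    print("running (R):", counts.get("R", 0))
    print("uninterruptible, usually I/O (D):", counts.get("D", 0))
```

If the D count is consistently high while the R count stays low, the extra
load is likely I/O wait rather than CPU; a single snapshot is noisy, so it
would need to be sampled repeatedly.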
> It looks as if you are not clear whether you
> have a regression or an improvement.
I was just signaling an unusual load average, nothing else. As far as I
can see, response times are still correct. I'm not experiencing a
performance problem. I'm not the original author of the thread. I probably
should have changed the subject of the thread, sorry for that.
>
> For a mail server the relevant metric is messages processed per
> second, or alternatively median and maximum times to process a
> message, rather than "average" processes running.
>
...
> So you are expecting for a large system critical problem for
> which you yourself do not have the resource to do testing to see
> quick response times over the Christmas and New Year period.
> What's your XFS Platinum Psychic Support Account number? :-)
I'm not expecting anything. I know how open source works. Everything is
working fine, thank you. I was just "upping" because I saw that my traces
were downloaded last week. It's not always easy for non-native speakers to
send mails without sounding aggressive/offensive. If that was the case, I
can assure you that was not the intent.
>
> BTW rereading the description of the setup:
>
>>>>>> Thoses servers are mail (dovecot) servers, with lots of
>>>>>> simultaneous imap clients (5000+) an lots of simultaneous
>>>>>> message delivery. These are linux-vservers, on top of LVM
>>>>>> volumes. The storage is SAN with 15k RPM SAS drives (and
>>>>>> battery backup). I know barriers were disabled in older
>>>>>> kernels, so with recents kernels, XFS volumes were mounted
>>>>>> with nobarrier.
>
>>>>> 1. What mailbox format are you using? Is this a constant
>>>>> or variable?
>>>> Maildir++
>
> I am stunned by the sheer (euphemism alert) audacity of it all.
> This setup is (euphemism alert) amazing.
Can you elaborate, please? This particular setup has been running fine for
7 years now, has scaled up very well (up to 70k mailboxes with a
similar setup for students) with few modifications (replacing
Courier with dovecot, and upgrading servers, for example) and has proved
very stable since, despite numerous power outages, for example...
I can give you the detailed setup if you want, off list; I think it has
nothing to do with xfs.
>
> Unfortunately the problem of large busy mailstores is vastly
> underestimated by many, and XFS has little to do with it.
>
I'm really not sure I underestimate it, but I'll be glad to hear your
recommendations. Off list, I think.
Cheers,
--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont at univ-nantes.fr