Nathan Scott wrote:
> On Wed, Nov 24, 2004 at 06:12:33PM +0100, Anders Saaby wrote:
>> Hi Lists, (XFS list CC'ed)
> Hi there,
> Yep, very interested. So, "serving IMAP from Maildirs" - from
> the filesystems perspective, can you describe that in detail for
> me? I would guess that means a shallow directory tree, with quite
> large directories (how large?) and many (how many?) small files?
> (how small on average?) How frequently are files added/removed?
OK - Here's the deal:
/hsphere/local/var/vpopmail/domains holds the Maildirs.
/var/qmail/queue holds the mail queue for qmail.
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda2    29G   11G    17G   39%  /
/dev/sda1    45M   30M    13M   70%  /boot
/dev/sdb1   137G   71G    66G
/dev/sda6   2.0G   40M   1.9G    2%  /var/qmail/queue
Filesystem     Inodes    IUsed      IFree  IUse%  Mounted on
/dev/sda2     3908128   170504    3737624     5%  /
/dev/sda1       12048       62      11986     1%  /boot
/dev/sdb1   143372224  2891398  140480826
/dev/sda6     2096192     1556    2094636     1%  /var/qmail/queue
Actually, I think it is the mail queue mount that causes the errors.
We originally changed the Maildir mount from ext3 to XFS, but after
~17 hours the server Oopsed again, still in kjournald (as in all earlier
incidents). After that I changed the mail queue mount from ext3 to XFS as
well; the server failed again, but this time in XFS. So it seems that the
fairly small mail queue mount is responsible for the errors.
Here are some statistics for the mailserver:
~50-80 mails per minute delivered locally.
~150 mails per minute forwarded.
~80,000 mailboxes (Maildirs), so these are relatively small directories.
There are usually around 30-50 qmail-queue processes running (qmail-queue
is the process that writes to /var/qmail/queue).
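
To put rough numbers on "how small on average?", the df figures above can be combined as in the sketch below. It assumes /dev/sdb1 is the Maildir volume (its mount point is missing from the listing) and that df -h reports sizes in powers of 1024. The function names are made up for illustration. The inode count also includes directories (each Maildir has at least cur, new and tmp), so this is only an estimate; it points to a few tens of KiB per file and a few dozen entries per mailbox.

```python
# Figures copied from the df listings above. Treating /dev/sdb1 as the
# Maildir volume is an assumption: its mount point is not shown.
USED_GIB = 71            # "Used" column of df -h, already rounded by df
INODES_USED = 2_891_398  # "IUsed" column of df -i
MAILBOXES = 80_000       # approximate mailbox count from the statistics


def bytes_per_inode(used_gib, inodes_used):
    """Average space consumed per inode (files and directories alike)."""
    return used_gib * 1024**3 / inodes_used


def inodes_per_mailbox(inodes_used, mailboxes):
    """Average number of inodes (messages plus directories) per mailbox."""
    return inodes_used / mailboxes


avg_kib = bytes_per_inode(USED_GIB, INODES_USED) / 1024
per_box = inodes_per_mailbox(INODES_USED, MAILBOXES)
print(f"about {avg_kib:.0f} KiB per inode, about {per_box:.0f} inodes per mailbox")
```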
> Is this easily reproducible for you? If so, can you send me
> enough details that I can try to reproduce it locally?
Well, as you can imagine, this is a production server, and it has been
crashing quite consistently every 17-18 hours for the last couple of weeks.
I haven't set up a test environment yet, but will do so very soon.
- The server is now running a 2.4.28 kernel, which has been up for 37
hours without failure (Woohoo!). But as the 2.4 kernel lacks some of 2.6's
performance, the server is under fairly high load now, and we have
to get it back to 2.6 as soon as possible.
- Do you have any ideas for a test setup? I'm not really sure I can
reproduce the environment the production mailserver is in. There are
just too many factors to account for.
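
As a rough starting point for such a test, the write pattern on the queue mount could be approximated with something like the sketch below. It is not what qmail-queue really does internally. It only imitates many concurrent processes creating, syncing, renaming and deleting small files in one directory. The names queue_worker and run_load, the 4096-byte message size and the default counts are all assumptions picked for illustration (40 workers sits in the 30-50 range mentioned above). It relies on fork, so it is Unix only.

```python
import multiprocessing as mp
import os
import tempfile


def queue_worker(queue_dir, worker_id, messages, size, done):
    """Create, sync, rename and delete small files, one after another."""
    payload = b"x" * size
    for n in range(messages):
        tmp_path = os.path.join(queue_dir, f"tmp.{worker_id}.{n}")
        final_path = os.path.join(queue_dir, f"msg.{worker_id}.{n}")
        with open(tmp_path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())         # push the data to disk
        os.rename(tmp_path, final_path)  # atomic within one filesystem
        os.unlink(final_path)            # pretend the message was delivered
        with done.get_lock():
            done.value += 1


def run_load(queue_dir, workers=40, messages=1000, size=4096):
    """Run `workers` processes against queue_dir; return files handled."""
    ctx = mp.get_context("fork")  # Unix only
    done = ctx.Value("i", 0)      # shared counter with its own lock
    procs = [
        ctx.Process(target=queue_worker,
                    args=(queue_dir, w, messages, size, done))
        for w in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return done.value


if __name__ == "__main__":
    # Small smoke run in a throwaway directory.
    with tempfile.TemporaryDirectory() as scratch:
        print(run_load(scratch, workers=4, messages=50))
```

For a real test, queue_dir would point at a scratch mount of the filesystem under test (never the live /var/qmail/queue), and the run would be left going for many hours, since the production failures only show up after 17-18 hours.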