Exim and XFS filesystem



Hello all,

We have been running Exim for a while, and we run it on over 75 machines 
(Linux, BSDs, Solaris).  We recently started using SGI's xfs filesystem for 
most of our operations because of its speed and stability -- we are _very_ 
happy with it.  I have never had any problems with it ... until now.

Versions involved: Exim v3.14, v3.22 and v3.33, with Linux 2.4.2-xfs.  The 
xfs partition in question is running atop a RAID-1 md device on two 9GB SCSI 
drives.

After running Exim with its spool directory on an xfs partition under low 
load (100 messages/minute), I would soon get an Exim process spinning, 
CPU-bound, that I could not kill [kill -9 did nothing].  The system was stuck 
on disk writes (so any process that called fsync or friends would get stuck 
in the run queue, never to come out again).  No modified files were written 
to disk (by any process) after this point.  A reboot was required to restore 
"normal" operation.
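
For anyone who wants to check whether a spool filesystem has reached this 
state, here is a minimal sketch in Python.  It is an illustration only, not 
something from the original setup: the function name fsync_probe, the probe 
file name and the 10 second timeout are all assumptions.  It does the 
write-plus-fsync in a child process, because a process blocked in the kernel 
on fsync may be unkillable, and the parent just reports whether the child 
finished in time.

```python
import os
import subprocess
import sys

# Child program: write a small file and force it to disk, roughly what an
# MTA does for each message it spools.
CHILD = (
    "import os, sys\n"
    "fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)\n"
    "os.write(fd, b'probe\\n')\n"
    "os.fsync(fd)\n"
    "os.close(fd)\n"
)


def fsync_probe(directory, timeout=10.0):
    """Return True if a write+fsync in `directory` completes within `timeout` seconds.

    Hypothetical helper: the name, the probe file name and the default
    timeout are assumptions made for this sketch.
    """
    path = os.path.join(directory, "fsync-probe.%d" % os.getpid())
    child = subprocess.Popen([sys.executable, "-c", CHILD, path])
    try:
        ok = child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # The child may be stuck in the kernel and immune to signals,
        # so it is deliberately left alone rather than killed.
        return False
    if os.path.exists(path):
        os.remove(path)
    return ok


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print("fsync completed" if fsync_probe(target) else "fsync did not complete")
```

Run it with the spool directory as its only argument.  A healthy filesystem 
should report completion almost immediately; a report that fsync did not 
complete means either the write failed outright or it was still blocked when 
the timeout expired.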

We tried many things to fix this with no success, but as soon as we configured 
Exim to use a non-xfs (ext2 in this case) mounted spool directory, the problem 
instantly disappeared.

It looked as if the kernel had a thread stuck writing to or reading from the 
filesystem journal.  If anyone knows a solution to this problem, I am all 
ears.  Otherwise, steer clear of running your Exim spools on xfs.

-- 
Theo Schlossnagle
1024D/82844984/95FD 30F1 489E 4613 F22E  491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA  3D 90 B9 9F BE 27 24 E7