Theo E. Schlossnagle [jesus@xxxxxxxxxx] wrote:
> Hello all,
> We have been running exim for a while and we run it on over 75 machines
> (Linux, BSDs, Solaris). We recently started using SGI's xfs filesystem for
> most of our operations because of its speed and stability -- we are _very_
> happy with it. I have never had any problems with it ... until now.
> Exim v3.14, v3.22, v3.33 and Linux 2.4.2-xfs. The xfs partition in question is
> running atop a RAID-1 md device on two 9GB scsi drives.
> After running Exim with its spool directory on an xfs partition and under
> low load (100 messages/minute) I would soon get an Exim process spinning CPU
> bound and I could not kill it [kill -9 did nothing]. The system was stuck
> on disk writes (so any process that calls fsync or friends would get stuck
> in the run queue never to come out again.) No modified files were written
> to disk (by any process) after this point. A reboot was required to
> restore "normal" operation.
> We tried many things to fix this with no success, but as soon as we
> configured exim to use a non-xfs (ext2 in this case) mounted spool
> directory, the problem instantly disappeared.
> It looked as if the kernel had a thread stuck writing to or reading from the
> filesystem journal. If anyone knows a solution to this problem, I am all
> ears. Otherwise, steer clear of running your Exim spools on xfs.
> Theo Schlossnagle
> 1024D/82844984/95FD 30F1 489E 4613 F22E 491A 7E88 364C 8284 4984
> 2047R/33131B65/71 F7 95 64 49 76 5D BA 3D 90 B9 9F BE 27 24 E7
You might try a newer version of xfs; I have been running the CVS version
dated Aug 2nd.
Though a production box, its load is not what yours is, and is not running
SGI is very active in supporting XFS.
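
If it helps when comparing kernels or filesystems, the symptom described
above (writers stuck in fsync) can be probed outside Exim. The following is
only a rough sketch, assuming Python 3 is available on the box; the function
name probe_fsync and the 10 second default timeout are made up for
illustration and have nothing to do with Exim or XFS themselves. It writes a
small file in the given directory, calls fsync on it from a helper thread,
and reports whether that finished in time.

```python
import os
import tempfile
import threading


def probe_fsync(directory, timeout=10.0, payload=b"x" * 4096):
    """Return True if a small write plus fsync in `directory` completes
    within `timeout` seconds, False if it fails or does not return in time.

    Hypothetical helper for illustration only.
    """
    result = {}

    def worker():
        # Create a scratch file in the directory under test.
        try:
            fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
        except OSError:
            return  # directory missing or not writable: report failure
        try:
            os.write(fd, payload)
            os.fsync(fd)  # this is the call that would hang on a stuck filesystem
            result["ok"] = True
        except OSError:
            pass
        finally:
            os.close(fd)
            os.unlink(path)

    # Daemon thread: if fsync never returns, the caller still gets an answer.
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    return result.get("ok", False)
```

For example, probe_fsync("/var/spool/exim") (path assumed, adjust to your
spool directory) could be run from cron on both the xfs and the ext2 mount.
A False result only says the write did not complete in time; it does not say
why, and a thread stuck inside the kernel may keep the Python process from
exiting cleanly.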
*--* Mail: lawrence@xxxxxxxx
*--* Voice: 425.739.4247
*--* Fax: 425.827.9577
- - - - - - O t a k i n c . - - - - -