Bad performance with XFS + 2.6.38 / 2.6.39

Dave Chinner david at fromorbit.com
Sun Dec 11 19:00:53 CST 2011


On Mon, Dec 12, 2011 at 08:40:15AM +0800, Xupeng Yun wrote:
> On Mon, Dec 12, 2011 at 07:39, Dave Chinner <david at fromorbit.com> wrote:
> >
> > > ====== XFS + 2.6.29 ======
> >
> > Read 21GB @ 11k iops, 210MB/s, av latency of 1.3ms/IO
> > Wrote 2.3GB @ 1250 iops, 20MB/s, av latency of 0.27ms/IO
> > Total 1.5m IOs, 95% @ <= 2ms
> >
> > > ====== XFS + 2.6.39 ======
> >
> > Read 6.5GB @ 3.5k iops, 55MB/s, av latency of 4.5ms/IO
> > Wrote 700MB @ 386 iops, 6MB/s, av latency of 0.39ms/IO
> > Total 460k IOs, 95% @ <= 10ms, over 50% between 4ms and 10ms
> >
> > Looking at the IO stats there, this doesn't look to me like an XFS
> > problem. The IO times are much, much longer on 2.6.39, so that's the
> > first thing to understand. If the two tests are doing identical IO
> > patterns, then I'd be looking at validating raw device performance
> > first.
> >
> 
> Thank you Dave.
> 
> I also did raw device and ext4 performance tests with 2.6.39; all of these
> tests use identical IO patterns (non-buffered IO, 16 IO threads, 16KB block
> size, mixed random read and write, r:w = 9:1):
> ====== raw device + 2.6.39 ======
> Read 21.7GB @ 11.6k IOPS , 185MB/s, av latency of 1.37 ms/IO
> Wrote 2.4GB @ 1.3k IOPS, 20MB/s, av latency of 0.095 ms/IO
> Total 1.5M IOs, @ 96% <= 2ms
> 
> ====== ext4 + 2.6.39 ======
> Read 21.7GB @ 11.6k IOPS , 185MB/s, av latency of 1.37 ms/IO
> Wrote 2.4GB @ 1.3k IOPS, 20MB/s, av latency of 0.1 ms/IO
> Total 1.5M IOs, @ 96% <= 2ms
> 
> ====== XFS + 2.6.39 ======
> Read 6.5GB @ 3.5k iops, 55MB/s, av latency of 4.5ms/IO
> Wrote 700MB @ 386 iops, 6MB/s, av latency of 0.39ms/IO
> Total 460k IOs, 95% @ <= 10ms, over 50% between 4ms and 10ms

Oh, of course, now I remember what the problem is - it's a locking
issue that was fixed in 3.0.11, 3.1.5 and 3.2-rc1.

commit 0c38a2512df272b14ef4238b476a2e4f70da1479
Author: Dave Chinner <dchinner at redhat.com>
Date:   Thu Aug 25 07:17:01 2011 +0000

    xfs: don't serialise direct IO reads on page cache checks
    
    There is no need to grab the i_mutex or the IO lock in exclusive
    mode if we don't need to invalidate the page cache. Taking these
    locks on every direct IO effectively serialises them, as taking the
    IO lock in exclusive mode has to wait for all shared holders to
    drop the lock. That only happens when IO is complete, so it
    effectively prevents dispatch of concurrent direct IO reads to the
    same inode.
    
    Fix this by taking the IO lock shared to check the page cache state,
    and only then drop it and take the IO lock exclusively if there is
    work to be done. Hence for the normal direct IO case, no exclusive
    locking will occur.
    
    Signed-off-by: Dave Chinner <dchinner at redhat.com>
    Tested-by: Joern Engel <joern at logfs.org>
    Reviewed-by: Christoph Hellwig <hch at lst.de>
    Signed-off-by: Alex Elder <aelder at sgi.com>
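
To illustrate the change, here is a minimal userspace sketch of the
"check under the shared lock, go exclusive only when needed" pattern the
commit describes. It is not the XFS code: a pthread rwlock stands in for
the XFS iolock, and cache_needs_invalidation() / invalidate_cache() /
do_direct_io_read() are hypothetical placeholders for the page cache
check, the invalidation, and the actual read.

#include <pthread.h>
#include <stdbool.h>

static pthread_rwlock_t io_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Hypothetical stand-ins for the page cache state and the IO itself. */
static bool cache_needs_invalidation(void) { return false; }
static void invalidate_cache(void) { }
static void do_direct_io_read(void) { }

static void direct_io_read(void)
{
    /* Common case: check the cache state under the shared lock only. */
    pthread_rwlock_rdlock(&io_lock);

    if (cache_needs_invalidation()) {
        /*
         * Rare case: drop the shared lock and retake it exclusively
         * so the invalidation is serialised. Recheck after
         * reacquiring - another thread may have already done the
         * work while the lock was dropped.
         */
        pthread_rwlock_unlock(&io_lock);
        pthread_rwlock_wrlock(&io_lock);
        if (cache_needs_invalidation())
            invalidate_cache();
        /*
         * pthread rwlocks cannot be demoted, so drop and retake
         * shared here; the in-kernel iolock can instead be demoted
         * from exclusive back to shared.
         */
        pthread_rwlock_unlock(&io_lock);
        pthread_rwlock_rdlock(&io_lock);
    }

    /* Concurrent readers get here holding only the shared lock. */
    do_direct_io_read();

    pthread_rwlock_unlock(&io_lock);
}

int main(void)
{
    direct_io_read();
    return 0;
}

The old behaviour corresponds to taking the exclusive lock
unconditionally for every read, which serialises all direct IO reads on
the inode; with the shared-first check, concurrent reads only contend
when there is actually page cache to invalidate.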

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com
