Re: 12x performance drop on md/linux+sw raid1 due to barriers [xfs]

To: Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx>
Subject: Re: 12x performance drop on md/linux+sw raid1 due to barriers [xfs]
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Sat, 06 Dec 2008 09:36:07 -0600
Cc: linux-raid@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, Alan Piszcz <ap@xxxxxxxxxxxxx>
In-reply-to: <alpine.DEB.1.10.0812060928030.14215@xxxxxxxxxxxxxxxx>
References: <alpine.DEB.1.10.0812060928030.14215@xxxxxxxxxxxxxxxx>
User-agent: Thunderbird (Macintosh/20081105)

Justin Piszcz wrote:
> Someone should write a document with XFS and barrier support, if I recall,
> in the past, they never worked right on raid1 or raid5 devices, but it
> appears now that they work on RAID1, which slows down performance ~12 times!!

What sort of document do you propose?  xfs will enable barriers on any
block device which will support them, and after:


[XFS] Disable queue flag test in barrier check.

xfs is able to determine, via a test IO, that md raid1 does pass
barriers through properly even though it doesn't set an ordered flag on
the queue.

> l1:~# /usr/bin/time tar xf linux- 
> 0.15user 1.54system 0:13.18elapsed 12%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+325minor)pagefaults 0swaps
> l1:~#
> l1:~# /usr/bin/time tar xf linux-
> 0.14user 1.66system 2:39.68elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+324minor)pagefaults 0swaps
> l1:~#
> Before:
> /dev/md2        /               xfs     defaults,noatime  0       1
> After:
> /dev/md2        /               xfs     defaults,noatime,nobarrier,logbufs=8,logbsize=262144 0 1

Well, if you're investigating barriers, can you do a test with just the
barrier option changed (leaving logbufs and logbsize alone)?  I expect
you'll still find it has a substantial impact, though.
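
For reference, the ~12x figure follows directly from the two elapsed times
quoted above.  A minimal sketch in Python; the helper name
elapsed_to_seconds is made up for illustration, and it assumes the
"m:ss.ss" (or "h:mm:ss") elapsed format that /usr/bin/time prints:

```python
def elapsed_to_seconds(elapsed: str) -> float:
    """Convert an elapsed field such as '2:39.68' or '1:02:03' to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed times copied from the two quoted `tar xf` runs.
fast = elapsed_to_seconds("0:13.18")
slow = elapsed_to_seconds("2:39.68")

print(f"{fast:.2f}s vs {slow:.2f}s: {slow / fast:.1f}x slower")
```

This only compares wall-clock time; since both fstab options and the
barrier setting changed between the runs, it says nothing yet about how
much of the difference is due to barriers alone.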

> There is some mention of it here:
> http://oss.sgi.com/projects/xfs/faq.html#wcache_persistent
> But basically I believe it should be noted in the kernel logs, FAQ or somewhere
> because just through the process of upgrading the kernel, not changing fstab
> or any other part of the system, performance can drop 12x just because the
> newer kernels implement barriers.


printk(KERN_ALERT "XFS is now looking after your metadata very
carefully; if you prefer the old, fast, dangerous way, mount with -o


Really, this just gets xfs on md raid1 in line with how it behaves on
most other devices.

But I agree, some documentation/education is probably in order; if you
choose to disable write caches or you have faith in the battery backup
of your write cache, turning off barriers would be a good idea.  Justin,
it might be interesting to do some tests with:

barrier,   write cache enabled
nobarrier, write cache enabled
nobarrier, write cache disabled

A 12x hit does hurt, though...  If you're really motivated, try the same
scenarios on ext3 and ext4 to see what the barrier hit is on those as well.
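
One way to keep the three scenarios straight is to generate the commands
for each run up front.  The following is only a rough sketch, not a tested
benchmark harness: it prints commands rather than running them, the member
disks /dev/sda and /dev/sdb and the TARBALL placeholder are assumptions
(substitute your own), it assumes the drive write cache is toggled with
hdparm -W, and it assumes your kernel lets you switch barrier/nobarrier on
a remount (if not, change fstab and reboot between runs instead).

```python
# The three scenarios listed above: (mount option, drive write cache on?)
SCENARIOS = [
    ("barrier", True),
    ("nobarrier", True),
    ("nobarrier", False),
]


def plan(mountpoint, member_disks, tarball):
    """Return (label, commands) per scenario; commands are meant to be run as root."""
    steps = []
    for barrier_opt, cache_on in SCENARIOS:
        # Set the write cache on every disk underneath the md device.
        cmds = [f"hdparm -W{int(cache_on)} {disk}" for disk in member_disks]
        # Change only the barrier option, leaving other mount options alone.
        cmds.append(f"mount -o remount,{barrier_opt} {mountpoint}")
        cmds.append(f"/usr/bin/time tar xf {tarball}")
        state = "enabled" if cache_on else "disabled"
        steps.append((f"{barrier_opt}, write cache {state}", cmds))
    return steps


# Hypothetical device names and tarball placeholder, for illustration only.
for label, cmds in plan("/", ["/dev/sda", "/dev/sdb"], "TARBALL"):
    print(f"# {label}")
    for cmd in cmds:
        print(cmd)
```

Extracting into a fresh directory each time, and repeating each run a few
times, would make the numbers easier to compare.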


