
Re: XFS performance issues: O_DIRECT and Linux 2.6.6+

To: Nathan Scott <nathans@xxxxxxx>
Subject: Re: XFS performance issues: O_DIRECT and Linux 2.6.6+
From: James Foris <james.foris@xxxxxxxxxx>
Date: Tue, 14 Sep 2004 11:53:38 -0500
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <20040914095914.A4118499@xxxxxxxxxxxxxxxxxxxxxxxx>
References: <411A8410.2030000@xxxxxxxxxx> <20040910041106.GA14336@frodo> <4144B19A.2020407@xxxxxxxxxx> <4145D141.1040907@xxxxxxxxxx> <20040914095914.A4118499@xxxxxxxxxxxxxxxxxxxxxxxx>
Reply-to: james.foris@xxxxxxxxxx
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040624
Nathan Scott wrote:
Hi James,

On Mon, Sep 13, 2004 at 11:56:33AM -0500, James Foris wrote:

More correctly, it happened between 2.6.5 and 2.6.5-bk1

So..... something in the 2.6.5-bk1 patchset caused the change.
Any suggestions where to begin looking (other than fs/direct_io.x) ?


search for "direct" -- looks like -bk1 includes all the changes I
was referring to earlier (and a bunch more) :(  So, the needle is
somewhere in that haystack...

Yes, and no.

I think I have figured out what is happening - and I would consider it a bug.

I put a drop-through printk in mm/filemap.c to report when an O_DIRECT write
hits this fallback path:

             /*
              * If we get here for O_DIRECT writes then we must have fallen
              * through to buffered writes (block instantiation inside
              * i_size).  So we sync the file data here, to try to honour
              * O_DIRECT expectations.
              */
             if (unlikely(file->f_flags & O_DIRECT) && written)
                     status = filemap_write_and_wait(mapping);
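
The instrumentation I added looked something like the following (a sketch
only; the exact message text here is my own, not the kernel's):

             if (unlikely(file->f_flags & O_DIRECT) && written) {
                     /* Flag every O_DIRECT write that dropped through
                      * to the buffered path. */
                     printk(KERN_DEBUG
                            "O_DIRECT write fell back to buffered I/O\n");
                     status = filemap_write_and_wait(mapping);
             }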

I noticed that when I re-ran the tests, I started seeing these markers even
though the writes were boundary and size aligned. So I ran the following
sequence (with the results shown):

./write-bench --num-writes 1 --write-size 0x10000 --sync --direct /raw_data2/write.dat
wrote    0.062 MB in   1 writes, 0.062 MB/write, 0.001 sec; 56.408 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x18000 --sync --direct /raw_data2/write.dat
wrote    0.094 MB in   1 writes, 0.094 MB/write, 0.001 sec; 67.784 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x1c000 --sync --direct /raw_data2/write.dat
wrote    0.109 MB in   1 writes, 0.109 MB/write, 0.002 sec; 69.844 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x20000 --sync --direct /raw_data2/write.dat
wrote    0.125 MB in   1 writes, 0.125 MB/write, 0.002 sec; 76.359 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x24000 --sync --direct /raw_data2/write.dat
wrote    0.141 MB in   1 writes, 0.141 MB/write, 0.002 sec; 81.146 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x28000 --sync --direct /raw_data2/write.dat
wrote    0.156 MB in   1 writes, 0.156 MB/write, 0.002 sec; 85.475 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x2c000 --sync --direct /raw_data2/write.dat
wrote    0.172 MB in   1 writes, 0.172 MB/write, 0.023 sec; 7.482 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x30000 --sync --direct /raw_data2/write.dat
wrote    0.188 MB in   1 writes, 0.188 MB/write, 0.022 sec; 8.649 MB/s 0.01 % 

./write-bench --num-writes 1 --write-size 0x34000 --sync --direct /raw_data2/write.dat
wrote    0.203 MB in   1 writes, 0.203 MB/write, 0.016 sec; 13.043 MB/s 0.01 % 

So, what appears to be happening is that the new logic treats ANYTHING past
the first transaction (the first 150K of the write) as a residual requiring
buffered/fully synchronous operation, regardless of boundary or size. In
effect, O_DIRECT no longer works for files greater than 150K.

Does this analysis sound about right to you ?

Jim Foris

