concurrent direct IO write in xfs
Zheng Da
zhengda1936 at gmail.com
Wed Jan 25 15:20:12 CST 2012
Hello Dave,
On Mon, Jan 23, 2012 at 10:54 PM, Dave Chinner <david at fromorbit.com> wrote:
>
> > >> > So the test case is pretty simple and I think it's easy to
> > >> > reproduce it.
> > >> > It'll be great if you can try the test case.
> > >>
> > >> Can you post your test code so I know what I test is exactly what
> > >> you are running?
> > >>
> > > I can do that. My test code gets very complicated now. I need to
> > > simplify it.
> > >
> > Here is the code. It's still a bit long. I hope it's OK.
> > You can run the code like "rand-read file option=direct pages=1048576
> > threads=8 access=write/read".
>
> With 262144 pages on a 2GB ramdisk, the results I get on 3.2.0 are:
>
> Threads   Read    Write
>    1      0.92s   1.49s
>    2      0.51s   1.20s
>    4      0.31s   1.34s
>    8      0.22s   1.59s
>   16      0.23s   2.24s
>
> The contention is on the ip->i_ilock, and the newsize update is one
> of the offenders. It probably needs this change to
> xfs_aio_write_newsize_update():
>
> - if (new_size == ip->i_new_size) {
> + if (new_size && new_size == ip->i_new_size) {
>
> to avoid the lock being taken here.
>
> But all that newsize crap is gone in the current Linus git tree,
> so how much does that gain us?
>
> Threads   Read    Write
>    1      0.88s   0.85s
>    2      0.54s   1.20s
>    4      0.31s   1.23s
>    8      0.27s   1.40s
>   16      0.25s   2.36s
>
> Pretty much nothing. IOW, it's just like I suspected - you are
> doing so many write IOs that you are serialising on the extent
> lookup and write checks, which use exclusive locking.
>
> Given that it is 2 lock traversals per write IO, we're limited to
> about 400,000-500,000 exclusive lock grabs per second, decreasing as
> contention goes up.
>
> For reads, we are doing 2 shared (nested) lookups per read IO, and we
> appear to be limited to around 2,000,000 shared lock grabs per
> second. Amdahl's law is kicking in here, but it means that if we could
> make the writes use a shared lock, they would at least scale like
> the reads for this "no metadata modification except for mtime"
> overwrite case.
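
As a rough check of the lock rates quoted above, the arithmetic can be sketched as follows. This assumes each of the 262144 pages is issued as one IO and that each IO takes 2 lock traversals, as described in the quoted text; the function name is made up for illustration.

```c
#include <assert.h>

/*
 * Lock acquisitions per second implied by one benchmark run.
 * ios:          number of IOs issued (assumed: one IO per page, 262144)
 * locks_per_io: lock traversals per IO (2 in the quoted analysis)
 * seconds:      wall-clock time taken from the results table
 */
double lock_grabs_per_sec(long ios, int locks_per_io, double seconds)
{
    assert(seconds > 0.0);
    return (double)ios * locks_per_io / seconds;
}
```

With the 2-thread write time of 1.20s this gives roughly 437,000 grabs per second, and with the 16-thread read time of 0.25s roughly 2,100,000, which is in line with the figures quoted above.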
>
> I don't think that the generic write checks absolutely need
> exclusive locking - we probably could get away with a shared lock
> and only fall back to exclusive when we need to do EOF zeroing.
> Similarly, for the block mapping code, if we don't need to do
> allocation, a shared lock is all we need. So maybe in that case for
> direct IO when create == 1, we can do a read lookup first and only
> grab the lock exclusively if that falls in a hole and requires
> allocation...
Do you think you could provide a patch for these changes?
Thanks,
Da