On 10/7/15 9:18 AM, Gleb Natapov wrote:
> Hello XFS developers,
>
> We are working on scylladb[1] database which is written using seastar[2]
> - highly asynchronous C++ framework. The code uses aio heavily: no
> synchronous operation is allowed at all by the framework otherwise
> performance drops drastically. We noticed that the only mainstream FS
> in Linux that takes aio seriously is XFS. So let me start by thanking
> you guys for the great work! But unfortunately we also noticed that
> sometimes io_submit() is executed synchronously even on XFS.
>
> Looking at the code I see two cases when this is happening: unaligned
> IO and write past EOF. It looks like we hit both. For the first one we
> make a special effort to never issue unaligned IO and we use XFS_IOC_DIOINFO
> to figure out what the alignment should be, but it does not help. Looking at
> the code, though, xfs_file_dio_aio_write() checks alignment against
> m_blockmask, which is set to sbp->sb_blocksize - 1, so aio expects the buffer
> to be aligned to the filesystem block size, not the values that DIOINFO
> returns. Is it intentional? How should our code know what it should align
> buffers to?

        /* "unaligned" here means not aligned to a filesystem block */
        if ((pos & mp->m_blockmask) || ((pos + count) & mp->m_blockmask))
                unaligned_io = 1;

It should be aligned to the filesystem block size.

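For illustration only, the same test can be mirrored in userspace before calling io_submit(). This is a minimal sketch, not code from XFS or seastar: the helper names fsblock_aligned() and fs_block_size() are hypothetical, and it assumes that statvfs() reports the filesystem block size (sb_blocksize) in f_bsize. It checks only the file offset and length, as the kernel snippet above does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Userspace mirror of the kernel test quoted above: the I/O counts as
 * aligned only if both its start (pos) and its end (pos + count) fall on
 * a filesystem block boundary. blksize must be a power of two.
 */
static bool fsblock_aligned(uint64_t pos, uint64_t count, uint64_t blksize)
{
	assert(blksize != 0 && (blksize & (blksize - 1)) == 0);

	uint64_t mask = blksize - 1;	/* plays the role of m_blockmask */

	return (pos & mask) == 0 && ((pos + count) & mask) == 0;
}

/*
 * Hypothetical helper: query the block size of the filesystem holding
 * path. Assumption: f_bsize equals the filesystem block size.
 * Returns 0 on failure.
 */
static uint64_t fs_block_size(const char *path)
{
	struct statvfs sv;

	if (statvfs(path, &sv) != 0)
		return 0;
	return (uint64_t)sv.f_bsize;
}
```

A caller could fetch the block size once per file with fs_block_size() and then pad or split any request for which fsblock_aligned() returns false.
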
> Second one is harder. We do need to write past the end of a file, actually
> most of our writes are like that, so it would have been great for XFS to
> handle this case asynchronously.
You didn't say what kernel you're on, but these:

        9862f62 xfs: allow appending aio writes
        7b7a866 direct-io: Implement generic deferred AIO completions

hit kernel v3.15.

However, we had a bug report about this, and Brian has sent a fix
which has not yet been merged, see:

        [PATCH 1/2] xfs: always drain dio before extending aio write submission

on this list last week.

With those 3 patches, things should just work for you I think.

-Eric
> Currently we are trying to work around
> this by issuing truncate() (or fallocate()) on another thread and doing
> aio on the main thread only after truncate() is complete. It seems to be
> working, but is it guaranteed that a thread issuing aio will never sleep
> in this case (maybe the new file size value needs to hit the disk, and it is
> not guaranteed that this will happen after truncate() returns but before
> the aio call)?
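
As a rough sketch of the workaround described in the quoted paragraph, the size extension step could look like the following. The helper name extend_file() is hypothetical, and it uses ftruncate() where the original mentions truncate() or fallocate(). It only shows the "grow the file first" step that would run on a helper thread. It does not answer the question of whether a later io_submit() can still sleep.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: grow the file behind fd to at least new_size
 * bytes, never shrinking it. Intended to be called from a helper thread
 * so that the thread submitting aio does not extend the file itself.
 * Returns 0 on success or an errno value on failure.
 */
static int extend_file(int fd, off_t new_size)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return errno;
	if (st.st_size >= new_size)
		return 0;		/* already large enough */
	if (ftruncate(fd, new_size) != 0)
		return errno;
	return 0;
}
```

The submitting thread would then issue writes only inside the range that extend_file() has already covered, so that none of them lands past EOF.
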
>
> [1] http://www.scylladb.com/
> [2] http://www.seastar-project.org/
>
> Thanks,
>
> --
> Gleb.