On Fri, Jan 20, 2012 at 11:55:08AM +0800, Zheng Liu wrote:
> Hi all,
> Recently we encountered an issue in ext4. The issue is that, when we do a direct
> IO, ext4 will acquire inode->i_mutex in generic_file_aio_write(). This degrades
> performance. Here is the detailed conversation.
> I know that xfs uses i_iolock, which is a rw_semaphore, to allow parallel
> operations in direct IO. But I have a question. As we do some read/write
> operations in direct IO, it seems that there is a window to cause data
Yes, there is. That's a feature, not a bug.
> For example, one thread does a write operation to overwrite some
> data at an offset. Meanwhile another thread does a read operation at the same
> offset. We assume that the write is earlier than the read.
Your assumption is wrong.
> Hence, we should read new
> data. Although it is difficult to trigger, it is possible that the read is issued to
> the disk first and we read old data. I don't know whether this exists in
> xfs or not. Thank you.
Fundamentally, the result of concurrent read and write direct IO
operations to the same offset is undefined because the filesystem
has no control of IO reordering in lower layers of the storage
stack. IOWs, we give no guarantees for IO ordering or coherency of
concurrent direct IO to the same offset.
If your application requires this sort of coherency, then you either
need to use buffered IO or provide these coherency guarantees
yourself because direct IO doesn't provide them. File range locking
is an example of how your application can coordinate its IO to
avoid this problem.
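As a rough illustration of the range locking idea, here is a minimal
sketch using POSIX advisory byte-range locks through Python's
fcntl.lockf(). The helper names (range_lock, locked_pwrite,
locked_pread) are made up for this example. It uses ordinary
pread/pwrite for brevity and assumes the same locking calls would wrap
O_DIRECT IO, which additionally needs aligned buffers. Note that POSIX
record locks are owned by the process, so this coordinates separate
processes; threads inside one process would need their own lock (for
example a mutex) on top.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def range_lock(fd, offset, length, exclusive):
    """Hold an advisory lock on bytes [offset, offset + length) of fd."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    # Blocks until the lock on this byte range is granted.
    fcntl.lockf(fd, mode, length, offset, os.SEEK_SET)
    try:
        yield
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, length, offset, os.SEEK_SET)


def locked_pwrite(fd, data, offset):
    """Write data at offset while holding an exclusive range lock."""
    with range_lock(fd, offset, len(data), exclusive=True):
        return os.pwrite(fd, data, offset)


def locked_pread(fd, length, offset):
    """Read length bytes at offset while holding a shared range lock."""
    with range_lock(fd, offset, length, exclusive=False):
        return os.pread(fd, length, offset)
```

The locks are advisory: they only order the IO if every reader and
writer of the range goes through the same helpers.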