Hello,
On Fri, Feb 13, 2015 at 10:20:44AM +0800, yy wrote:
> Dave,
>
> Thank you very much for your explanation.
>
> I hit this issue when running MySQL on XFS. Direct IO is very important for
> MySQL on XFS, but I can't find any document explaining this
> problem. It may cause great confusion for other MySQL users
> as well, so maybe this problem should be explained in the XFS documentation.
I don't think this is something that should be explained in the XFS documentation,
but rather in filesystem documentation in general. Since XFS follows the POSIX
requirements, its behavior is not the "out of standards" one; it is the other
way around. So I agree that this should be documented, but not exactly in XFS itself.
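
For anyone who wants to see the locking model from Dave's message (quoted
below) in isolation, here is a minimal sketch in Python. It is a toy model, not
XFS code: SharedExclusiveLock, ToyFile and count_torn_reads are hypothetical
names, and the 4-byte "page" is made up for illustration. Writes take the lock
exclusively and reads take it shared, so a reader never observes a half-copied
write, at the cost of reads waiting behind writes.

```python
import threading


class SharedExclusiveLock:
    """Minimal reader-writer lock: many shared holders or one exclusive holder."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class ToyFile:
    """Hypothetical in-memory file whose data is copied one 'page' at a time."""

    PAGE = 4  # made-up page size, in bytes

    def __init__(self, size):
        self.data = bytearray(size)
        self.iolock = SharedExclusiveLock()

    def write(self, payload):
        # Exclusive: no reader may run until every page has been copied.
        self.iolock.acquire_exclusive()
        try:
            for off in range(0, len(payload), self.PAGE):
                self.data[off:off + self.PAGE] = payload[off:off + self.PAGE]
        finally:
            self.iolock.release_exclusive()

    def read(self):
        # Shared: readers run concurrently with each other, never with a writer.
        self.iolock.acquire_shared()
        try:
            return bytes(self.data)
        finally:
            self.iolock.release_shared()


def count_torn_reads(rounds=200, size=64):
    """Run one writer and three readers; count reads mixing old and new data."""
    f = ToyFile(size)
    f.write(b"A" * size)
    torn = []

    def writer():
        for i in range(rounds):
            f.write((b"A" if i % 2 else b"B") * size)

    def reader():
        for _ in range(rounds):
            if len(set(f.read())) != 1:  # a torn read contains both A and B
                torn.append(1)

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader) for _ in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(torn)
```

In this model count_torn_reads() finds no torn reads. Dropping the lock calls
from write() and read() would correspond to serialising only at page
granularity, where a read may return a mix of old and new pages.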
> Best regards,
> yy
> Original Message
> From: Dave Chinner<david@xxxxxxxxxxxxx>
> To: yy<yy@xxxxxxxxxxx>
> Cc: xfs<xfs@xxxxxxxxxxx>; Eric Sandeen<sandeen@xxxxxxxxxxx>;
> bfoster<bfoster@xxxxxxxxxx>
> Sent: Feb 13, 2015 (Fri) 05:04
> Subject: Re: XFS buffer IO performance is very poor
> On Thu, Feb 12, 2015 at 02:59:52PM +0800, yy wrote:
> > In function xfs_file_aio_read, XFS requests the XFS_IOLOCK_SHARED lock
> > for both direct IO and buffered IO:
>
> > so write will prevent read in XFS.
> >
> > However, function generic_file_aio_read, used by ext3, does not
> > lock inode->i_mutex, so write will not prevent read in ext3.
> >
> > I think this may be the reason for the poor performance of XFS. I do
> > not know if this is a bug or a design flaw of XFS.
>
> This is a bug and design flaw in ext3, and most other Linux
> filesystems. POSIX states that write() must execute atomically and
> so no concurrent operation that reads or modifies data should
> see a partial write. The Linux page cache doesn't enforce this - a
> read to the same range as a write can return partially written data
> on page granularity, as read/write only serialise on page locks in
> the page cache.
>
> XFS is the only Linux filesystem that actually follows POSIX
> requirements here - the shared/exclusive locking guarantees that a
> buffered write completes wholly before a read is allowed to access the
> data. There is a downside - you can't run concurrent buffered reads
> and writes to the same file - if you need to do that then that's
> what direct IO is for, and coherency between overlapping reads and
> writes is then the application's problem, not the filesystem...
>
> Maybe at some point in the future we might address this with ranged
> IO locks, but there really aren't many multithreaded programs that
> hit this issue...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
--
Carlos