This series aims to serialise unaligned direct IOs to an inode to
avoid corruption caused by sub-block zeroing races. The previous
approaches at the direct IO layer fail because for !DIO_LOCKING
filesystems like XFS, there is no way we can track and serialise all
the direct IOs to a given inode in a race free manner. While we can
track them, we cannot close the races between mapping blocks and
tracked IO completion occurring before subsequent tracking lookups
without adding some kind of locking to the DIO layer. Hence for
!DIO_LOCKING users, unaligned direct IO needs to be serialised at a
higher layer, that is, within the filesystem itself.

Because the xfs_file_aio_write() path is so twisted and difficult to
follow, it is hard to verify that new locking cases added to the
code are correct in all cases. Hence the series starts by cleaning
up the code and splitting apart the direct IO and buffered IO paths
before adding the unaligned direct IO detection and serialisation.

The first patch fixes a sync write error handling bug - we should
consider pushing that to .38. The next patches factor code that is
common to write and splice into helpers. The direct and buffered IO
paths are then separated out and the common write checks and bounds
limiting are factored out into a helper.

Finally, the serialisation of unaligned direct IOs is added by a
big-hammer approach. That is, we take the i_mutex and
XFS_IOLOCK_EXCL and hold them across the unaligned IO submission.
This means that unaligned direct IO submission is serialised, and
non-AIO DIO is serialised completely.
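
As a rough userspace illustration of the detection step, an IO can be
treated as unaligned when its start or end offset is not a multiple of
the filesystem block size. The names below (dio_is_unaligned,
dio_pick_lock, enum dio_lock_mode) are hypothetical and are not taken
from the patches; the shared mode for aligned IO is an assumption made
only to give the sketch a second case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative lock modes only; not the XFS lock flag definitions. */
enum dio_lock_mode {
	DIO_LOCK_SHARED,	/* assumed default for aligned direct IO */
	DIO_LOCK_EXCL,		/* stands in for i_mutex + XFS_IOLOCK_EXCL */
};

/*
 * An IO is unaligned if either its start or its end falls inside a
 * filesystem block, which is when sub-block zeroing can come into play.
 */
static bool dio_is_unaligned(uint64_t pos, size_t count, uint32_t block_size)
{
	uint64_t mask;

	/* The mask test below assumes a power-of-two block size. */
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	mask = (uint64_t)block_size - 1;

	return (pos & mask) != 0 || ((pos + count) & mask) != 0;
}

/* Big-hammer policy: any unaligned IO takes the exclusive path. */
static enum dio_lock_mode dio_pick_lock(uint64_t pos, size_t count,
					uint32_t block_size)
{
	return dio_is_unaligned(pos, count, block_size) ?
		DIO_LOCK_EXCL : DIO_LOCK_SHARED;
}
```

With a 4096 byte block size, a 512 byte write at offset 0 ends inside
the first block and so would take the exclusive path in this sketch,
while an 8192 byte write at offset 8192 would not.
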
For unaligned AIO DIO, this would only serialise the submission of
the DIO, leaving the sub-block zeroing races open for unaligned
writes into unwritten extents. To avoid this problem, we use
xfs_ioend_wait() to ensure all AIO writes have completed before we
submit the unaligned write. We do this wait while holding the i_mutex,
which serialises us against other unaligned AIO; there is no need to
serialise against aligned DIO.
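
The "wait for all outstanding IO before submitting" idea can be
modelled in userspace with a counter and a condition variable. This is
only an analogue of what xfs_ioend_wait() is described as doing above,
not its implementation; struct io_tracker and the io_*() helpers are
hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a per-inode count of in-flight IOs. */
struct io_tracker {
	pthread_mutex_t	lock;
	pthread_cond_t	drained;
	unsigned int	inflight;
};

#define IO_TRACKER_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Called at submission: account for one more IO in flight. */
static void io_start(struct io_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	t->inflight++;
	pthread_mutex_unlock(&t->lock);
}

/* Called at completion: wake any waiters once the count hits zero. */
static void io_end(struct io_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	assert(t->inflight > 0);
	if (--t->inflight == 0)
		pthread_cond_broadcast(&t->drained);
	pthread_mutex_unlock(&t->lock);
}

/* Block until every previously started IO has completed. */
static void io_wait(struct io_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->inflight != 0)
		pthread_cond_wait(&t->drained, &t->lock);
	pthread_mutex_unlock(&t->lock);
}

/* Snapshot of the current in-flight count. */
static unsigned int io_inflight(struct io_tracker *t)
{
	unsigned int n;

	pthread_mutex_lock(&t->lock);
	n = t->inflight;
	pthread_mutex_unlock(&t->lock);
	return n;
}
```

In this model an unaligned submitter would take its exclusive lock,
call io_wait(), and only then issue its own IO, so no earlier
completion can race with its sub-block zeroing.
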
- fix initial sync write error return fixup
- add new patch to abstract locking from read/write path and remove
  the need for the need_i_mutex variable.