[RFC PATCH 0/6] xfs: truncate vs page fault IO exclusion
Jan Kara
jack at suse.cz
Thu Jan 8 05:34:54 CST 2015
Hi,
On Thu 08-01-15 09:25:37, Dave Chinner wrote:
> This patch set is an attempt to prevent the XFS truncate and
> hole-punch code from racing with page faults that enter the IO
> path. This is traditionally deadlock prone due to the inversion of
> filesystem IO path locks and the mmap_sem.
>
> To avoid this issue, I have introduced a new "i_mmaplock" rwsem into
> the XFS code similar to the IO lock, but this lock is only taken in
> the mmap fault paths on entry into the filesystem (i.e. ->fault and
> ->page_mkwrite).
>
> The concept is that if we invalidate the page cache over a range
> after taking both the existing i_iolock and the new i_mmaplock, we
> will have prevented any vector for repopulation of the page cache
> over the invalidated range until one of the io and mmap locks has
> been dropped. i.e. we can guarantee that both the syscall IO path
> and page faults won't race with whatever operation the filesystem is
> performing...
>
> The introduction of a new lock is necessary to avoid deadlocks due
> to mmap_sem entanglement. It has a defined lock order during page
> faults of:
>
> mmap_sem
> -> i_mmaplock (read)
>    -> page lock
>       -> i_lock (get blocks)
>
> This lock is then taken by any extent manipulation code in XFS in
> addition to the IO lock which has the lock ordering of
>
> i_iolock (write)
> -> i_mmaplock (write)
>    -> page lock (data writeback, page invalidation)
>       -> i_lock (data writeback)
>    -> i_lock (modification transaction)
>
> Hence we have consistent lock ordering (which has been validated so
> far by testing with lockdep enabled) for page fault IO vs
> truncate, hole punch, extent shifts, etc.
>
> This patchset passes xfstests and various benchmarks and stress
> workloads, so the real question is now:
>
> What have I missed?
>
> Comments, thoughts, flames?
I had a look at the patches and as far as I can tell this should work
fine (at least from the VFS / MM POV).
Honza
--
Jan Kara <jack at suse.cz>
SUSE Labs, CR