On Tue, Jul 15, 2008 at 12:12:52PM +1000, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:
>>> This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
>>> iolock and then flushes the file and because it has the iolock it doesn't
>>> expect any new delayed allocations to occur. A mmap write can allocate
>>> delayed allocations without acquiring the iolock so is able to get in
>>> after the flush but before the ASSERT.
>> Christoph and I were contemplating this problem with ->page_mkwrite
>> reecently. The problem is that we can't, right now, return an
>> EAGAIN-like error to ->page_mkwrite() and have it retry the
>> page fault. Other parts of the page faulting code can do this,
>> so it seems like a solvable problem.
>> The basic concept is that if we can return a EAGAIN result we can
>> try-lock the inode and hold the locks necessary to avoid this race
>> or prevent the page fault from dirtying the page until the
>> filesystem is unfrozen.
> Why do we need to try-lock the inode? Will we have an ABBA deadlock
> if we block on the iolock in ->page_mkwrite()?
Yes. With the mmap_sem. Look at the rules in mm/filemap.c
and replace i_mutex with iolock....