xfs
[Top] [All Lists]

Re: [PATCH 3/6] xfs: add DAX file operations support

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 3/6] xfs: add DAX file operations support
From: Jan Kara <jack@xxxxxxx>
Date: Thu, 5 Mar 2015 12:05:04 +0100
Cc: Jan Kara <jack@xxxxxxx>, xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, willy@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150304220005.GV18360@dastard>
References: <1425425427-16283-1-git-send-email-david@xxxxxxxxxxxxx> <1425425427-16283-4-git-send-email-david@xxxxxxxxxxxxx> <20150304161848.GB2799@xxxxxxxxxxxxx> <20150304220005.GV18360@dastard>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu 05-03-15 09:00:05, Dave Chinner wrote:
> On Wed, Mar 04, 2015 at 05:18:48PM +0100, Jan Kara wrote:
> > On Wed 04-03-15 10:30:24, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > 
> > > Add the initial support for DAX file operations to XFS. This
> > > includes the necessary block allocation and mmap page fault hooks
> > > for DAX to function.
> > > 
> > > Note that the current block allocation code abuses the mapping
> > > buffer head to provide a completion callback for unwritten extent
> > > allocation when DAX is clearing blocks. The DAX interface needs to
> > > be changed to provide a callback similar to get_blocks for this
> > > callback.
> > > 
> > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> .....
> > > +static int
> > > +xfs_filemap_dax_page_mkwrite(
> > > + struct vm_area_struct   *vma,
> > > + struct vm_fault         *vmf)
> > > +{
> > > + struct xfs_inode        *ip = XFS_I(vma->vm_file->f_mapping->host);
> > > + int                     error;
> > > +
> > > + trace_xfs_filemap_page_mkwrite(ip);
> > > +
> > > + xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
> >   So I think the lock ordering of XFS_MMAPLOCK and freezing protection is
> > suspicious (and actually so is for normal write faults as I'm looking -
> > didn't realize that when I was first reading your MMAPLOCK patches).
> > Because you take XFS_MMAPLOCK outside of freeze protection however usually
> > we want freeze protection to be the outermost lock - in particular in
> > xfs_file_fallocate() you take XFS_MMAPLOCK inside freeze protection I
> > think.
> 
> OK, so why isn't lockdep triggering on that? lockdep is aware of
> inode locks and the freeze states, supposedly to pick up these exact
> issues...
> 
> Oh, probably because the sb freeze order is write, pagefault,
> transaction.
> 
> i.e. In the fallocate case, we do sb_start_write, MMAP_LOCK. If we are in
> a freeze case, we aren't going to freeze page faults until we've
> frozen all the writes have drained, so there isn't a lock order
> dependency there. Same for any other mnt_want_write/sb-start_write
> based modification. 
> 
> Hence the fallocate path and anything that runs through setattr will
> complete and release the mmap lock and then be prevented from taking
> it again by the time sb_start_pagefault() can block with the mmap
> lock held.  So there isn't actually a deadlock there because of the
> way freeze works, and that's why lockdep is staying silent.
  Yeah, you're right there isn't a deadlock possibility. After all the lock
ranking of your MMAP_LOCk is currently the same as of mmap_sem (and the
difficult lock ordering of that semaphore has been the reason why we have
special type of freeze protection for page faults).

> Still, I probably need to fix it so I'm not leaving a potential
> landmine around.
  I would find it easier to grasp. Yes.

> > So you'll need to do what ext4 needs to do - take freeze protection, take
> > fs specific locks, and then call do_dax_fault(). Matthew has a patch to
> > actually export do_dax_fault (as __dax_fault()) for filesystems.
> 
> pointer to it? if none, I'll just write my own....
  http://permalink.gmane.org/gmane.comp.file-systems.ext4/47866

                                                                Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR

<Prev in Thread] Current Thread [Next in Thread>