Re: [PATCH] dax: allow DAX to look up an inode's block device

To: Dan Williams <dan.j.williams@xxxxxxxxx>
Subject: Re: [PATCH] dax: allow DAX to look up an inode's block device
From: Matthew Wilcox <willy@xxxxxxxxxxxxxxx>
Date: Tue, 2 Feb 2016 18:52:43 -0500
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>, Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "J. Bruce Fields" <bfields@xxxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, Jan Kara <jack@xxxxxxxx>, Jeff Layton <jlayton@xxxxxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, linux-nvdimm <linux-nvdimm@xxxxxxxxxxx>, XFS Developers <xfs@xxxxxxxxxxx>, linux-btrfs@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CAPcyv4gqq0guubnddRPmDVA0N1=vfh3w6jPf4GsuYs0D29nS4w@xxxxxxxxxxxxxx>
References: <1454454702-11889-1-git-send-email-ross.zwisler@xxxxxxxxxxxxxxx> <20160202231931.GR17997@xxxxxxxxxxxxxxxxxx> <CAPcyv4gtcL_WwZpmiAUhO5h4q3YyXinkpz9SwKm5SBA9-1kE9Q@xxxxxxxxxxxxxx> <CAPcyv4gqq0guubnddRPmDVA0N1=vfh3w6jPf4GsuYs0D29nS4w@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, Feb 02, 2016 at 03:39:15PM -0800, Dan Williams wrote:
> On Tue, Feb 2, 2016 at 3:19 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, Feb 02, 2016 at 04:11:42PM -0700, Ross Zwisler wrote:
> >> However, for raw block devices and for XFS with a real-time device, the
> >> value in inode->i_sb->s_bdev is not correct.  With the code as it is
> >> currently written, an fsync or msync to a DAX enabled raw block device will
> >> cause a NULL pointer dereference kernel BUG.  For this to work correctly we
> >> need to ask the block device or filesystem what struct block_device is
> >> appropriate for our inode.
> >>
> >> To that end, add a get_bdev(struct inode *) entry point to struct
> >> super_operations.  If this function pointer is non-NULL, this notifies DAX
> >> that it needs to use it to look up the correct block_device.  If
> >> i_sb->get_bdev() is NULL DAX will default to inode->i_sb->s_bdev.
> >
> > Umm...  It assumes that bdev will stay pinned for as long as inode is
> > referenced, presumably?  If so, that needs to be documented (and verified
> > for existing fs instances).  In principle, multi-disk fs might want to
> > support things like "silently move the inodes backed by that disk to other
> > ones"...
> 
> I assume btrfs is the only fs we have that might reassign the bdev for
> a given inode on the fly?  Hopefully we don't need anything stronger
> than rcu_read_lock() to pin the result as valid.
> 
> At least in this case the initial user is dax-fsync where the
> ->get_bdev() answer should be static for the life of the inode, and
> btrfs does not currently interface with dax.  But yes, we need to get
> the expected semantics clear.

Let's be clear though.  ->get_bdev is a temporary hack.  The need for
it goes away when DAX doesn't rely on being on a block_device any more.
I don't expect it to live longer than six months.
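
For reference, the lookup rule in the quoted patch description (use the hook if it is set, otherwise fall back to inode->i_sb->s_bdev) can be sketched in userspace C roughly as below. This is only an illustration under stated assumptions, not the patch itself: the structs are cut-down stand-ins for the kernel types, and dax_inode_bdev(), example_get_bdev(), the on_rt_device flag, and the two sample devices are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct block_device { const char *name; };
struct inode;

struct super_operations {
	/* Optional hook described in the patch; NULL means "not provided". */
	struct block_device *(*get_bdev)(struct inode *inode);
};

struct super_block {
	const struct super_operations *s_op;
	struct block_device *s_bdev;	/* default device for the filesystem */
};

struct inode {
	struct super_block *i_sb;
	int on_rt_device;		/* hypothetical flag, for illustration only */
};

/* Hypothetical sample devices. */
static struct block_device data_dev = { "data" };
static struct block_device rt_dev = { "realtime" };

/* Hypothetical hook: pick the real-time device for flagged inodes. */
static struct block_device *example_get_bdev(struct inode *inode)
{
	return inode->on_rt_device ? &rt_dev : inode->i_sb->s_bdev;
}

/*
 * Hypothetical helper: ask the filesystem for the inode's block device
 * if it provides the hook, otherwise default to i_sb->s_bdev.
 */
static struct block_device *dax_inode_bdev(struct inode *inode)
{
	const struct super_operations *ops;

	assert(inode && inode->i_sb);
	ops = inode->i_sb->s_op;
	if (ops && ops->get_bdev)
		return ops->get_bdev(inode);
	return inode->i_sb->s_bdev;
}
```

The sketch says nothing about how long the returned pointer stays valid, which is the pinning question raised in the quoted text above.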
