Re: [PATCH] dax: allow DAX to look up an inode's block device

To: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Subject: Re: [PATCH] dax: allow DAX to look up an inode's block device
From: Dan Williams <dan.j.williams@xxxxxxxxx>
Date: Tue, 2 Feb 2016 15:38:17 -0800
Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "J. Bruce Fields" <bfields@xxxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, Jan Kara <jack@xxxxxxxx>, Jeff Layton <jlayton@xxxxxxxxxxxxxxx>, Matthew Wilcox <willy@xxxxxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, linux-nvdimm <linux-nvdimm@xxxxxxxxxxx>, XFS Developers <xfs@xxxxxxxxxxx>, linux-btrfs@xxxxxxxxxxxxxxx
In-reply-to: <20160202231931.GR17997@xxxxxxxxxxxxxxxxxx>
References: <1454454702-11889-1-git-send-email-ross.zwisler@xxxxxxxxxxxxxxx> <20160202231931.GR17997@xxxxxxxxxxxxxxxxxx>
[ adding btrfs ]

On Tue, Feb 2, 2016 at 3:19 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Feb 02, 2016 at 04:11:42PM -0700, Ross Zwisler wrote:
>
>> However, for raw block devices and for XFS with a real-time device, the
>> value in inode->i_sb->s_bdev is not correct.  With the code as it is
>> currently written, an fsync or msync to a DAX enabled raw block device will
>> cause a NULL pointer dereference kernel BUG.  For this to work correctly we
>> need to ask the block device or filesystem what struct block_device is
>> appropriate for our inode.
>>
>> To that end, add a get_bdev(struct inode *) entry point to struct
>> super_operations.  If this function pointer is non-NULL, this notifies DAX
>> that it needs to use it to look up the correct block_device.  If
>> i_sb->get_bdev() is NULL DAX will default to inode->i_sb->s_bdev.
>
> Umm...  It assumes that bdev will stay pinned for as long as inode is
> referenced, presumably?  If so, that needs to be documented (and verified
> for existing fs instances).  In principle, multi-disk fs might want to
> support things like "silently move the inodes backed by that disk to other
> ones"...

I assume btrfs is the only fs we have that might reassign the bdev for
a given inode on the fly?  Hopefully we don't need anything stronger
than rcu_read_lock() to pin the result as valid.

At least in this case the initial user is dax-fsync where the
->get_bdev() answer should be static for the life of the inode, and
btrfs does not currently interface with dax.  But yes, we need to get
the expected semantics clear.
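
For reference, the lookup rule quoted above (use the hook if the filesystem provides one, otherwise fall back to inode->i_sb->s_bdev) can be sketched roughly as below. This is a userspace mock under stated assumptions, not the patch itself: the structs are stripped-down stand-ins for the kernel ones, and dax_get_bdev(), example_get_bdev() and the i_private_bdev field are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Stripped-down stand-ins for the kernel structures; not the real layouts. */
struct block_device { const char *name; };
struct inode;

struct super_operations {
	/* Optional hook described in the patch; NULL means "no override". */
	struct block_device *(*get_bdev)(struct inode *inode);
};

struct super_block {
	struct block_device *s_bdev;            /* default backing device */
	const struct super_operations *s_op;
};

struct inode {
	struct super_block *i_sb;
	/* Hypothetical per-inode device, e.g. a realtime device. */
	struct block_device *i_private_bdev;
};

/*
 * Hypothetical helper: ask the filesystem for the inode's block device
 * when it provides ->get_bdev(), otherwise use the superblock default.
 */
static struct block_device *dax_get_bdev(struct inode *inode)
{
	const struct super_operations *ops = inode->i_sb->s_op;

	if (ops && ops->get_bdev)
		return ops->get_bdev(inode);
	return inode->i_sb->s_bdev;
}

/* Example hook (assumption): return the device recorded on the inode. */
static struct block_device *example_get_bdev(struct inode *inode)
{
	return inode->i_private_bdev;
}
```

Note that the sketch only covers which device gets returned. It says nothing about how long the returned pointer stays valid, which is the open question about pinning raised above.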
