Re: [PATCH 3/3] xfs: introduce per-inode DAX enablement

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 3/3] xfs: introduce per-inode DAX enablement
From: Dan Williams <dan.j.williams@xxxxxxxxx>
Date: Thu, 21 Jan 2016 14:53:06 -0800
Cc: XFS Developers <xfs@xxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, ext4@xxxxxxxxxxxxxxx, Matthew Wilcox <willy@xxxxxxxxxxxxxxx>
In-reply-to: <20160121215820.GA6033@dastard>
References: <1451886892-15548-1-git-send-email-david@xxxxxxxxxxxxx> <1451886892-15548-4-git-send-email-david@xxxxxxxxxxxxx> <CAA9_cmdAYzf3DpjPRZWikNgmJT_sdDACXa_znz6j0oqmmRdLOA@xxxxxxxxxxxxxx> <20160121215820.GA6033@dastard>
On Thu, Jan 21, 2016 at 1:58 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, Jan 21, 2016 at 08:37:11AM -0800, Dan Williams wrote:
>> On Sun, Jan 3, 2016 at 9:54 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > From: Dave Chinner <dchinner@xxxxxxxxxx>
>> >
>> > Rather than just being able to turn DAX on and off via a mount
>> > option, some applications may only want to enable DAX for certain
>> > performance critical files in a filesystem.
>> >
>> > This patch introduces a new inode flag to enable DAX in the v3 inode
>> > di_flags2 field. It adds support for setting and clearing flags in
>> > the di_flags2 field via the XFS_IOC_FSSETXATTR ioctl, and sets the
>> > S_DAX inode flag appropriately when it is seen.
>> >
>> > When this flag is set on a directory, it acts as an "inherit flag".
>> > That is, inodes created in the directory will automatically inherit
>> > the on-disk inode DAX flag, enabling administrators to set up
>> > directory hierarchies that automatically use DAX. Setting this flag
>> > on an empty root directory will make the entire filesystem use DAX
>> > by default.
>>
>> When switching from page-cache to DAX, don't we need to flush existing
>> page cache mappings and remap directly?  Or, is the thought that
>> userspace needs to comprehend the presence of mixed mappings after
>> changing S_DAX?
>
> The change should be transparent to userspace. In general, I don't
> expect users to change the behaviour of files that are in active use
> (why would you do that?).

If someone accidentally tries to change S_DAX dynamically while
mappings are established, I think the kernel should just return
EBUSY.  I was not proposing we support it as a first-class
operation.
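
To make the EBUSY suggestion concrete, here is a rough userspace model
of the check. It is only a sketch: struct toy_inode, nr_maps and
toy_set_dax() are hypothetical names made up for illustration, not
kernel structures or anything from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of an inode; not a kernel structure. */
struct toy_inode {
	bool dax;             /* stands in for S_DAX */
	unsigned int nr_maps; /* number of established mappings */
};

/*
 * Change the DAX mode only when nothing is mapped.
 * Returns 0 on success, or -EBUSY if mappings exist and the
 * requested mode differs from the current one.
 */
static int toy_set_dax(struct toy_inode *ip, bool enable)
{
	assert(ip != NULL);

	if (ip->dax == enable)
		return 0;       /* no change requested */
	if (ip->nr_maps > 0)
		return -EBUSY;  /* refuse to create mixed mappings */

	ip->dax = enable;
	return 0;
}
```

A caller that gets -EBUSY back would have to tear down its mappings
and retry, so the mode can never change underneath a live mapping.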

> This patch is really just introducing the
> flag, the userspace API and making it propagate correctly via the
> on-disk format. We'll fix up whatever problems with switching it
> on/off dynamically as we go, like we do with most experimental
> features once the on-disk behaviour is sorted out.

Ok.
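
For reference, this is roughly how I would expect userspace to drive
the flag. Treat it as a sketch under assumptions: it uses the generic
FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR names and FS_XFLAG_DAX from
<linux/fs.h> rather than the XFS_IOC_* spelling in the patch
description, and set_dax_flag() is a helper name invented here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* struct fsxattr, FS_IOC_FS{GET,SET}XATTR */

#ifndef FS_XFLAG_DAX
#define FS_XFLAG_DAX 0x00008000	/* fallback for older uapi headers */
#endif

/*
 * Set or clear the per-inode DAX flag on a file or directory.
 * Returns 0 on success or a negative errno value.
 */
static int set_dax_flag(const char *path, bool enable)
{
	struct fsxattr fsx;
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* Read the current flags so unrelated bits are preserved. */
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
		ret = -errno;
		goto out;
	}

	if (enable)
		fsx.fsx_xflags |= FS_XFLAG_DAX;
	else
		fsx.fsx_xflags &= ~FS_XFLAG_DAX;

	if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0)
		ret = -errno;
out:
	close(fd);
	return ret;
}
```

Calling set_dax_flag("/mnt/pmem/dir", true) on a directory (the path
is just an example) would exercise the inherit behaviour described in
the patch; on a filesystem without support the ioctl simply fails and
the helper passes the error back.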

> i.e. I've already got a couple of fixes we need to add to this - the
> DAX flag is only valid on CRC enabled filesystems,

I assume for torn-write protection?  The CRC limitation makes sense,
but we theoretically could get the same effect by using a separate
logdev that does not tear writes, right?

> so we need to
> check that in the ioctl (general problem with using di_flags2 field,
> not DAX flag specific issue). Adding code to sync and unmap when
> changing the flag is probably also necessary in the ioctl - I don't
> have code to do that yet, but I have been thinking about it...

Matthew and I have also talked about a modification of mincore(2) to
interrogate the effective mapping mode.  It seems we'll need that or
something like it given the growing list of caveats with setting up a
DAX mapping.
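
For context, this is what mincore(2) can tell us today: one byte per
page, with only the low bit (resident or not) defined. The sketch
below just counts resident pages; count_resident_pages() is a made-up
helper name, and nothing here reports whether a mapping is DAX, which
is the part that would need the modification.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count how many pages of [addr, addr + len) are resident.
 * addr must be page aligned. Returns the count, or -1 on error.
 */
static long count_resident_pages(void *addr, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *vec;
	size_t npages, i;
	long resident = 0;

	assert(addr != NULL && len > 0);
	if (page <= 0)
		return -1;

	npages = (len + (size_t)page - 1) / (size_t)page;
	vec = malloc(npages);
	if (!vec)
		return -1;

	if (mincore(addr, len, vec) < 0) {
		free(vec);
		return -1;
	}

	/* Only bit 0 of each byte is defined: page is resident. */
	for (i = 0; i < npages; i++)
		resident += vec[i] & 1;

	free(vec);
	return resident;
}
```

The remaining bits in each vector byte are currently unspecified,
which is why extending this interface looked like a plausible place
to report the mapping mode.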