On Thu, Jan 21, 2016 at 1:58 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, Jan 21, 2016 at 08:37:11AM -0800, Dan Williams wrote:
>> On Sun, Jan 3, 2016 at 9:54 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>> > From: Dave Chinner <dchinner@xxxxxxxxxx>
>> >
>> > Rather than just being able to turn DAX on and off via a mount
>> > option, some applications may only want to enable DAX for certain
>> > performance critical files in a filesystem.
>> >
>> > This patch introduces a new inode flag to enable DAX in the v3 inode
>> > di_flags2 field. It adds support for setting and clearing flags in
>> > the di_flags2 field via the XFS_IOC_FSSETXATTR ioctl, and sets the
>> > S_DAX inode flag appropriately when it is seen.
>> >
>> > When this flag is set on a directory, it acts as an "inherit flag".
>> > That is, inodes created in the directory will automatically inherit
>> > the on-disk inode DAX flag, enabling administrators to set up
>> > directory hierarchies that automatically use DAX. Setting this flag
>> > on an empty root directory will make the entire filesystem use DAX
>> > by default.
>>
>> When switching from page-cache to DAX, don't we need to flush existing
>> page cache mappings and remap directly? Or, is the thought that
>> userspace needs to comprehend the presence of mixed mappings after
>> changing S_DAX?
>
> The change should be transparent to userspace. In general, I don't
> expect users to change the behaviour of files that are in active use
> (why would you do that?).
If someone accidentally tries to change S_DAX dynamically while
existing mappings are established, I think the kernel should just
return EBUSY. I was not proposing we support it as a first-class
operation.
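For illustration, a minimal userspace sketch of toggling the flag and
surfacing an EBUSY-style failure to the caller. It assumes the generic
FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR ioctls and the FS_XFLAG_DAX bit
exported by <linux/fs.h> in later kernels (this thread uses the XFS_
names), and that the kernel would return EBUSY as suggested above, which
nothing in this patch implements. dax_xflags() and set_dax_flag() are
hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Assumption: fallback value for headers that predate the generic flag. */
#ifndef FS_XFLAG_DAX
#define FS_XFLAG_DAX 0x00008000
#endif

/* Hypothetical helper: return xflags with the DAX bit set or cleared. */
uint32_t dax_xflags(uint32_t xflags, int enable)
{
    return enable ? (xflags | FS_XFLAG_DAX)
                  : (xflags & ~(uint32_t)FS_XFLAG_DAX);
}

/*
 * Hypothetical helper: read-modify-write the inode's xflags.
 * Returns 0 on success or a negative errno value, so a caller can
 * tell -EBUSY (file has active mappings, if the kernel chose to
 * report that) apart from other failures.
 */
int set_dax_flag(const char *path, int enable)
{
    struct fsxattr fsx;
    int ret = 0;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
        ret = -errno;
    } else {
        fsx.fsx_xflags = dax_xflags(fsx.fsx_xflags, enable);
        if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0)
            ret = -errno;
    }

    close(fd);
    return ret;
}
```

A caller would check for set_dax_flag(path, 1) == -EBUSY and retry once
the file is no longer mapped, rather than expecting a live switch.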
> This patch is really just introducing the
> flag, the userspace API and making it propagate correctly via the
> on-disk format. We'll fix up whatever problems arise with switching
> it on/off dynamically as we go, like we do with most experimental
> features once the on-disk behaviour is sorted out.
Ok.
> i.e. I've already got a couple of fixes we need to add to this - the
> DAX flag is only valid on CRC enabled filesystems,
I assume for torn-write protection? The CRC limitation makes sense,
but we theoretically could get the same effect by using a separate
logdev that does not tear writes, right?
> so we need to
> check that in the ioctl (a general problem with using the di_flags2
> field, not an issue specific to the DAX flag). Adding code to sync and
> unmap when changing the flag is probably also necessary in the ioctl - I
> don't have code to do that yet, but I have been thinking about it...
Matthew and I have also talked about a modification of mincore(2) to
interrogate the effective mapping mode. It seems we'll need that or
something like it given the growing list of caveats with setting up a
DAX mapping.
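
For context, a minimal sketch of what mincore(2) reports today: one
residency bit per page of a mapping, with nothing about whether the
mapping is DAX or page-cache backed. That gap is what the modification
mentioned above would have to fill. count_resident_pages() is a
hypothetical helper name, and error handling is kept deliberately short.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first `len` bytes of `path` and count
 * how many of those pages mincore(2) reports as resident.
 * Returns the count, or -1 on any error.
 */
long count_resident_pages(const char *path, size_t len)
{
    size_t page, npages, i;
    unsigned char *vec;
    long resident = -1;
    void *addr;
    int fd;

    assert(path != NULL && len > 0);

    page = (size_t)sysconf(_SC_PAGESIZE);
    npages = (len + page - 1) / page;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the fd */
    if (addr == MAP_FAILED)
        return -1;

    vec = malloc(npages);           /* one status byte per page */
    if (vec != NULL && mincore(addr, len, vec) == 0) {
        resident = 0;
        for (i = 0; i < npages; i++)
            resident += vec[i] & 1; /* bit 0: page is resident */
    }

    free(vec);
    munmap(addr, len);
    return resident;
}
```

Only the low bit of each vec[] entry is defined today, so an interface
for reporting the effective mapping mode would need new semantics (or a
new call); the sketch does not guess at what those would be.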