xfs
[Top] [All Lists]

Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io

To: Boaz Harrosh <boaz@xxxxxxxxxxxxx>
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Thu, 5 May 2016 07:24:33 -0700
Cc: Vishal Verma <vishal.l.verma@xxxxxxxxx>, linux-nvdimm@xxxxxxxxxxx, Jens Axboe <axboe@xxxxxx>, Jan Kara <jack@xxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Matthew Wilcox <matthew@xxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx, linux-mm@xxxxxxxxx, Al Viro <viro@xxxxxxxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5727753F.6090104@xxxxxxxxxxxxx>
References: <1461878218-3844-1-git-send-email-vishal.l.verma@xxxxxxxxx> <1461878218-3844-6-git-send-email-vishal.l.verma@xxxxxxxxx> <5727753F.6090104@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.24 (2015-08-30)
On Mon, May 02, 2016 at 06:41:51PM +0300, Boaz Harrosh wrote:
> > All IO in a dax filesystem used to go through dax_do_io, which cannot
> > handle media errors, and thus cannot provide a recovery path that can
> > send a write through the driver to clear errors.
> > 
> > Add a new iocb flag for DAX, and set it only for DAX mounts. In the IO
> > path for DAX filesystems, use the same direct_IO path for both DAX and
> > direct_io iocbs, but use the flags to identify when we are in O_DIRECT
> > mode vs non O_DIRECT with DAX, and for O_DIRECT, use the conventional
> > direct_IO path instead of DAX.
> > 
> 
> Really? What are your thinking here?
> 
> What about all the current users of O_DIRECT, you have just made them
> 4 times slower and "less concurrent*" then "buffred io" users. Since
> direct_IO path will queue an IO request and all.
> (And if it is not so slow then why do we need dax_do_io at all? [Rhetorical])
> 
> I hate it that you overload the semantics of a known and expected
> O_DIRECT flag, for special pmem quirks. This is an incompatible
> and unrelated overload of the semantics of O_DIRECT.

Agreed - makig O_DIRECT less direct than not having it is plain stupid,
and I somehow missed this initially.

This whole DAX story turns into a major nightmare, and I fear all our
hodge podge tweaks to the semantics aren't helping it.

It seems like we simply need an explicit O_DAX for the read/write
bypass if can't sort out the semantics (error, writer synchronization)
just as we need a special flag for MMAP..

<Prev in Thread] Current Thread [Next in Thread>