Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io

To: "Williams, Dan J" <dan.j.williams@xxxxxxxxx>, "hch@xxxxxxxxxxxxx" <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
From: "Verma, Vishal L" <vishal.l.verma@xxxxxxxxx>
Date: Thu, 5 May 2016 21:42:12 +0000
Accept-language: en-US
Cc: "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-block@xxxxxxxxxxxxxxx" <linux-block@xxxxxxxxxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, "linux-nvdimm@xxxxxxxxxxx" <linux-nvdimm@xxxxxxxxxxx>, "linux-mm@xxxxxxxxx" <linux-mm@xxxxxxxxx>, "viro@xxxxxxxxxxxxxxxxxx" <viro@xxxxxxxxxxxxxxxxxx>, "axboe@xxxxxx" <axboe@xxxxxx>, "akpm@xxxxxxxxxxxxxxxxxxxx" <akpm@xxxxxxxxxxxxxxxxxxxx>, "linux-fsdevel@xxxxxxxxxxxxxxx" <linux-fsdevel@xxxxxxxxxxxxxxx>, "linux-ext4@xxxxxxxxxxxxxxx" <linux-ext4@xxxxxxxxxxxxxxx>, "david@xxxxxxxxxxxxx" <david@xxxxxxxxxxxxx>, "jack@xxxxxxx" <jack@xxxxxxx>, "matthew@xxxxxx" <matthew@xxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CAPcyv4gdmo5m=Arf5sp5izJfNaaAkaaMbOzud8KRcBEC8RRu1Q@xxxxxxxxxxxxxx>
References: <1461878218-3844-1-git-send-email-vishal.l.verma@xxxxxxxxx> <1461878218-3844-6-git-send-email-vishal.l.verma@xxxxxxxxx> <5727753F.6090104@xxxxxxxxxxxxx> <20160505142433.GA4557@xxxxxxxxxxxxx> <CAPcyv4gdmo5m=Arf5sp5izJfNaaAkaaMbOzud8KRcBEC8RRu1Q@xxxxxxxxxxxxxx>
On Thu, 2016-05-05 at 08:15 -0700, Dan Williams wrote:
> On Thu, May 5, 2016 at 7:24 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx>
> wrote:
> > 
> > On Mon, May 02, 2016 at 06:41:51PM +0300, Boaz Harrosh wrote:
> > > 
> > > > 
> > > > All IO in a dax filesystem used to go through dax_do_io, which
> > > > cannot handle media errors, and thus cannot provide a recovery
> > > > path that can send a write through the driver to clear errors.
> > > > 
> > > > Add a new iocb flag for DAX, and set it only for DAX mounts. In
> > > > the IO path for DAX filesystems, use the same direct_IO path for
> > > > both DAX and direct_io iocbs, but use the flags to identify when
> > > > we are in O_DIRECT mode vs non O_DIRECT with DAX, and for
> > > > O_DIRECT, use the conventional direct_IO path instead of DAX.
> > > > 
> > > Really? What is your thinking here?
> > > 
> > > What about all the current users of O_DIRECT? You have just made
> > > them 4 times slower and "less concurrent*" than "buffered io"
> > > users, since the direct_IO path will queue an IO request and all.
> > > (And if it is not so slow then why do we need dax_do_io at all?
> > > [Rhetorical])
> > > 
> > > I hate it that you overload the semantics of a known and expected
> > > O_DIRECT flag, for special pmem quirks. This is an incompatible
> > > and unrelated overload of the semantics of O_DIRECT.
> > Agreed - making O_DIRECT less direct than not having it is plain
> > stupid, and I somehow missed this initially.
> Of course I disagree because like Dave argues in the msync case we
> should do the correct thing first and make it fast later, but also
> like Dave this arguing in circles is getting tiresome.
> 
> > 
> > This whole DAX story turns into a major nightmare, and I fear all
> > our hodge podge tweaks to the semantics aren't helping it.
> > 
> > It seems like we simply need an explicit O_DAX for the read/write
> > bypass if we can't sort out the semantics (error, writer
> > synchronization) just as we need a special flag for MMAP.
> I don't see how O_DAX makes this situation better if the goal is to
> accelerate unmodified applications...
> 
> Vishal, at least the "delete a file with a badblock" model will still
> work for implicitly clearing errors with your changes to stop doing
> block clearing in fs/dax.c.  This combined with a new -EBADBLOCK (as
> Dave suggests) and explicit logging of I/Os that fail for this reason
> at least gives a chance to communicate errors in files to suitably
> aware applications / environments.

Agreed - I'll send out a series that has just the zeroing changes, and
drop the dax_io fallback/O_DIRECT tweak for now while we figure out the
right thing to do. That should get us to a place where we still have dax
in the presence of errors, and have _a_ path for recovery.
