
Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io

To: "hch@xxxxxxxxxxxxx" <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
From: "Verma, Vishal L" <vishal.l.verma@xxxxxxxxx>
Date: Sun, 8 May 2016 18:42:37 +0000
Accept-language: en-US
Cc: "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-block@xxxxxxxxxxxxxxx" <linux-block@xxxxxxxxxxxxxxx>, "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, "linux-nvdimm@xxxxxxxxxxx" <linux-nvdimm@xxxxxxxxxxx>, "linux-mm@xxxxxxxxx" <linux-mm@xxxxxxxxx>, "viro@xxxxxxxxxxxxxxxxxx" <viro@xxxxxxxxxxxxxxxxxx>, "Williams, Dan J" <dan.j.williams@xxxxxxxxx>, "axboe@xxxxxx" <axboe@xxxxxx>, "akpm@xxxxxxxxxxxxxxxxxxxx" <akpm@xxxxxxxxxxxxxxxxxxxx>, "linux-fsdevel@xxxxxxxxxxxxxxx" <linux-fsdevel@xxxxxxxxxxxxxxx>, "linux-ext4@xxxxxxxxxxxxxxx" <linux-ext4@xxxxxxxxxxxxxxx>, "david@xxxxxxxxxxxxx" <david@xxxxxxxxxxxxx>, "jack@xxxxxxx" <jack@xxxxxxx>, "matthew@xxxxxx" <matthew@xxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20160508090115.GE15458@xxxxxxxxxxxxx>
References: <1461878218-3844-1-git-send-email-vishal.l.verma@xxxxxxxxx> <1461878218-3844-6-git-send-email-vishal.l.verma@xxxxxxxxx> <5727753F.6090104@xxxxxxxxxxxxx> <20160505142433.GA4557@xxxxxxxxxxxxx> <CAPcyv4gdmo5m=Arf5sp5izJfNaaAkaaMbOzud8KRcBEC8RRu1Q@xxxxxxxxxxxxxx> <20160505152230.GA3994@xxxxxxxxxxxxx> <1462484695.29294.7.camel@xxxxxxxxx> <20160508090115.GE15458@xxxxxxxxxxxxx>
Thread-index: AQHRoZNfS9ZF3cQEEUydwj8b8xF2vZ+mRHyAgAShZYCAAA4/AIAAAfIAgABq2YCAA+GggIAAom4A
Thread-topic: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
On Sun, 2016-05-08 at 02:01 -0700, hch@xxxxxxxxxxxxx wrote:
> On Thu, May 05, 2016 at 09:45:07PM +0000, Verma, Vishal L wrote:
> > 
> > I'm not sure I completely understand how this will work? Can you
> > explain
> > a bit? Would we have to export rw_bytes up to layers above the pmem
> > driver? Where does get_user_pages come in?
> A DAX filesystem can directly use the nvdimm layer the same way the
> BTT does, what's the problem?

The BTT does rw_bytes through an internal-to-libnvdimm mechanism, but
rw_bytes isn't currently exported to the filesystem. To do this we'd
have to add an rw_bytes method to the block device operations, or
something along those lines.

Another thing is that rw_bytes currently doesn't do error clearing
either. We store badblocks at sector granularity, and as Dan said
earlier, that hides the clear-error alignment requirements so upper
layers don't have to be aware of them. To make rw_bytes clear
sub-sector errors, we'd have to change the granularity of badblocks
and make upper layers aware of the clearing alignment requirements.

Using a block-write semantic for clearing hides all this away.

> 
> Re get_user_pages my idea was to simply use that to lock down the
> user pages so that we can call rw_bytes on it. How else would you do
> it? Do a kmalloc, copy_from_user and then another memcpy?