
Re: [RFC] unifying write variants for filesystems

To: Miklos Szeredi <miklos@xxxxxxxxxx>
Subject: Re: [RFC] unifying write variants for filesystems
From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Mon, 3 Feb 2014 15:33:23 +0000
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Jens Axboe <axboe@xxxxxxxxx>, Mark Fasheh <mfasheh@xxxxxxxx>, Joel Becker <jlbec@xxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, Sage Weil <sage@xxxxxxxxxxx>, Steve French <sfrench@xxxxxxxxx>, Dave Kleikamp <shaggy@xxxxxxxxxx>, Anton Altaparmakov <anton@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140203144155.GO24171@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <CA+55aFw4LgyYEkygxHUnpKZg3jMACGzsyENc9a9rWFmLcaRefQ@xxxxxxxxxxxxxx> <20140118074649.GF10323@xxxxxxxxxxxxxxxxxx> <CA+55aFzM0N7WjqnLNnuqTkbj3iws9f3bYxei=ZBCM8hvps4zYg@xxxxxxxxxxxxxx> <20140118201031.GI10323@xxxxxxxxxxxxxxxxxx> <20140119051335.GN10323@xxxxxxxxxxxxxxxxxx> <20140120135514.GA21567@xxxxxxxxxxxxx> <CA+55aFzEA-eM9v2PvsWx4v4ANaKXuRGYyGCkegJg++rhtHvnig@xxxxxxxxxxxxxx> <20140201224301.GS10323@xxxxxxxxxxxxxxxxxx> <20140202192104.GA21959@xxxxxxxxxxxxxxxxxx> <20140203144155.GO24171@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: Al Viro <viro@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Feb 03, 2014 at 03:41:55PM +0100, Miklos Szeredi wrote:

> > BTW, is there any reason why fuse/dev.c doesn't use atomic kmaps for
> > everything?  After all, as soon as we'd done kmap() in there, we
> > grab a spinlock and don't drop it until just before kunmap().  With
> > nothing but memcpy() done in between...  Miklos?  AFAICS, we only win
> > from switching to kmap_atomic there - we can't block anyway, we don't
> > need it to be visible on other CPUs and nesting isn't a problem.
> > Looks like it'll be cheaper in highmem cases and do exactly the same
> > thing as now for non-highmem...  Comments?
> We don't hold the spinlock.   But regardless, I don't see any reason why it
> couldn't be atomic kmap.

Oh, right - lock_request() drops it.  Still, I don't see anything other
than copying between map and unmap, and not a lot of it either...

As for get_user_pages_fast()...  Why not do what mm/filemap.c does to
deal with the same issue?  Prefault, then lock the destination page,
kmap_atomic() and do __copy_from_user_inatomic().  If that fails (i.e.
if something has raced with us and evicted the source from the page tables),
shrug, unlock and repeat.
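
For illustration only, here is a userspace sketch of the prefault / atomic copy / retry loop described above. It is not kernel code: every sim_* name is hypothetical, the "eviction" is a counter standing in for a real race, and the comments mark where the page lock and kmap_atomic()/kunmap_atomic() would sit in the pattern as described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user buffer whose mapping may go away. */
struct sim_user_buf {
	const char *data;
	size_t len;
	bool resident;		/* is the source currently mapped? */
	int evictions_left;	/* how many simulated races will evict it */
};

/* Stand-in for prefaulting; the real thing may block, so it runs unlocked. */
static void sim_fault_in(struct sim_user_buf *src)
{
	src->resident = true;
}

/* Simulates something racing with us between prefault and copy. */
static void sim_race(struct sim_user_buf *src)
{
	if (src->evictions_left > 0) {
		src->evictions_left--;
		src->resident = false;
	}
}

/* Stand-in for a non-blocking copy: returns the number of bytes NOT copied. */
static size_t sim_copy_inatomic(char *dst, const struct sim_user_buf *src,
				size_t n)
{
	if (!src->resident)
		return n;	/* would have faulted; give up without blocking */
	memcpy(dst, src->data, n);
	return 0;
}

/* Copies src into dst, retrying on failure; returns the number of attempts. */
static int sim_copy_with_retry(char *dst, size_t dst_len,
			       struct sim_user_buf *src)
{
	size_t n = src->len < dst_len ? src->len : dst_len;
	size_t left;
	int attempts = 0;

	do {
		attempts++;
		sim_fault_in(src);	/* prefault before taking any lock */
		sim_race(src);		/* window in which eviction can happen */
		/* here: lock the destination page, kmap_atomic() */
		left = sim_copy_inatomic(dst, src, n);
		/* here: kunmap_atomic(), unlock the page */
	} while (left != 0);		/* shrug and repeat */

	assert(left == 0);
	return attempts;
}
```

With a source that gets evicted twice, the loop in this sketch takes three attempts; the point is only that nothing that can block happens between the map and the unmap.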

I do realize that you want to share code between the read and write sides
of the whole thing, but I'm not sure it's worth doing.  Almost everything
in that pile knows the direction - splitting a few low-level functions
into ..._in() and ..._out() variants (mostly along the checks already
in them) would let us separate these paths completely, at which point it
becomes possible to use copy-page-to-iov_iter, etc. to take care of
mapping, dealing with iovec components, etc.

What I want to do is to get a sane set of iov_iter primitives that could
be used for everything, without their users having to care about the
nature of iov_iter - iovec, array of <page,offset,size,how_to_steal>
quadruples, biovec, etc.  The interesting part of it is how to make
that set expressive enough, while keeping it reasonably sane.  And
fs/fuse/dev.c is one of the more interesting potential users out there...
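
To make "users not having to care about the nature of the iterator" concrete, here is a small userspace sketch, under heavy assumptions. It is not the kernel's iov_iter: the sim_* types and functions are hypothetical, only two backends are modelled (an iovec-like array and an array of <page,offset,size> triples, with no stealing), and a plain memcpy() stands in for all mapping and user-copy work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sim_iter_kind { SIM_ITER_IOVEC, SIM_ITER_PAGEVEC };

struct sim_iovec { char *base; size_t len; };
struct sim_pagevec { char *page; size_t offset; size_t size; };

/* Hypothetical iterator: one cursor type over two kinds of backing store. */
struct sim_iter {
	enum sim_iter_kind kind;
	size_t nr_segs;		/* segments left, including the current one */
	size_t seg_done;	/* bytes already consumed in the current one */
	union {
		struct sim_iovec *iov;
		struct sim_pagevec *pv;
	} u;
};

/* Address of the current position and the room left in this segment. */
static char *sim_seg(const struct sim_iter *it, size_t *room)
{
	assert(it->nr_segs > 0);
	if (it->kind == SIM_ITER_IOVEC) {
		*room = it->u.iov->len - it->seg_done;
		return it->u.iov->base + it->seg_done;
	}
	*room = it->u.pv->size - it->seg_done;
	return it->u.pv->page + it->u.pv->offset + it->seg_done;
}

/* Step to the next segment of whichever backend this is. */
static void sim_next_seg(struct sim_iter *it)
{
	if (it->kind == SIM_ITER_IOVEC)
		it->u.iov++;
	else
		it->u.pv++;
	it->nr_segs--;
	it->seg_done = 0;
}

/*
 * The only function a caller needs: copy up to n bytes into the iterator,
 * advance it, and return how many bytes actually fit.
 */
static size_t sim_copy_to_iter(const void *src, size_t n, struct sim_iter *it)
{
	const char *from = src;
	size_t copied = 0;

	while (copied < n && it->nr_segs > 0) {
		size_t room, chunk;
		char *to = sim_seg(it, &room);

		if (room == 0) {
			sim_next_seg(it);
			continue;
		}
		chunk = n - copied < room ? n - copied : room;
		memcpy(to, from + copied, chunk);
		copied += chunk;
		it->seg_done += chunk;
	}
	return copied;
}
```

The caller passes the same arguments whichever backend sits behind the iterator; all knowledge of segment layout stays in the two small helpers, which is where a real implementation would also hide the mapping details.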

I've a growing queue with the beginning of that stuff; so far it's mostly
preparatory bits and pieces.  Currently being tested: copy_page_to_iter()
(more or less similar to iov_iter_copy_to_..., but with a saner interface
and dealing with the kmap, atomics, etc. without forcing the callers to
do that) with conversion of generic_file_aio_read() and friends to it.
If it survives the local beating, I'll start pushing it out (as
vfs.git#iov_iter); that pile is getting to potentially interesting bits...
