Re: [RFC] unifying write variants for filesystems

To: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Subject: Re: [RFC] unifying write variants for filesystems
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Mon, 3 Feb 2014 07:12:42 -0800
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Jens Axboe <axboe@xxxxxxxxx>, Steve French <sfrench@xxxxxxxxx>, Sage Weil <sage@xxxxxxxxxxx>, Dave Kleikamp <shaggy@xxxxxxxxxx>, Mark Fasheh <mfasheh@xxxxxxxx>, xfs@xxxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Joel Becker <jlbec@xxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, Anton Altaparmakov <anton@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140201224301.GS10323@xxxxxxxxxxxxxxxxxx>
References: <20140114172033.GU10323@xxxxxxxxxxxxxxxxxx> <20140118064040.GE10323@xxxxxxxxxxxxxxxxxx> <CA+55aFw4LgyYEkygxHUnpKZg3jMACGzsyENc9a9rWFmLcaRefQ@xxxxxxxxxxxxxx> <20140118074649.GF10323@xxxxxxxxxxxxxxxxxx> <CA+55aFzM0N7WjqnLNnuqTkbj3iws9f3bYxei=ZBCM8hvps4zYg@xxxxxxxxxxxxxx> <20140118201031.GI10323@xxxxxxxxxxxxxxxxxx> <20140119051335.GN10323@xxxxxxxxxxxxxxxxxx> <20140120135514.GA21567@xxxxxxxxxxxxx> <CA+55aFzEA-eM9v2PvsWx4v4ANaKXuRGYyGCkegJg++rhtHvnig@xxxxxxxxxxxxxx> <20140201224301.GS10323@xxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sat, Feb 01, 2014 at 10:43:01PM +0000, Al Viro wrote:
> * WTF bother passing 'pos' separately?  It's the same mistake that was
> made with ->aio_read/->aio_write and just as with those, *all* callers
> provably have pos == iocb->ki_pos.

I think this landed with the initial aio support, which planned to
allow AIO retries from a workqueue with a partially incremented
pos.  None of this ever got merged, probably because it was too ugly
to live.
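
To make the redundancy concrete, here is a rough userspace sketch, not
kernel code.  struct my_kiocb and both function names are made up for
illustration; the only thing taken from the discussion is that callers
always pass pos == iocb->ki_pos, so the second form loses nothing.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct kiocb; only the position is modelled. */
struct my_kiocb {
	long long ki_pos;
};

/* Old shape: pos is passed separately, although callers pass iocb->ki_pos. */
static long long write_with_pos(struct my_kiocb *iocb, size_t len,
				long long pos)
{
	/* Pretend all len bytes were written, then advance the position. */
	iocb->ki_pos = pos + (long long)len;
	return (long long)len;
}

/* Simplified shape: the position is read from, and advanced in, the iocb. */
static long long write_from_iocb(struct my_kiocb *iocb, size_t len)
{
	iocb->ki_pos += (long long)len;
	return (long long)len;
}
```

As long as pos == iocb->ki_pos on entry, both leave ki_pos at the same
value, which is the argument for dropping the extra parameter.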

> * We *definitely* want a variant structure with tag - unsigned long thing
> was just plain insane.  I see at least two variants - array of iovecs
> and array of (at least) triples <page, offset, length>.  Quite possibly -
> quadruples, with "here's how to try to steal this page" thrown in, if
> we want that as replacement for ->splice_write() as well (it looks like
> the few instances that do steal on pipe-to-file splices could be dealt
> with the same way as the dumb ones, provided that ->write_iter or whatever
> we end up calling it is allowed to try and steal pages).   Possibly more
> variants on the read side of things...  FWIW, I'm not sure that bio_vec
> makes a lot of sense here.

bio_vec is just one of the many page+offset+len containers we have; I
guess Dave took it because loop uses it.  We could either invent a new
one here or finally have a common one for the different uses all over
the kernel.
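
For illustration only, a tagged variant along the lines Al describes
might look roughly like the sketch below.  It is plain userspace C, and
every name in it (seg_iovec, seg_page, seg_iter, seg_iter_count) is
invented here; it is not bio_vec and not an existing kernel interface.
It only shows an explicit tag selecting between an array of iovec-like
segments and an array of <page, offset, length> triples.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct seg_iovec {		/* models an iovec: user buffer + length */
	void	*base;
	size_t	 len;
};

struct page;			/* opaque here, only used through pointers */

struct seg_page {		/* <page, offset, length> triple */
	struct page	*page;
	size_t		 offset;
	size_t		 len;
};

enum seg_kind { SEG_IOVEC, SEG_PAGE };

struct seg_iter {
	enum seg_kind	kind;	/* explicit tag instead of a bare unsigned long */
	size_t		nr_segs;
	union {
		const struct seg_iovec	*iov;
		const struct seg_page	*pages;
	} u;
};

/* Total bytes described by the iterator, whichever variant it holds. */
static size_t seg_iter_count(const struct seg_iter *it)
{
	size_t total = 0;

	for (size_t i = 0; i < it->nr_segs; i++) {
		switch (it->kind) {
		case SEG_IOVEC:
			total += it->u.iov[i].len;
			break;
		case SEG_PAGE:
			total += it->u.pages[i].len;
			break;
		}
	}
	return total;
}
```

A quadruple variant carrying a "how to steal this page" hook would be
one more enum value and one more union member in this scheme.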
