Re: [RFC] unifying write variants for filesystems

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>, Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Subject: Re: [RFC] unifying write variants for filesystems
From: Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
Date: Mon, 03 Feb 2014 10:50:02 -0600
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Jens Axboe <axboe@xxxxxxxxx>, Steve French <sfrench@xxxxxxxxx>, Sage Weil <sage@xxxxxxxxxxx>, Mark Fasheh <mfasheh@xxxxxxxx>, xfs@xxxxxxxxxxx, Joel Becker <jlbec@xxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, Anton Altaparmakov <anton@xxxxxxxxxx>, Kent Overstreet <kmo@xxxxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140203151242.GA6868@xxxxxxxxxxxxx>
References: <20140114172033.GU10323@xxxxxxxxxxxxxxxxxx> <20140118064040.GE10323@xxxxxxxxxxxxxxxxxx> <CA+55aFw4LgyYEkygxHUnpKZg3jMACGzsyENc9a9rWFmLcaRefQ@xxxxxxxxxxxxxx> <20140118074649.GF10323@xxxxxxxxxxxxxxxxxx> <CA+55aFzM0N7WjqnLNnuqTkbj3iws9f3bYxei=ZBCM8hvps4zYg@xxxxxxxxxxxxxx> <20140118201031.GI10323@xxxxxxxxxxxxxxxxxx> <20140119051335.GN10323@xxxxxxxxxxxxxxxxxx> <20140120135514.GA21567@xxxxxxxxxxxxx> <CA+55aFzEA-eM9v2PvsWx4v4ANaKXuRGYyGCkegJg++rhtHvnig@xxxxxxxxxxxxxx> <20140201224301.GS10323@xxxxxxxxxxxxxxxxxx> <20140203151242.GA6868@xxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 02/03/2014 09:12 AM, Christoph Hellwig wrote:
> On Sat, Feb 01, 2014 at 10:43:01PM +0000, Al Viro wrote:
>> * WTF bother passing 'pos' separately?  It's the same mistake that was
>> made with ->aio_read/->aio_write and just as with those, *all* callers
>> provably have pos == iocb->ki_pos.
> 
> I think this landed with the initial aio support, which planned for
> allowing AIO retries for a workqueue with a partially incremented
> pos.  None of this ever got merged, probably because it was too ugly
> to live.

Yeah, when these patches were first written, AIO looked a lot different.

>> * We *definitely* want a variant structure with tag - unsigned long thing
>> was just plain insane.  I see at least two variants - array of iovecs
>> and array of (at least) triples <page, offset, length>.  Quite possibly -
>> quadruples, with "here's how to try to steal this page" thrown in, if
>> we want that as replacement for ->splice_write() as well (it looks like
>> the few instances that do steal on pipe-to-file splices could be dealt
>> with the same way as the dumb ones, provided that ->write_iter or whatever
>> we end up calling it is allowed to try and steal pages).   Possibly more
>> variants on the read side of things...  FWIW, I'm not sure that bio_vec
>> makes a lot of sense here.
> 
> bio_vec just is one of the many page+offset+len containers we have, I
> guess Dave took it because loop uses it.  We could either invent a new
> one here or finally have a common one for the different uses all over
> the kernel.

With Kent's immutable bio_vec changes, peeking inside the bio to get to
the bio_vec is uglier than it was before, so there's no need to stick
with that.

Shaggy
