Re: [RFC] unifying write variants for filesystems

To: Zach Brown <zab@xxxxxxxxxx>
Subject: Re: [RFC] unifying write variants for filesystems
From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Wed, 5 Feb 2014 19:58:38 +0000
Cc: Kent Overstreet <kmo@xxxxxxxxxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Jens Axboe <axboe@xxxxxxxxx>, Mark Fasheh <mfasheh@xxxxxxxx>, Joel Becker <jlbec@xxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, Sage Weil <sage@xxxxxxxxxxx>, Steve French <sfrench@xxxxxxxxx>, Anton Altaparmakov <anton@xxxxxxxxxx>, Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140204183609.GK10323@xxxxxxxxxxxxxxxxxx>
References: <CA+55aFzEA-eM9v2PvsWx4v4ANaKXuRGYyGCkegJg++rhtHvnig@xxxxxxxxxxxxxx> <20140201224301.GS10323@xxxxxxxxxxxxxxxxxx> <52EFC271.3090205@xxxxxxxxxx> <20140204124409.GG10323@xxxxxxxxxxxxxxxxxx> <20140204125220.GB12440@kmo-pixel> <20140204151728.GH10323@xxxxxxxxxxxxxxxxxx> <20140204172723.GA11325@xxxxxxxxxxxxxxxxxxxx> <20140204180040.GI10323@xxxxxxxxxxxxxxxxxx> <20140204183356.GB11325@xxxxxxxxxxxxxxxxxxxx> <20140204183609.GK10323@xxxxxxxxxxxxxxxxxx>
Sender: Al Viro <viro@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
        BTW, why do we still have generic_segment_checks()?
AFAICS, *all* paths leading to any ->aio_read/->aio_write
instances are either
        1) with KERNEL_DS (and base/len are verifiably sane in those
cases), or
        2) have iovec come from successful {compat,}rw_copy_check_uvector()
and through rw_verify_area(), or
        3) have single-element iovec with access_ok()/rw_verify_area()
checked directly, or
        4) have single-element iovec with base/len unchanged from
what had been passed to some ->read() or ->write() instance, in which
case the caller of that ->read() or ->write() has done
access_ok()/rw_verify_area().

And yes, I can prove that for the current tree, modulo a couple of dumb
bugs with unchecked values coming via read_code().  Which is called
a couple of times per a.out execve() and should be using vfs_read() instead
of blindly calling ->read() - it's *not* a hot path and never had been one.
With that fixed, we have the following: any call of any instance of
->read()/->write()/->aio_read()/->aio_write() (be it direct or via method)
is guaranteed that
        * all segments it's asked to read/write will satisfy access_ok().
        * all segments it's asked to read/write will have non-negative
lengths.
        * total size of all segments will be at most MAX_RW_COUNT.
        * file offset won't go from negative to zero in the combined area;
unless the file has FMODE_UNSIGNED_OFFSET in ->f_mode, it won't go from
positive to negative either.
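
To make the list above concrete, here is a rough userspace sketch of the
length and offset guarantees, not the kernel code itself.  The sketch_*
names are hypothetical, 4K pages are assumed for the MAX_RW_COUNT value
(INT_MAX & PAGE_MASK in the kernel), and access_ok() is left out since it
can't be modelled outside the kernel.  The sketch simply rejects an
oversized total; it does not try to mirror what the real helpers do in
that case.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Assumption: 4 KiB pages; the kernel uses INT_MAX & PAGE_MASK. */
#define SKETCH_PAGE_SIZE    4096L
#define SKETCH_MAX_RW_COUNT ((long)INT_MAX & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Hypothetical helper: true if no segment length is negative when viewed
 * as ssize_t and the sum of all lengths is at most SKETCH_MAX_RW_COUNT.
 */
static bool sketch_segments_ok(const struct iovec *iov, size_t nr_segs)
{
	long total = 0;

	for (size_t i = 0; i < nr_segs; i++) {
		if ((ssize_t)iov[i].iov_len < 0)
			return false;
		/* total never exceeds the limit, so the subtraction is safe */
		if (iov[i].iov_len > (size_t)(SKETCH_MAX_RW_COUNT - total))
			return false;
		total += (long)iov[i].iov_len;
	}
	return true;
}

/*
 * Hypothetical helper modelling only the two offset rules stated above:
 * [pos, pos + count) never wraps from negative to zero, and it goes from
 * positive to negative only if the file uses unsigned offsets.
 */
static bool sketch_offset_ok(long long pos, long long count,
			     bool unsigned_offsets)
{
	if (count < 0)
		return false;
	if (pos < 0)
		/* same as count < -pos, written to avoid overflow */
		return count <= -(pos + 1);
	if (pos > LLONG_MAX - count)
		return unsigned_offsets;
	return true;
}
```

With these, a two-segment iovec of 4096 and 8192 bytes passes, while a
single segment of SKETCH_MAX_RW_COUNT followed by one more byte does not.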

So what exactly does generic_segment_checks() give us?  Is it just that
everybody went "well, maybe there's some weird path where we don't do
validation; let's leave it there"?  Linus?
