Re: [RFC] unifying write variants for filesystems

To: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Subject: Re: [RFC] unifying write variants for filesystems
From: Kent Overstreet <kmo@xxxxxxxxxxxxx>
Date: Thu, 6 Feb 2014 01:08:32 -0800
Cc: Zach Brown <zab@xxxxxxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Jens Axboe <axboe@xxxxxxxxx>, Mark Fasheh <mfasheh@xxxxxxxx>, Joel Becker <jlbec@xxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, Sage Weil <sage@xxxxxxxxxxx>, Steve French <sfrench@xxxxxxxxx>, Anton Altaparmakov <anton@xxxxxxxxxx>, Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140205195838.GO10323@xxxxxxxxxxxxxxxxxx>
References: <20140201224301.GS10323@xxxxxxxxxxxxxxxxxx> <52EFC271.3090205@xxxxxxxxxx> <20140204124409.GG10323@xxxxxxxxxxxxxxxxxx> <20140204125220.GB12440@kmo-pixel> <20140204151728.GH10323@xxxxxxxxxxxxxxxxxx> <20140204172723.GA11325@xxxxxxxxxxxxxxxxxxxx> <20140204180040.GI10323@xxxxxxxxxxxxxxxxxx> <20140204183356.GB11325@xxxxxxxxxxxxxxxxxxxx> <20140204183609.GK10323@xxxxxxxxxxxxxxxxxx> <20140205195838.GO10323@xxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Feb 05, 2014 at 07:58:38PM +0000, Al Viro wrote:
>       BTW, why do we still have generic_segment_checks()?
> AFAICS, *all* paths leading to any ->aio_read/->aio_write
> instances are either
>       1) with KERNEL_DS (and base/len are verifiably sane in those
> cases), or
>       2) have iovec come from successful {compat,}rw_copy_check_uvector()
> and through rw_verify_area(), or
>       3) have single-element iovec with access_ok()/rw_verify_area()
> checked directly, or
>       4) have single-element iovec with base/len unchanged from
> what had been passed to some ->read() or ->write() instance, in which
> case the caller of that ->read() or ->write() has done
> access_ok/rw_verify_area.
> And yes, I can prove that for the current tree, modulo a couple of dumb
> bugs with unchecked values coming via read_code().  Which is called
> a couple of times per a.out execve() and should be using vfs_read() instead
> of blindly calling ->read() - it's *not* a hot path and never had been one.
> With that fixed, we have the following: any call of any instance of
> ->read()/->write()/->aio_read()/->aio_write() (be it direct or via method)
> is guaranteed that
>       * all segments it's asked to read/write will satisfy access_ok().
>       * all segments it's asked to read/write will have non-negative
> lengths.
>       * total size of all segments will be at most MAX_RW_COUNT.
>       * file offset won't go from negative to zero in the combined area;
> unless the file has FMODE_UNSIGNED_OFFSET in ->f_mode, it won't go from
> positive to negative either.
> So what exactly does generic_segment_checks() give us?  Is it just that
> everybody went "well, maybe there's some weird path where we don't do
> validation; let's leave it there"?  Linus?

I came to the same conclusion a while ago - I'm pretty sure it can be
safely dropped (I think I even have such a patch in one of my

Anyway, rw_copy_check_uvector() is the correct place for all that stuff -
it takes a __user type and produces a type without the __user attribute,
so if any validation were missing, that's where it should go.
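
To make "validate once, at the point where the __user type goes away"
concrete, here is a rough userspace sketch of the per-segment checks
listed in the quoted text (non-negative length, access_ok(), total capped
at MAX_RW_COUNT). It is not the kernel code: sketch_iovec,
sketch_access_ok(), sketch_check_segments() and both SKETCH_* constants
are made-up stand-ins, and the address limit is an arbitrary assumption.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define SKETCH_MAX_RW_COUNT	((size_t)INT_MAX & ~(size_t)4095)
#define SKETCH_USER_LIMIT	(UINTPTR_MAX / 2)

struct sketch_iovec {
	uintptr_t base;	/* models a __user pointer as a plain address */
	size_t len;
};

/* Models access_ok(): the range must not wrap and must end below the limit. */
static int sketch_access_ok(uintptr_t base, size_t len)
{
	return len <= SKETCH_USER_LIMIT && base <= SKETCH_USER_LIMIT - len;
}

/*
 * Validate nr segments in place, once, at the boundary.
 * Returns the total byte count (at most SKETCH_MAX_RW_COUNT) or a
 * negative errno.  Segments past the cap are truncated, not rejected.
 */
static ssize_t sketch_check_segments(struct sketch_iovec *iov, size_t nr)
{
	size_t total = 0;

	for (size_t i = 0; i < nr; i++) {
		/* A length that is negative when viewed as ssize_t is invalid. */
		if ((ssize_t)iov[i].len < 0)
			return -EINVAL;
		if (!sketch_access_ok(iov[i].base, iov[i].len))
			return -EFAULT;
		/* Cap the running total by shortening this segment. */
		if (iov[i].len > SKETCH_MAX_RW_COUNT - total)
			iov[i].len = SKETCH_MAX_RW_COUNT - total;
		total += iov[i].len;
	}
	return (ssize_t)total;
}
```

Once every caller has been through something of this shape, a second pass
over the same array further down the stack has nothing left to catch,
which is the argument for dropping it.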

I vaguely recall converting some SCSI-related code to use
rw_copy_check_uvector() instead of its own (open-coded?) thing. If that
patch made it upstream, that could have been a place that at one point
did need the generic_segment_checks() call.
