[Top] [All Lists]

Re: [PATCH 6/6] ioctl eofblocks: require non-privileged users to specify

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: [PATCH 6/6] ioctl eofblocks: require non-privileged users to specify uid/gid match
From: Dwight Engen <dwight.engen@xxxxxxxxxx>
Date: Fri, 28 Jun 2013 19:22:38 -0400
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, "Eric W. Biederman" <ebiederm@xxxxxxxxx>, xfs@xxxxxxxxxxx, Serge Hallyn <serge.hallyn@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <51CE0281.4040403@xxxxxxxxxx>
Organization: Oracle Corporation
References: <20130619110948.0bfafa2b@xxxxxxxxxx> <20130620001341.GM29338@dastard> <20130620095410.1917d235@xxxxxxxxxx> <20130620220311.GT29376@dastard> <20130621111420.5592707e@xxxxxxxxxx> <20130624003316.GH29376@dastard> <20130624091035.6274800f@xxxxxxxxxx> <20130626020924.GD29376@dastard> <20130628111138.68d0b486@xxxxxxxxxx> <51CDDAF0.7060501@xxxxxxxxxx> <20130628162825.4fb7e4ed@xxxxxxxxxx> <51CE0281.4040403@xxxxxxxxxx>
On Fri, 28 Jun 2013 17:39:13 -0400
Brian Foster <bfoster@xxxxxxxxxx> wrote:

> On 06/28/2013 04:28 PM, Dwight Engen wrote:
> > On Fri, 28 Jun 2013 14:50:24 -0400
> > Brian Foster <bfoster@xxxxxxxxxx> wrote:
> > 
> >> On 06/28/2013 11:11 AM, Dwight Engen wrote:
> >>> Signed-off-by: Dwight Engen <dwight.engen@xxxxxxxxxx>
> >>> ---
> ...
> >>
> >> In thinking more about the other aspect of group management (and I
> >> admit I'm still waffling about this), it seems like we could go in
> >> a couple directions:
> >>
> >> - Now that we have a separate internal only eofblocks control, be
> >> more flexible and provide an internal only flag (valid only for
> >> UID/GID scans) to instruct the scan to do specific file permission
> >> checking against the inodes. This would be set by xfs_file_ioctl()
> >> and do the write permission enforcement for userspace originated
> >> scans. This would also allow the future EDQUOT work to leave out
> >> said flag and do a group wide scan regardless of the specific
> >> permissions of the calling context (i.e., when the system decides
> >> all inodes under a group quota must be trimmed).
> > 
> > I haven't seen your EDQUOT change, but your description made me
> > wonder: Are you going to kick off a scan for the type of quota
> > exhausted? Otherwise could a user abuse this by overrunning his
> > user quota in order to cause group inodes (that he may not have
> > write to) to be reclaimed?
> > 
> Yes, you could describe it that way. The current behavior is a global
> inode flush followed by an ENOSPC/EDQUOT error, so I'm not sure we're
> exposing much by attempting an eofblocks scan before running out of
> space. It's a fairly minor optimization in the failure path.
> Another way to look at it might be that the users/inodes have never
> really reserved the extra space that's being trimmed here, so the
> filesystem has every right to take it away if it deems it better to
> put to use elsewhere (i.e., to put off an ENOSPC related failure).
> > At any rate, yeah the only way I see to get the permissions checks
> > right is to set a flag up in ioctl because the checks need to be per
> > inode. I think you would like to avoid having the checks that low in
> > xfs_icache, but I don't see a way around that.
> > 
> That's the reasoning behind the extra internal only flag. It's not
> having the check that I'm against so much as the idea that this layer
> of code should be unconditionally bound by the current userspace
> context. But it might be reasonable to add control flags to
> effectively make that so (i.e., XFS_EOF_FLAGS_ENFORCE_PERMS).

Right, clearly you've got a use case where it shouldn't be limited
while the ioctl caller could ensure the flag is on.

> >> The downsides here are the behavior might be a bit unclear and we'd
> >> have to fork off the flags bits in a manner that's clear and
> >> maintainable. I suppose one could also argue this is overkill for
> >> somewhat of a secondary operation.
> >>
> >> - Go the other direction, be less flexible and simply not allow
> >> !capable(CAP_SYS_ADMIN) group scans just as we started doing for
> >> project quotas. This is obviously very simple, but then we disallow
> >> regular users from trimming groups of inodes they should full well
> >> have permission to trim.
> > 
> > What about uid == self scans? Would we not allow those as well? The
> > check may be simpler than a group check but still would need to be
> > per inode, and thus still need the flags of option 1.
> > 
> I'm not following. Aren't we already enforcing this appropriately with
> your current patch? My comments were intended to be taken as in
> addition to this patch. i.e., with regard to your comments in the
> commit log here:
> http://oss.sgi.com/archives/xfs/2013-06/msg00785.html

Ahh yes, the code there should be fine since it requires
XFS_EOF_FLAGS_UID and that uid == self to be given. I misunderstood
your "less flexible" above to be in-lieu of the checks in that patch and
it was only mentioning group scans so that is why I was wondering
if you meant to throw out uid scans as well.

> >> I think I'm leaning towards the former approach if it can be
> >> implemented cleanly. Thoughts or ideas?
> > 
> > Well the former is certainly more functional and allows the
> > trimming to be under the users control. How useful is that though
> > once it happens automatically with your EDQUOT changes? If we only
> > allowed the ioctl to trigger global scans, we wouldn't need a
> > conversion function nor separate structure, but I don't know the
> > use cases for uid/gid specific scans from userspace so its hard for
> > me to judge the tradeoffs.
> > 
> To be honest, there aren't any real users of the eofblocks command
> from userspace that I'm aware of at the moment. I added it originally
> for a poc quota implementation I was working on for gluster, but the
> primary use case for the scanning mechanism is to allow background
> clean up of files such that post-eof speculative preallocation
> doesn't hang around for too long.

... and there are likely to be scenarios where waiting for the timer
would be too long?
> > Maybe this permission stuff should be a separate change since it
> > isn't really related to user namespace stuff? I just happened to be
> > in the vicinity and am happy to help :)
> > 
> Sounds reasonable to me. :)

If you want me to code up either option, let me know.
> Brian

<Prev in Thread] Current Thread [Next in Thread>