xfs
[Top] [All Lists]

Re: [PATCH] userns: Convert xfs to use kuid/kgid where appropriate

To: Dwight Engen <dwight.engen@xxxxxxxxxx>
Subject: Re: [PATCH] userns: Convert xfs to use kuid/kgid where appropriate
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Thu, 20 Jun 2013 11:27:04 -0400
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, "Eric W. Biederman" <ebiederm@xxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130620095410.1917d235@xxxxxxxxxx>
References: <20130619110948.0bfafa2b@xxxxxxxxxx> <20130620001341.GM29338@dastard> <20130620095410.1917d235@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6
On 06/20/2013 09:54 AM, Dwight Engen wrote:
> On Thu, 20 Jun 2013 10:13:41 +1000
> Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
>> On Wed, Jun 19, 2013 at 11:09:48AM -0400, Dwight Engen wrote:
>>> Use uint32 from init_user_ns for xfs internal uid/gid
>>> representation in acl, xfs_icdinode. Conversion of kuid/gid is done
>>> at the vfs boundary, other user visible xfs specific interfaces
>>> (bulkstat, eofblocks filter) expect uint32 init_user_ns uid/gid
>>> values.
>>
>> It's minimal, but I'm not sure it's complete. I'll comment on that
>> in response to Eric's comments...
>>
>>> Signed-off-by: Dwight Engen <dwight.engen@xxxxxxxxxx>
>> ....
>>> --- a/fs/xfs/xfs_fs.h
>>> +++ b/fs/xfs/xfs_fs.h
>>> @@ -347,8 +347,8 @@ typedef struct xfs_error_injection {
>>>  struct xfs_eofblocks {
>>>     __u32           eof_version;
>>>     __u32           eof_flags;
>>> -   uid_t           eof_uid;
>>> -   gid_t           eof_gid;
>>> +   __u32           eof_uid;
>>> +   __u32           eof_gid;
>>>     prid_t          eof_prid;
>>>     __u32           pad32;
>>>     __u64           eof_min_file_size;
>>
>> The patch doesn't do namespace conversion of these uid/gids, but I'm
>> not sure it should...
> 
> The ids are only advisory, if the caller doesn't specify
> XFS_EOF_FLAGS_?ID blocks from any inode in the fs can be reclaimed
> regardless of id. Because of this, I think at a minimum we should
> change XFS_IOC_FREE_EOFBLOCKS to require capable(CAP_SYS_ADMIN), which
> somewhat implies init_user_ns based ids.
> 
> To make this really userns aware, I think we could:
>   - leave the fields as uid_t
>   - make XFS_IOC_FREE_EOFBLOCKS require nsown_capable(CAP_SYS_ADMIN)
>   - check kuid_has_mapping(current_user_ns()) for each
>     inode. This would be a change in behavior when called
>     from !init_user_ns, limiting the scope of inodes the ioctl could
>     affect.
>   - change xfs_inode_match_id() to use uid_eq(VFS_I(ip)->i_uid,
>     make_kuid(current_user_ns(), eofb->eof_uid))
> 
> Does that sound reasonable?
> 

Hi Dwight,

If I understand correctly, the proposition is to turn
XFS_EOF_FREE_EOFBLOCKS into administrator only functionality and run ns
conversions on the inode uid/gid and associated eofb values for the ID
filtering functionality.

The latter sounds reasonable to me, though I'm not so sure about the
CAP_SYS_ADMIN bit. For example, I think we'd expect a regular user to be
able to run an eofblocks scan against files covered under his quota.

Perhaps the right thing to do here is to restrict global (and project
quota?) scans to CAP_SYS_ADMIN and uid/gid based scans to processes with
the appropriate permissions (i.e., CAP_SYS_ADMIN, matching uid/gid or
CAP_FOWNER). Thoughts?

Brian

>>> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
>>> index 96e344e..70ba410 100644
>>> --- a/fs/xfs/xfs_icache.c
>>> +++ b/fs/xfs/xfs_icache.c
>>> @@ -617,7 +617,7 @@ restart:
>>>  
>>>  /*
>>>   * Background scanning to trim post-EOF preallocated space. This
>>> is queued
>>> - * based on the 'background_prealloc_discard_period' tunable (5m
>>> by default).
>>> + * based on the 'speculative_prealloc_lifetime' tunable (5m by
>>> default). */
>>>  STATIC void
>>>  xfs_queue_eofblocks(
>>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>>> index 7f7be5f..8049976 100644
>>> --- a/fs/xfs/xfs_inode.c
>>> +++ b/fs/xfs/xfs_inode.c
>>> @@ -1268,8 +1268,8 @@ xfs_ialloc(
>>>     ip->i_d.di_onlink = 0;
>>>     ip->i_d.di_nlink = nlink;
>>>     ASSERT(ip->i_d.di_nlink == nlink);
>>> -   ip->i_d.di_uid = current_fsuid();
>>> -   ip->i_d.di_gid = current_fsgid();
>>> +   ip->i_d.di_uid = from_kuid(&init_user_ns, current_fsuid());
>>> +   ip->i_d.di_gid = from_kgid(&init_user_ns, current_fsgid());
>>
>> Why all new inodes be created in the init_user_ns? Shouldn't this be
>> current_user_ns()?
> 
> current_fsuid() is the kuid_t from whatever userns current is in,
> which we are converting to a flat uint32 since i_d is the on disk inode.
> This field is then used in xfs_setup_inode() to populate
> VFS_I(ip)->i_uid. Most other filesystems would use inode_init_owner(),
> but xfs does not (I assume because it wants to handle SGID itself based
> on XFS_INHERIT_GID and irix_sgid_inherit).
> 
>> Same question throughout - why do you use init_user_ns for all these
>> UID conversions, when the whole point is to have awareness of
>> different namespaces?
> 
> Yep, there are instances you point out below where we can just use
> inode->i_uid instead of converting back from the flat value, so I'll fix
> those up.
> 
>>>     xfs_set_projid(ip, prid);
>>>     memset(&(ip->i_d.di_pad[0]), 0, sizeof(ip->i_d.di_pad));
>>>  
>>> @@ -1308,7 +1308,7 @@ xfs_ialloc(
>>>      */
>>>     if ((irix_sgid_inherit) &&
>>>         (ip->i_d.di_mode & S_ISGID) &&
>>> -       (!in_group_p((gid_t)ip->i_d.di_gid))) {
>>> +       (!in_group_p(make_kgid(&init_user_ns,
>>> ip->i_d.di_gid)))) { ip->i_d.di_mode &= ~S_ISGID;
>>>     }
>>>  
>>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>>> index 5e99968..daa6127 100644
>>> --- a/fs/xfs/xfs_ioctl.c
>>> +++ b/fs/xfs/xfs_ioctl.c
>>> @@ -981,7 +981,7 @@ xfs_ioctl_setattr(
>>>      * to the file owner ID, except in cases where the
>>>      * CAP_FSETID capability is applicable.
>>>      */
>>> -   if (current_fsuid() != ip->i_d.di_uid
>>> && !capable(CAP_FOWNER)) {
>>> +   if (!inode_owner_or_capable(&ip->i_vnode)) {
>>
>>                                  VFS_I(ip)
>>
>>>             code = XFS_ERROR(EPERM);
>>>             goto error_return;
>>>     }
>>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>>> index ca9ecaa..bf96cf8 100644
>>> --- a/fs/xfs/xfs_iops.c
>>> +++ b/fs/xfs/xfs_iops.c
>>> @@ -420,8 +420,8 @@ xfs_vn_getattr(
>>>     stat->dev = inode->i_sb->s_dev;
>>>     stat->mode = ip->i_d.di_mode;
>>>     stat->nlink = ip->i_d.di_nlink;
>>> -   stat->uid = ip->i_d.di_uid;
>>> -   stat->gid = ip->i_d.di_gid;
>>> +   stat->uid = make_kuid(&init_user_ns, ip->i_d.di_uid);
>>> +   stat->gid = make_kgid(&init_user_ns, ip->i_d.di_gid);
>>
>> Why not:
>>
>>      stat->uid = inode->i_uid;
>>      stat->gid = inode->i_gid;
>>
>>>     stat->ino = ip->i_ino;
>>>     stat->atime = inode->i_atime;
>>>     stat->mtime = inode->i_mtime;
>>> @@ -488,8 +488,8 @@ xfs_setattr_nonsize(
>>>     int                     mask = iattr->ia_valid;
>>>     xfs_trans_t             *tp;
>>>     int                     error;
>>> -   uid_t                   uid = 0, iuid = 0;
>>> -   gid_t                   gid = 0, igid = 0;
>>> +   kuid_t                  uid = GLOBAL_ROOT_UID, iuid
>>> = GLOBAL_ROOT_UID;
>>> +   kgid_t                  gid = GLOBAL_ROOT_GID, igid
>>> = GLOBAL_ROOT_GID; struct xfs_dquot *udqp = NULL, *gdqp =
>>> NULL; struct xfs_dquot      *olddquot1 = NULL, *olddquot2 = NULL;
>>>  
>>> @@ -522,13 +522,13 @@ xfs_setattr_nonsize(
>>>                     uid = iattr->ia_uid;
>>>                     qflags |= XFS_QMOPT_UQUOTA;
>>>             } else {
>>> -                   uid = ip->i_d.di_uid;
>>> +                   uid = make_kuid(&init_user_ns,
>>> ip->i_d.di_uid);
>>
>>                      uid = VFS_I(ip)->i_uid;
>>>             }
>>>             if ((mask & ATTR_GID) && XFS_IS_GQUOTA_ON(mp)) {
>>>                     gid = iattr->ia_gid;
>>>                     qflags |= XFS_QMOPT_GQUOTA;
>>>             }  else {
>>> -                   gid = ip->i_d.di_gid;
>>> +                   gid = make_kgid(&init_user_ns,
>>> ip->i_d.di_gid);
>>
>>                      gid = VFS_I(ip)->i_gid;
>>>             }
>> .....
>>> @@ -561,8 +563,8 @@ xfs_setattr_nonsize(
>>>              * while we didn't have the inode locked, inode's
>>> dquot(s)
>>>              * would have changed also.
>>>              */
>>> -           iuid = ip->i_d.di_uid;
>>> -           igid = ip->i_d.di_gid;
>>> +           iuid = make_kuid(&init_user_ns, ip->i_d.di_uid);
>>> +           igid = make_kgid(&init_user_ns, ip->i_d.di_gid);
>>>             gid = (mask & ATTR_GID) ? iattr->ia_gid : igid;
>>>             uid = (mask & ATTR_UID) ? iattr->ia_uid : iuid;
>>
>> Same again - you can just use VFS_I(ip)->i_uid/VFS_I(ip)->i_gid
>> here.
>>
>>> @@ -1172,8 +1174,8 @@ xfs_setup_inode(
>>>  
>>>     inode->i_mode   = ip->i_d.di_mode;
>>>     set_nlink(inode, ip->i_d.di_nlink);
>>> -   inode->i_uid    = ip->i_d.di_uid;
>>> -   inode->i_gid    = ip->i_d.di_gid;
>>> +   inode->i_uid    = make_kuid(&init_user_ns,
>>> ip->i_d.di_uid);
>>> +   inode->i_gid    = make_kgid(&init_user_ns,
>>> ip->i_d.di_gid);
>>
>> current name space?
> 
> I believe that is what this is doing, but I think it will be more
> proper to do it the same as other filesystems:
> 
> i_uid_write(inode, ip->i_d.di_uid);
> i_gid_write(inode, ip->i_d.di_gid);
> 
>>>  
>>>     switch (inode->i_mode & S_IFMT) {
>>>     case S_IFBLK:
>>> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
>>> index b75c9bb..94a2a8f 100644
>>> --- a/fs/xfs/xfs_qm.c
>>> +++ b/fs/xfs/xfs_qm.c
>>> @@ -1651,8 +1651,8 @@ xfs_qm_write_sb_changes(
>>>  int
>>>  xfs_qm_vop_dqalloc(
>>>     struct xfs_inode        *ip,
>>> -   uid_t                   uid,
>>> -   gid_t                   gid,
>>> +   __uint32_t              di_uid,
>>> +   __uint32_t              di_gid,
>>
>> xfs_dqid_t
>>
>> And there's no need to rename the variables - that just causes
>> unnecessary churn, and the fact it is a dquot ID is documented by
>> the type.
> 
> Yep using the type will be clearer, thanks.
> 
>> Cheers,
>>
>> Dave.
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
> 

<Prev in Thread] Current Thread [Next in Thread>