On 06/25/2013 09:19 PM, Dave Chinner wrote:
> On Tue, Jun 25, 2013 at 05:17:39PM -0400, Brian Foster wrote:
>> Adds support for additional internal-only functionality to
>> eofblocks scans such as the scan_owner field. The scan owner
>> defines an optional inode number that is responsible for the
>> current scan. The purpose is to identify that an inode is under
>> iolock and as such, the iolock shouldn't be attempted when
>> trimming eof blocks.
>>
>> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
>> ---
>>
>> Hi Dwight,
>>
>> Here's that patch I referred to... as you said, it's not really that
>> complicated, so take it or leave it. I suspect you'd have to drop the
>> eof_scan_owner bits (that's for my use case that is not upstream yet), change
>> over the uid/gid types, move the defs from xfs_fs.h into xfs_icache.h (I seem
>> to have got this wrong as well), then pull the conversion up into the ioctl
>> code rather than xfs_icache.c. It might just be easier to modify what you
>> have
>> already...
>>
>> Brian
>>
>> fs/xfs/xfs_fs.h | 26 ++++++++++++++++++++++++++
>> fs/xfs/xfs_icache.c | 43 +++++++++++++++++++++++++++++++++++--------
>> 2 files changed, 61 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
>> index 32cf913..6cc294b 100644
>> --- a/fs/xfs/xfs_fs.h
>> +++ b/fs/xfs/xfs_fs.h
>> @@ -370,6 +370,32 @@ struct xfs_eofblocks {
>> XFS_EOF_FLAGS_MINFILESIZE | \
>> XFS_EOF_FLAGS_FLUSH)
>>
>> +struct xfs_eofblocks_internal {
>> + __u32 eof_version;
>> + __u32 eof_flags;
>> + uid_t eof_uid;
>> + gid_t eof_gid;
>> + prid_t eof_prid;
>> + __u64 eof_min_file_size;
>> + xfs_ino_t eof_scan_owner;
>> +};
>> +
>> +static inline void
>> +xfs_eofb_to_internal(
>> + struct xfs_eofblocks_internal *ei,
>> + struct xfs_eofblocks *e,
>> + xfs_ino_t ino)
>> +{
>> + ei->eof_version = e->eof_version;
>> + ei->eof_flags = e->eof_flags;
>> + ei->eof_uid = e->eof_uid;
>> + ei->eof_gid = e->eof_gid;
>> + ei->eof_prid = e->eof_prid;
>> + ei->eof_min_file_size = e->eof_min_file_size;
>> +
>> + ei->eof_scan_owner = ino;
>> +}
>
> This doesn't belong in xfs_fs.h. xfs_fs.h defines userspace access
> interfaces, not internal kernel-only structures.
>
Yeah, I sent this along out of my tree as an FYI for Dwight, knowing
that it wasn't quite ready yet. FWIW, I noted the incorrect header
above... ;)
> As it is, I really don't like the name xfs_eofblocks_internal.
> I'd prefer the userspace structure uses the prefix "xfs_fs_...." and the
> internal, kernel only one drops the "_fs" part of the name. i.e. the
> kernel code doesn't need to change at all, only the name of the
> external structure.
>
Ok, so you are suggesting we actually change the exported structure
name? I ask because I mentioned to Dwight previously that we didn't want
to go and change xfs_eofblocks to xfs_ueofblocks, considering it was
exported. Not that there are many users...
I'm Ok with the name change if there's no issue changing it after the
fact. xfs_fs_eofblocks for the exported structure and xfs_eofblocks for
the internal one sounds fine to me. (Dwight, sorry for the 180 on that).
>> @@ -1244,6 +1245,16 @@ xfs_inode_free_eofblocks(
>>
>> if (eofb->eof_flags & XFS_EOF_FLAGS_FLUSH)
>> filemap_flush(VFS_I(ip)->i_mapping);
>> +
>> + /*
>> + * A scan owner implies we already hold the iolock. Skip it in
>> + * xfs_free_eofblocks() to avoid deadlock. This also eliminates
>> + * the possibility of EAGAIN being returned.
>> + */
>> + if (eofb->eof_scan_owner != NULLFSINO &&
>> + eofb->eof_scan_owner == ip->i_ino)
>> + need_iolock = false;
>> +
>> }
>
> That's a separate fix, right? So a separate patch?
>
I noticed that as well. That should be pulled out separately. It most
likely will be now that the structure separation may occur for the
userns use case first.
> And, FWIW, by definition ip->i_ino is never NULLFSINO by the time it
> is inserted into the cache, and so the only check needed is
> if(eofb->eof_scan_owner == ip->i_ino)....
>
Ok, good point.
>
>>
>> /*
>> @@ -1254,7 +1265,7 @@ xfs_inode_free_eofblocks(
>> mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY))
>> return 0;
>>
>> - ret = xfs_free_eofblocks(ip->i_mount, ip, true);
>> + ret = xfs_free_eofblocks(ip->i_mount, ip, need_iolock);
>>
>> /* don't revisit the inode if we're not waiting */
>> if (ret == EAGAIN && !(flags & SYNC_WAIT))
>> @@ -1263,10 +1274,10 @@ xfs_inode_free_eofblocks(
>> return ret;
>> }
>>
>> -int
>> -xfs_icache_free_eofblocks(
>> - struct xfs_mount *mp,
>> - struct xfs_eofblocks *eofb)
>> +STATIC int
>> +__xfs_icache_free_eofblocks(
>> + struct xfs_mount *mp,
>> + struct xfs_eofblocks_internal *eofb)
>> {
>> int flags = SYNC_TRYLOCK;
>>
>> @@ -1277,6 +1288,22 @@ xfs_icache_free_eofblocks(
>> eofb, XFS_ICI_EOFBLOCKS_TAG);
>> }
>>
>> +int
>> +xfs_icache_free_eofblocks(
>> + struct xfs_mount *mp,
>> + struct xfs_eofblocks *eofb)
>> +{
>> + struct xfs_eofblocks_internal eofb_i;
>> + struct xfs_eofblocks_internal *eofb_p = NULL;
>> +
>> + if (eofb) {
>> + xfs_eofb_to_internal(&eofb_i, eofb, NULLFSINO);
>> + eofb_p = &eofb_i;
>> + }
>> +
>> + return __xfs_icache_free_eofblocks(mp, eofb_p);
>> +}
>
> This conversion should be done in the xfs_ioctl.c - that's the only
> place where the "external" version of the structure is used, and
> that's where we convert to "internal" kernel only structures for
> calls into the kernel core code. All kernel internal users shoul dbe
> using the kernel-only structure definition directly....
>
This is the approach Dwight is going to take (re: the "userns: Convert
xfs to use kuid/kgid where appropriate" thread), I believe. Thanks Dave.
Brian
> Cheers,
>
> Dave.
>
|