On Wed, Mar 07, 2012 at 03:50:28PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> When we read inodes via bulkstat, we generally only read them once
> and then throw them away - they never get used again. If we retain
> them in cache, then it simply causes the working set of inodes and
> other cached items to be reclaimed just so the inode cache can grow.
>
> Avoid this problem by marking inodes read by bulkstat as not to be
> cached and check this flag in .drop_inode to determine whether the
> inode should be added to the VFS LRU or not. If the inode lookup
> hits an already cached inode, then don't set the flag. If the inode
> lookup hits an inode marked with no cache flag, remove the flag and
> allow it to be cached once the current reference goes away.
>
> Inodes marked as not cached will get cleaned up by the background
> inode reclaim or via memory pressure, so they will still generate
> some short term cache pressure. They will, however, be reclaimed
> much sooner and in preference to cache hot inodes.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Looks good.
Reviewed-by: Ben Myers <bpm@xxxxxxx>
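For anyone skimming the thread, the flag life cycle described in the commit message can be sketched in userspace C. This is only a rough model under stated assumptions, not the kernel code: the model_* names are hypothetical, the MODEL_INEW value is arbitrary, and the generic_drop_inode() check is reduced to a link-count test. Only the (1 << 9) and 0x4 values are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; nothing here is kernel code. */
#define MODEL_INEW           (1u << 0)  /* arbitrary value for the model */
#define MODEL_IDONTCACHE     (1u << 9)  /* mirrors XFS_IDONTCACHE in the patch */
#define MODEL_IGET_DONTCACHE 0x4u       /* mirrors XFS_IGET_DONTCACHE in the patch */

struct model_inode {
	unsigned int flags;  /* stands in for ip->i_flags */
	unsigned int nlink;  /* 0 means the inode is unlinked */
};

/* Cache miss: a newly read inode inherits "don't cache" from the lookup flags. */
static void model_cache_miss(struct model_inode *ip, unsigned int iget_flags)
{
	assert(ip != NULL);
	ip->flags = MODEL_INEW;
	if (iget_flags & MODEL_IGET_DONTCACHE)
		ip->flags |= MODEL_IDONTCACHE;
}

/* Cache hit: any later lookup clears the flag, so the inode may be cached. */
static void model_cache_hit(struct model_inode *ip)
{
	assert(ip != NULL);
	ip->flags &= ~MODEL_IDONTCACHE;
}

/* Final reference drop: true means evict now, false means keep on the LRU. */
static bool model_drop_inode(const struct model_inode *ip)
{
	/* Simplified stand-in for generic_drop_inode(): unlinked inodes go away. */
	bool generic_drop = (ip->nlink == 0);

	return generic_drop || (ip->flags & MODEL_IDONTCACHE) != 0;
}
```

In this model an inode first read with MODEL_IGET_DONTCACHE is dropped on its final reference unless another lookup hit it in the meantime, which matches the behaviour the commit message describes.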
> ---
> fs/xfs/xfs_iget.c | 8 ++++++--
> fs/xfs/xfs_inode.h | 4 +++-
> fs/xfs/xfs_itable.c | 3 ++-
> fs/xfs/xfs_super.c | 17 +++++++++++++++++
> 4 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 93fc1dc..20ddb1e 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -290,7 +290,7 @@ xfs_iget_cache_hit(
> if (lock_flags != 0)
> xfs_ilock(ip, lock_flags);
>
> - xfs_iflags_clear(ip, XFS_ISTALE);
> + xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
> XFS_STATS_INC(xs_ig_found);
>
> return 0;
> @@ -315,6 +315,7 @@ xfs_iget_cache_miss(
> struct xfs_inode *ip;
> int error;
> xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ino);
> + int iflags;
>
> ip = xfs_inode_alloc(mp, ino);
> if (!ip)
> @@ -359,8 +360,11 @@ xfs_iget_cache_miss(
> * memory barrier that ensures this detection works correctly at lookup
> * time.
> */
> + iflags = XFS_INEW;
> + if (flags & XFS_IGET_DONTCACHE)
> + iflags |= XFS_IDONTCACHE;
> ip->i_udquot = ip->i_gdquot = NULL;
> - xfs_iflags_set(ip, XFS_INEW);
> + xfs_iflags_set(ip, iflags);
>
> /* insert the new inode */
> spin_lock(&pag->pag_ici_lock);
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index eda4937..096b887 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -374,10 +374,11 @@ xfs_set_projid(struct xfs_inode *ip,
> #define XFS_IFLOCK (1 << __XFS_IFLOCK_BIT)
> #define __XFS_IPINNED_BIT 8 /* wakeup key for zero pin count */
> #define XFS_IPINNED (1 << __XFS_IPINNED_BIT)
> +#define XFS_IDONTCACHE (1 << 9) /* don't cache the inode long term */
>
> /*
> * Per-lifetime flags need to be reset when re-using a reclaimable inode during
> - * inode lookup. Thi prevents unintended behaviour on the new inode from
> + * inode lookup. This prevents unintended behaviour on the new inode from
> * ocurring.
> */
> #define XFS_IRECLAIM_RESET_FLAGS \
> @@ -544,6 +545,7 @@ do { \
> */
> #define XFS_IGET_CREATE 0x1
> #define XFS_IGET_UNTRUSTED 0x2
> +#define XFS_IGET_DONTCACHE 0x4
>
> int xfs_inotobp(struct xfs_mount *, struct xfs_trans *,
> xfs_ino_t, struct xfs_dinode **,
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index 751e94f..b832c58 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -76,7 +76,8 @@ xfs_bulkstat_one_int(
> return XFS_ERROR(ENOMEM);
>
> error = xfs_iget(mp, NULL, ino,
> - XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip);
> + (XFS_IGET_DONTCACHE | XFS_IGET_UNTRUSTED),
> + XFS_ILOCK_SHARED, &ip);
> if (error) {
> *stat = BULKSTAT_RV_NOTHING;
> goto out_free;
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b1df512..c162765 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -953,6 +953,22 @@ xfs_fs_evict_inode(
> xfs_inactive(ip);
> }
>
> +/*
> + * We do an unlocked check for XFS_IDONTCACHE here because we are already
> + * serialised against cache hits here via the inode->i_lock and igrab() in
> + * xfs_iget_cache_hit(). Hence a lookup that might clear this flag will not be
> + * racing with us, and it avoids needing to grab a spinlock here for every inode
> + * we drop the final reference on.
> + */
I'll try to put this in my own words, just in case it is mystifying for
anyone else. ;)
In this case it is ok to check ip->i_flags without holding
ip->i_flags_lock because we have exclusion from xfs_iget_cache_hit
as follows:
The 'dropper' would have taken inode->i_lock when the inode's count went
to zero, and if the XFS_IDONTCACHE flag is set, dropper will return 1 to
iput_final which will result in iput_final skipping the inode lru and
setting I_FREEING immediately, before dropping inode->i_lock and evicting
the inode.
A 'cache hitter' must call igrab in order to get a reference on the
inode. igrab takes the inode->i_lock, and if I_FREEING is set, it
returns NULL, then xfs_iget_cache_hit returns EAGAIN, and is restarted.
So... any 'cache hitter' who could possibly clear the XFS_IDONTCACHE
flag subsequent to 'dropper' checking it would always be unable to get a
reference due to I_FREEING having been set by the dropper.
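To make the ordering concrete, here is a single-threaded walk-through in userspace C. It is a sketch under assumptions, not the VFS code: the sk_* names are hypothetical, the lock is a toy boolean that only asserts correct pairing, the igrab stand-in checks just I_FREEING, and the flag clear in the cache hitter skips the i_flags_lock that the real code takes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_I_FREEING  (1u << 0)  /* stands in for the VFS I_FREEING state bit */
#define SK_IDONTCACHE (1u << 9)  /* stands in for XFS_IDONTCACHE */

struct sk_inode {
	bool         i_lock_held;  /* toy stand-in for inode->i_lock */
	unsigned int i_state;      /* VFS state bits */
	unsigned int i_flags;      /* XFS flags */
	int          i_count;      /* reference count */
};

static void sk_lock(struct sk_inode *in)
{
	assert(in != NULL && !in->i_lock_held);
	in->i_lock_held = true;
}

static void sk_unlock(struct sk_inode *in)
{
	assert(in != NULL && in->i_lock_held);
	in->i_lock_held = false;
}

/* Dropper: releases one reference; returns true if the inode is to be evicted. */
static bool sk_iput(struct sk_inode *in)
{
	bool evict = false;

	sk_lock(in);
	if (--in->i_count == 0 && (in->i_flags & SK_IDONTCACHE)) {
		/* I_FREEING is set before the lock is dropped. */
		in->i_state |= SK_I_FREEING;
		evict = true;
	}
	sk_unlock(in);
	return evict;
}

/* igrab-like helper: refuses an inode that is already being freed. */
static bool sk_igrab(struct sk_inode *in)
{
	bool ok;

	sk_lock(in);
	ok = !(in->i_state & SK_I_FREEING);
	if (ok)
		in->i_count++;
	sk_unlock(in);
	return ok;
}

/* Cache hitter: may clear the flag only after the grab succeeded. */
static int sk_cache_hit(struct sk_inode *in)
{
	if (!sk_igrab(in))
		return EAGAIN;  /* caller restarts the lookup */
	in->i_flags &= ~SK_IDONTCACHE;
	return 0;
}
```

Once sk_iput() has marked the inode SK_I_FREEING, sk_cache_hit() can only return EAGAIN, so the flag the dropper read cannot change underneath it.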
I appreciate that you added the comment.
Regards,
Ben
> +STATIC int
> +xfs_fs_drop_inode(
> + struct inode *inode)
> +{
> + struct xfs_inode *ip = XFS_I(inode);
> +
> + return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> +}
> +
> STATIC void
> xfs_free_fsname(
> struct xfs_mount *mp)
> @@ -1431,6 +1447,7 @@ static const struct super_operations xfs_super_operations = {
> .dirty_inode = xfs_fs_dirty_inode,
> .write_inode = xfs_fs_write_inode,
> .evict_inode = xfs_fs_evict_inode,
> + .drop_inode = xfs_fs_drop_inode,
> .put_super = xfs_fs_put_super,
> .sync_fs = xfs_fs_sync_fs,
> .freeze_fs = xfs_fs_freeze,
> --
> 1.7.9