
Re: [PATCH 10/10] xfs: don't cache inodes read through bulkstat

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 10/10] xfs: don't cache inodes read through bulkstat
From: Ben Myers <bpm@xxxxxxx>
Date: Wed, 14 Mar 2012 15:44:01 -0500
Cc: xfs@xxxxxxxxxxx
In-reply-to: <1331095828-28742-11-git-send-email-david@xxxxxxxxxxxxx>
References: <1331095828-28742-1-git-send-email-david@xxxxxxxxxxxxx> <1331095828-28742-11-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.18 (2008-05-17)
On Wed, Mar 07, 2012 at 03:50:28PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> When we read inodes via bulkstat, we generally only read them once
> and then throw them away - they never get used again. If we retain
> them in cache, then it simply causes the working set of inodes and
> other cached items to be reclaimed just so the inode cache can grow.
> 
> Avoid this problem by marking inodes read by bulkstat as not to be
> cached and check this flag in .drop_inode to determine whether the
> inode should be added to the VFS LRU or not. If the inode lookup
> hits an already cached inode, then don't set the flag. If the inode
> lookup hits an inode marked with no cache flag, remove the flag and
> allow it to be cached once the current reference goes away.
> 
> Inodes marked as not cached will get cleaned up by the background
> inode reclaim or via memory pressure, so they will still generate
> some short term cache pressure. They will, however, be reclaimed
> much sooner and in preference to cache hot inodes.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

Looks good.

Reviewed-by: Ben Myers <bpm@xxxxxxx>
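
The flag lifecycle described in the commit message can be sketched as a
small user-space model. This is only an illustration under stated
assumptions, not the kernel code: the model_* names and struct model_inode
are hypothetical, the flag values simply mirror the ones in the diff, and
the generic_drop_inode() half of the real .drop_inode check is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the values mirror the ones added by the patch. */
#define MODEL_IGET_DONTCACHE	0x4		/* lookup flag, cf. XFS_IGET_DONTCACHE */
#define MODEL_IDONTCACHE	(1 << 9)	/* inode flag, cf. XFS_IDONTCACHE */

struct model_inode {
	unsigned int	i_flags;	/* per-inode state flags */
	bool		cached;		/* true once the inode is in the cache */
};

/* Lookup: a cache miss may tag the inode, a cache hit always clears the tag. */
static void model_iget(struct model_inode *ip, unsigned int flags)
{
	assert(ip);

	if (!ip->cached) {
		/* Cache miss: only a DONTCACHE lookup sets the flag. */
		ip->i_flags = 0;
		if (flags & MODEL_IGET_DONTCACHE)
			ip->i_flags |= MODEL_IDONTCACHE;
		ip->cached = true;
		return;
	}

	/* Cache hit: a second user showed up, so allow normal caching. */
	ip->i_flags &= ~MODEL_IDONTCACHE;
}

/* Final reference dropped: true means "evict now, skip the LRU". */
static bool model_drop_inode(const struct model_inode *ip)
{
	assert(ip);
	return (ip->i_flags & MODEL_IDONTCACHE) != 0;
}
```

In this model an inode that is only ever looked up with
MODEL_IGET_DONTCACHE is dropped immediately, while one that sees any
later cache hit goes back to being cached as usual.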

> ---
>  fs/xfs/xfs_iget.c   |    8 ++++++--
>  fs/xfs/xfs_inode.h  |    4 +++-
>  fs/xfs/xfs_itable.c |    3 ++-
>  fs/xfs/xfs_super.c  |   17 +++++++++++++++++
>  4 files changed, 28 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 93fc1dc..20ddb1e 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -290,7 +290,7 @@ xfs_iget_cache_hit(
>       if (lock_flags != 0)
>               xfs_ilock(ip, lock_flags);
>  
> -     xfs_iflags_clear(ip, XFS_ISTALE);
> +     xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
>       XFS_STATS_INC(xs_ig_found);
>  
>       return 0;
> @@ -315,6 +315,7 @@ xfs_iget_cache_miss(
>       struct xfs_inode        *ip;
>       int                     error;
>       xfs_agino_t             agino = XFS_INO_TO_AGINO(mp, ino);
> +     int                     iflags;
>  
>       ip = xfs_inode_alloc(mp, ino);
>       if (!ip)
> @@ -359,8 +360,11 @@ xfs_iget_cache_miss(
>        * memory barrier that ensures this detection works correctly at lookup
>        * time.
>        */
> +     iflags = XFS_INEW;
> +     if (flags & XFS_IGET_DONTCACHE)
> +             iflags |= XFS_IDONTCACHE;
>       ip->i_udquot = ip->i_gdquot = NULL;
> -     xfs_iflags_set(ip, XFS_INEW);
> +     xfs_iflags_set(ip, iflags);
>  
>       /* insert the new inode */
>       spin_lock(&pag->pag_ici_lock);
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index eda4937..096b887 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -374,10 +374,11 @@ xfs_set_projid(struct xfs_inode *ip,
>  #define XFS_IFLOCK           (1 << __XFS_IFLOCK_BIT)
>  #define __XFS_IPINNED_BIT    8        /* wakeup key for zero pin count */
>  #define XFS_IPINNED          (1 << __XFS_IPINNED_BIT)
> +#define XFS_IDONTCACHE               (1 << 9) /* don't cache the inode long term */
>  
>  /*
>   * Per-lifetime flags need to be reset when re-using a reclaimable inode during
> - * inode lookup. Thi prevents unintended behaviour on the new inode from
> + * inode lookup. This prevents unintended behaviour on the new inode from
>   * ocurring.
>   */
>  #define XFS_IRECLAIM_RESET_FLAGS     \
> @@ -544,6 +545,7 @@ do { \
>   */
>  #define XFS_IGET_CREATE              0x1
>  #define XFS_IGET_UNTRUSTED   0x2
> +#define XFS_IGET_DONTCACHE   0x4
>  
>  int          xfs_inotobp(struct xfs_mount *, struct xfs_trans *,
>                           xfs_ino_t, struct xfs_dinode **,
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index 751e94f..b832c58 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -76,7 +76,8 @@ xfs_bulkstat_one_int(
>               return XFS_ERROR(ENOMEM);
>  
>       error = xfs_iget(mp, NULL, ino,
> -                      XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip);
> +                      (XFS_IGET_DONTCACHE | XFS_IGET_UNTRUSTED),
> +                      XFS_ILOCK_SHARED, &ip);
>       if (error) {
>               *stat = BULKSTAT_RV_NOTHING;
>               goto out_free;
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b1df512..c162765 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -953,6 +953,22 @@ xfs_fs_evict_inode(
>       xfs_inactive(ip);
>  }
>  
> +/*
> + * We do an unlocked check for XFS_IDONTCACHE here because we are already
> + * serialised against cache hits here via the inode->i_lock and igrab() in
> + * xfs_iget_cache_hit(). Hence a lookup that might clear this flag will not be
> + * racing with us, and it avoids needing to grab a spinlock here for every inode
> + * we drop the final reference on.
> + */

I'll try to put this in my own words, just in case it is mystifying for
anyone else.  ;)

In this case it is OK to check ip->i_flags without holding
ip->i_flags_lock because we have exclusion from xfs_iget_cache_hit
as follows:

The 'dropper' would have taken inode->i_lock when the inode's count went
to zero, and if the XFS_IDONTCACHE flag is set, dropper will return 1 to
iput_final, which will result in iput_final skipping the inode LRU and
setting I_FREEING immediately, before dropping inode->i_lock and evicting
the inode.

A 'cache hitter' must call igrab in order to get a reference on the
inode.  igrab takes the inode->i_lock, and if I_FREEING is set, it
returns NULL; xfs_iget_cache_hit then returns EAGAIN and the lookup is
restarted.

So... any 'cache hitter' who could possibly clear the XFS_IDONTCACHE
flag subsequent to 'dropper' checking it would always be unable to get a
reference due to I_FREEING having been set by the dropper.
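
The ordering above can be sketched as a sequential user-space model. It
is a rough illustration only: the sketch_* names, struct sketch_inode and
the boolean standing in for inode->i_lock are all hypothetical, and the
functions are heavily simplified versions of iput/iput_final, igrab and
xfs_iget_cache_hit rather than the real implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_I_FREEING	(1 << 0)	/* stands in for the VFS I_FREEING state */
#define SKETCH_IDONTCACHE	(1 << 9)	/* stands in for XFS_IDONTCACHE */

struct sketch_inode {
	bool		i_lock_held;	/* models inode->i_lock */
	unsigned int	i_state;	/* models the VFS inode state */
	unsigned int	i_flags;	/* models the XFS inode flags */
	int		i_count;	/* reference count */
};

/*
 * 'dropper': drop one reference. The flag check and the setting of
 * I_FREEING both happen while i_lock is held. (The real iput() only takes
 * i_lock when the count reaches zero; the model takes it unconditionally.)
 * Returns true if the inode is to be evicted immediately.
 */
static bool sketch_iput(struct sketch_inode *inode)
{
	bool evict = false;

	assert(!inode->i_lock_held);
	inode->i_lock_held = true;
	if (--inode->i_count == 0) {
		evict = (inode->i_flags & SKETCH_IDONTCACHE) != 0;
		if (evict)
			inode->i_state |= SKETCH_I_FREEING;
	}
	inode->i_lock_held = false;
	return evict;
}

/* igrab-like helper: refuse a new reference once I_FREEING is set. */
static struct sketch_inode *sketch_igrab(struct sketch_inode *inode)
{
	struct sketch_inode *ret = NULL;

	assert(!inode->i_lock_held);
	inode->i_lock_held = true;
	if (!(inode->i_state & SKETCH_I_FREEING)) {
		inode->i_count++;
		ret = inode;
	}
	inode->i_lock_held = false;
	return ret;
}

/* 'cache hitter': the flag is only cleared after a reference is obtained. */
static int sketch_cache_hit(struct sketch_inode *inode)
{
	if (!sketch_igrab(inode))
		return EAGAIN;	/* inode is being torn down, caller retries */
	inode->i_flags &= ~SKETCH_IDONTCACHE;
	return 0;
}
```

Once sketch_iput() has decided to evict, every later sketch_cache_hit()
fails with EAGAIN before it can touch the flag, which is the exclusion
the unlocked check relies on.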

I appreciate that you added the comment.

Regards,
        Ben

> +STATIC int
> +xfs_fs_drop_inode(
> +     struct inode            *inode)
> +{
> +     struct xfs_inode        *ip = XFS_I(inode);
> +
> +     return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> +}
> +
>  STATIC void
>  xfs_free_fsname(
>       struct xfs_mount        *mp)
> @@ -1431,6 +1447,7 @@ static const struct super_operations xfs_super_operations = {
>       .dirty_inode            = xfs_fs_dirty_inode,
>       .write_inode            = xfs_fs_write_inode,
>       .evict_inode            = xfs_fs_evict_inode,
> +     .drop_inode             = xfs_fs_drop_inode,
>       .put_super              = xfs_fs_put_super,
>       .sync_fs                = xfs_fs_sync_fs,
>       .freeze_fs              = xfs_fs_freeze,
> -- 
> 1.7.9
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
