xfs
[Top] [All Lists]

Re: [PATCH] libxfs: stop caching inode structures

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH] libxfs: stop caching inode structures
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 15 Oct 2013 07:16:59 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20131009130241.GA8754@xxxxxxxxxxxxx>
References: <20131009130241.GA8754@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Oct 09, 2013 at 06:02:41AM -0700, Christoph Hellwig wrote:
> Currently libxfs has a cache for xfs_inode structures.  Unlike in kernelspace
> where the inode cache, and the associated page cache for file data is used
> for all filesystem operations the libxfs inode cache is only used in few
> places:
> 
>  - the libxfs init code reads the root and realtime inodes when called from
>    xfs_db using a special flag, but these inode structure are never referenced
>    again
>  - mkfs uses namespace and bmap routines that take the xfs_inode structure
>    to create the root and realtime inodes, as well as any additional files
>    specified in the proto file
>  - the xfs_db attr code uses xfs_inode-based attr routines in the attrset
>    and attrget commands
>  - phase6 of xfs_repair uses xfs_inode-based routines for rebuilding
>    directories and moving files to the lost+found directory.
>  - phase7 of xfs_repair uses struct xfs_inode to modify the nlink count
>    of inodes.
> 
> So except in repair we never ever reuse a cached inode, and even in repair
> the logical inode caching doesn't help:
> 
>  - in phase 6a we iterate over each inode in the incore inode tree,
>    and if it's a directory check/rebuild it
>  - phase6b then updates the "." and ".." entries for directories
>    that need, which means we require the backing buffers.
>  - phase6c moves disconnected inodes to lost_found, which again needs
>    the backing buffer to actually do anything.
>  - phase7 then only touches inodes for which we need to reset i_nlink,
>    which always involves reading, modifying and writing the physical
>    inode.
>    which always involves modifying the . and .. entries.
> 
> Given these facts stop caching the inodes to reduce memory usage
> especially in xfs_repair, where this makes a different for large inode
> count inodes.  On the upper end this allows repair to complete for
> filesystem / amount of memory combinations that previously wouldn't.

This all sounds good and the code looks fine, but there's one
lingering question I have - what's the impact on performance for
repair? Does it slow down phase 6/7 at all?

> With this we probably could increase the memory available to the buffer
> cache in xfs_repair, but trying to do so I got a bit lost - the current
> formula seems to magic to me to make any sense, and simply doubling the
> buffer cache size causes us to run out of memory given that the data cached
> in the buffer cache (typically lots of 8k inode buffers and few 4k other
> metadata buffers) are much bigger than the inodes cached in the inode
> cache.  We probably need a sizing scheme that takes the actual amount
> of memory allocated to the buffer cache into account to solve this better.

IIRC, the size of the buffer cache is currently set at 75% of RAM
so doubling it would cause OOM issues regardless of the presence of
the inode cache....

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>