
Re: Improving XFS file system inode performance

To: Jesse Stroik <jstroik@xxxxxxxxxxxxx>
Subject: Re: Improving XFS file system inode performance
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 23 Nov 2010 10:44:19 +1100
Cc: Emmanuel Florac <eflorac@xxxxxxxxxxxxxx>, Linux XFS <xfs@xxxxxxxxxxx>
In-reply-to: <4CEAEF66.7030708@xxxxxxxxxxxxx>
References: <4CEAE7D7.6050401@xxxxxxxxxxxxx> <20101122232528.21b78a9e@xxxxxxxxxxxxxx> <4CEAEF66.7030708@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Mon, Nov 22, 2010 at 04:32:06PM -0600, Jesse Stroik wrote:
> On 11/22/2010 04:25 PM, Emmanuel Florac wrote:
> >On Mon, 22 Nov 2010 15:59:51 -0600 you wrote:
> >
> >>Performance was fine before the file system was filled -- last week
> >>~8TB showed up and filled the 20TB file system.  Since, it has been
> >>performing poorly.
> >
> >Maybe it got fragmented? What does the fragmentation look like?
> I wasn't able to resolve this in reasonable time.  Part of the issue
> is that we're dealing with files within about 100k directories.
> I'll attempt to get the fragmentation numbers overnight.
> I suspect the regularly listed set of files on this fs exceeds the
> inode cache.  Where can I determine the cache misses and tune the
> file system?

Yup, that would be my guess, too.

You can use slabtop to find out how many inodes are cached and the
memory they use, and /proc/meminfo to determine the amount of memory
used by the page cache.
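
If you'd rather script it than eyeball slabtop, here is a minimal
sketch that pulls the same two numbers out of /proc/slabinfo and
/proc/meminfo. It assumes the XFS inode slab cache is named
"xfs_inode" and that the page cache is the "Cached" line in meminfo;
the function names are made up for illustration.

```python
def slab_usage(slabinfo_text, cache_name="xfs_inode"):
    """Return (num_objs, bytes_used) for one slab cache, or None if absent."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Data lines look like: name active_objs num_objs objsize ...
        if len(fields) >= 4 and fields[0] == cache_name:
            num_objs = int(fields[2])
            objsize = int(fields[3])
            return num_objs, num_objs * objsize
    return None


def meminfo_kb(meminfo_text, key="Cached"):
    """Return the value in kB for one /proc/meminfo key, or None if absent."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name == key:
            return int(rest.split()[0])
    return None


if __name__ == "__main__":
    # Reading /proc/slabinfo usually requires root.
    with open("/proc/slabinfo") as f:
        print("xfs_inode (objects, bytes):", slab_usage(f.read()))
    with open("/proc/meminfo") as f:
        print("page cache kB:", meminfo_kb(f.read()))
```

The byte count is objects times object size, so it ignores slab
overhead and will come out somewhat lower than what slabtop reports.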

For cache hits and misses, there's a statistics file,
/proc/fs/xfs/stat, that records inode cache lookups amongst other
things. Those stats are somewhat documented here:


and you want to look at the inode operation stats. This script:


makes it easy to view them, even though it doesn't handle many of
the more recent additions.
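
As a rough illustration of what to look for in that file, the sketch
below picks out the "ig" (inode get) line and computes a hit ratio.
The field order used here (attempts, found, frecycle, missed, dup,
reclaims, attrchg) is an assumption about the stats layout, as is the
/proc/fs/xfs/stat path; check both against your kernel before trusting
the numbers. The function names are hypothetical.

```python
# Assumed field order of the "ig" line; verify against your kernel.
IG_FIELDS = ("attempts", "found", "frecycle", "missed",
             "dup", "reclaims", "attrchg")


def inode_cache_stats(stat_text):
    """Return the "ig" counters as a dict, or None if the line is missing."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "ig":
            return dict(zip(IG_FIELDS, map(int, fields[1:])))
    return None


def hit_ratio(ig):
    """Fraction of lookups served from the inode cache, or None if no lookups."""
    lookups = ig["found"] + ig["missed"]
    return ig["found"] / lookups if lookups else None


if __name__ == "__main__":
    with open("/proc/fs/xfs/stat") as f:
        ig = inode_cache_stats(f.read())
    if ig is not None:
        print(ig, "hit ratio:", hit_ratio(ig))
```

The counters are cumulative since boot, so sample twice and compare
the deltas if you want the hit ratio for the current workload.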

As to tuning the size of the cache - it's pretty much a crap-shoot.
Firstly, you've got to have enough memory - XFS needs approximately
1-1.5GB RAM per million cached inodes (double that if you've got
lock debugging turned on).
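
To put numbers on that rule of thumb, a back-of-the-envelope
calculation along these lines may help. It is only the 1-1.5GB per
million figure above turned into arithmetic; the function name and the
example inode count are made up for illustration.

```python
def inode_cache_ram_gb(cached_inodes, lock_debugging=False):
    """Return a (low, high) RAM estimate in GB for a number of cached inodes."""
    millions = cached_inodes / 1_000_000
    # Lock debugging roughly doubles the per-inode footprint.
    factor = 2 if lock_debugging else 1
    return 1.0 * millions * factor, 1.5 * millions * factor


if __name__ == "__main__":
    # Hypothetical working set of 4 million inodes.
    low, high = inode_cache_ram_gb(4_000_000)
    print(f"roughly {low:.1f} to {high:.1f} GB")
```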

The amount of RAM actually used by the inode cache then depends on
memory pressure.  There's one knob that sometimes makes a difference
- it changes the balance between page cache and inode cache
reclamation: /proc/sys/vm/vfs_cache_pressure. From

        At the default value of vfs_cache_pressure=100 the kernel
        will attempt to reclaim dentries and inodes at a "fair" rate
        with respect to pagecache and swapcache reclaim.  Decreasing
        vfs_cache_pressure causes the kernel to prefer to retain
        dentry and inode caches. When vfs_cache_pressure=0, the
        kernel will never reclaim dentries and inodes due to memory
        pressure and this can easily lead to out-of-memory
        conditions. Increasing vfs_cache_pressure beyond 100 causes
        the kernel to prefer to reclaim dentries and inodes.

So you want to decrease vfs_cache_pressure to try to preserve the
inode cache rather than the page cache.
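
A minimal sketch of reading and lowering the knob from a script,
assuming it runs as root. The helper names and the value 50 are
purely illustrative, not a recommendation; the right number depends
on your workload. The setter refuses values below 1 because, as
quoted above, 0 can easily lead to out-of-memory conditions.

```python
VFS_CACHE_PRESSURE = "/proc/sys/vm/vfs_cache_pressure"


def get_cache_pressure(path=VFS_CACHE_PRESSURE):
    """Return the current vfs_cache_pressure value as an int."""
    with open(path) as f:
        return int(f.read().strip())


def set_cache_pressure(value, path=VFS_CACHE_PRESSURE):
    """Write a new vfs_cache_pressure value; rejects values below 1."""
    if value < 1:
        raise ValueError("vfs_cache_pressure below 1 risks out-of-memory")
    with open(path, "w") as f:
        f.write(f"{value}\n")


if __name__ == "__main__":
    print("before:", get_cache_pressure())
    set_cache_pressure(50)  # illustrative value only
    print("after:", get_cache_pressure())
```

A change made this way does not survive a reboot; it has to be
reapplied at boot (for example via sysctl configuration) to persist.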


Dave Chinner
