On Tue, 2010-09-14 at 20:56 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> The buffer cache hash is showing typical hash scalability problems.
> In large scale testing the number of cached items grows far larger
> than the hash can efficiently handle. Hence we need to move to a
> self-scaling cache indexing mechanism.
>
> I have selected rbtrees for indexing because they can have O(log n)
> search scalability, and insert and remove cost is not excessive,
> even on large trees. Hence we should be able to cache large numbers
> of buffers without incurring the excessive cache miss search
> penalties that the hash is imposing on us.
>
> To ensure we still have parallel access to the cache, we need
> multiple trees. Rather than hashing the buffers by disk address to
> select a tree, it seems more sensible to separate trees by typical
> access patterns. Most operations use buffers from within a single AG
> at a time, so rather than searching lots of different lists,
> separate the buffer indexes out into per-AG rbtrees. This means that
> searches during metadata operation have a much higher chance of
> hitting cache resident nodes, and that updates of the tree are less
> likely to disturb trees being accessed on other CPUs doing
> independent operations.
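
For readers following along, the per-AG indexing idea described above
can be sketched roughly as below.  This is a userspace illustration
only, not code from the patch: the names (struct buf, struct perag,
ag_for_block, buf_find, buf_get) and the fixed-size AG mapping are
assumptions made for the example, and a plain unbalanced binary search
tree stands in for the kernel rbtree (<linux/rbtree.h>) to keep it
short.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cached buffer, keyed by its starting block number. */
struct buf {
	uint64_t	blkno;
	struct buf	*left, *right;
};

/*
 * Hypothetical per-AG structure: each allocation group owns its own
 * index.  A real implementation would pair the root with a per-AG lock
 * so that updates in one AG do not contend with lookups in another.
 */
struct perag {
	struct buf	*root;
};

/* Pick the AG that owns blkno, assuming fixed-size AGs of ag_blocks. */
static struct perag *ag_for_block(struct perag *ags, uint64_t ag_blocks,
				  uint64_t blkno)
{
	return &ags[blkno / ag_blocks];
}

/* Search only the tree belonging to this AG. */
static struct buf *buf_find(struct perag *pag, uint64_t blkno)
{
	struct buf *bp = pag->root;

	while (bp) {
		if (blkno < bp->blkno)
			bp = bp->left;
		else if (blkno > bp->blkno)
			bp = bp->right;
		else
			return bp;
	}
	return NULL;
}

/* Return the cached buffer for blkno, inserting a new one on a miss. */
static struct buf *buf_get(struct perag *pag, uint64_t blkno)
{
	struct buf **link = &pag->root;
	struct buf *bp;

	while (*link) {
		if (blkno < (*link)->blkno)
			link = &(*link)->left;
		else if (blkno > (*link)->blkno)
			link = &(*link)->right;
		else
			return *link;	/* cache hit */
	}

	bp = calloc(1, sizeof(*bp));
	if (!bp)
		return NULL;
	bp->blkno = blkno;
	*link = bp;		/* attach at the slot where the search ended */
	return bp;
}
```

A lookup first selects the AG and then walks only that AG's tree, so
operations confined to one AG never touch the index of another.  The
unbalanced tree here does not give the O(log n) bound; that comes from
the rebalancing a red-black tree performs on insert and erase.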

I didn't review this time as carefully as I did
when you originally posted this.  Some parts from
the original are now in separate patches.  But this
looks good to me.

Reviewed-by: Alex Elder <aelder@xxxxxxx>

> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/linux-2.6/xfs_buf.c |  138 +++++++++++++++++++++----------------------
>  fs/xfs/linux-2.6/xfs_buf.h |    8 +--
>  fs/xfs/xfs_ag.h            |    4 +
>  fs/xfs/xfs_mount.c         |    2 +
>  4 files changed, 75 insertions(+), 77 deletions(-)

. . .

