[Top] [All Lists]

Re: [PATCH 4/4] xfs: convert buffer cache hash to rbtree

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 4/4] xfs: convert buffer cache hash to rbtree
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Wed, 8 Sep 2010 21:51:50 -0400
Cc: xfs@xxxxxxxxxxx
In-reply-to: <1283958778-28610-5-git-send-email-david@xxxxxxxxxxxxx>
References: <1283958778-28610-1-git-send-email-david@xxxxxxxxxxxxx> <1283958778-28610-5-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-08-17)
On Thu, Sep 09, 2010 at 01:12:58AM +1000, Dave Chinner wrote:
> I have selected rbtrees for indexing becuse they can have O(log n)
> search scalability, and insert and remove cost is not excessive,
> even on large trees. Hence we should be able to cache large numbers
> of buffers without incurring the excessive cache miss search
> penalties that the hash is imposing on us.

Once thing that worries me about the rbtrees is that the Linux
implementation doesn't allow for lockless readers.  But in the end the
buffer cache implementation is very well encapsulated, so if the need
arises we could easily change the underlying data structure.

> +     /*
> +      * The buftarg cache should never be used by external devices.
> +      * Ensure we catch any users with extreme prejudice.
> +      */
> +     btp->bt_mp = external ? NULL : mp;

I'd much prefer to always initialize this field.  We currently have a
b_mount field struct xfs_buf which is used only in a few places
and initialized rather, ehmm, lazily.  If we could replace it with
->b_target->bt_mount we can shrink struct buf and make the information
available much more consistently.  Just adding the mount argument
to the buftarg and removing it from the buf would be a nice little
preparatory patch.

And yes, I think bt_mount would be much nicer name than bt_mp.

> @@ -210,8 +210,6 @@ xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno)
>       pag = radix_tree_lookup(&mp->m_perag_tree, agno);
>       if (pag) {
>               ASSERT(atomic_read(&pag->pag_ref) >= 0);
> -             /* catch leaks in the positive direction during testing */
> -             ASSERT(atomic_read(&pag->pag_ref) < 1000);

Di you manage to hit this during testing?  Either way it should probably
be a separate patch.

<Prev in Thread] Current Thread [Next in Thread>