xfs
[Top] [All Lists]

Re: [PATCH] [RFC] xfs: stop using the page cache to back the buffer cach

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH] [RFC] xfs: stop using the page cache to back the buffer cache
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 13 Jan 2011 08:15:10 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20110112131604.GA28675@xxxxxxxxxxxxx>
References: <1294817201-18670-1-git-send-email-david@xxxxxxxxxxxxx> <20110112131604.GA28675@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Wed, Jan 12, 2011 at 08:16:05AM -0500, Christoph Hellwig wrote:
> > -   if (bp->b_pages != bp->b_page_array) {
> > +   if (bp->b_pages && bp->b_pages != bp->b_page_array) {
> 
> If bp->p_ages is NULL it's per defintion different from b_page_array.

True. I'll drop that mod.

> 
> >  STATIC int
> > -_xfs_buf_lookup_pages(
> > +xfs_buf_allocate_buffer(
> 
> This is a bit misnamed and rather confusing vs xfs_buf_allocate.
> 
> Maybe call it xfs_buf_allocate_memory?

Yes, seems lik a better name.
> 
> > +   /*
> > +    * for buffers that are contained within a single page, just allocate
> > +    * the memory from the heap - there's no need for the complexity of
> > +    * page arrays to keep allocation down to order 0.
> > +    */
> > +   if (bp->b_buffer_length <= PAGE_SIZE) {
> 
> I think this should be a <, not <= - for page sized buffers avoiding
> the slab overhead should be much more efficient.

*nod*

The patch has been sitting around for a couple of weeks, and when I
did a quick look at it before posting it I wondered if I should make
that exact change before posting it. I decided not to because I
didn't want to wait for retesting before posting...

> 
> > +           page_count = 1;
> > +   } else {
> > +           end = bp->b_file_offset + bp->b_buffer_length;
> > +           page_count = xfs_buf_btoc(end) -
> > +                                   xfs_buf_btoct(bp->b_file_offset);
> > +   }
> >  
> >     error = _xfs_buf_get_pages(bp, page_count, flags);
> >     if (unlikely(error))
> >             return error;
> > +
> > +   if (bp->b_page_count == 1) {
> 
> I'd just duplicate the _xfs_buf_get_pages inside the branches to make
> this a lot cleaner.  The page_count calculation in the else clause can
> never end up as 1 anyway.  Maybe even make it two different functions
> and let the only caller decide which one to use.

OK, I'll rework the logic here to clean up the flow.

> > +           unsigned long   pageaddr;
> > +
> > +           bp->b_addr = kmem_alloc(bp->b_buffer_length, xb_to_km(flags));
> > +           if (!bp->b_addr)
> > +                   return ENOMEM;
> > +
> > +           pageaddr = (unsigned long)bp->b_addr & PAGE_MASK;
> > +           ASSERT(((unsigned long)(bp->b_addr + bp->b_buffer_length - 1) & 
> > PAGE_MASK) == pageaddr);
> > +
> > +           bp->b_offset = (unsigned long)bp->b_addr - pageaddr;
> 
> This can just be
> 
>               bp->b_offset = offset_in_page(bp->b_addr);
> 
> > +           bp->b_pages[0] = virt_to_page((void *)pageaddr);
> 
> and this
> 
>               bp->b_pages[0] = virt_to_page(bp->b_addr);
> 
> with that the above assert can be changed to not need a local variable
> and imho be slightly cleaner:
> 
>               ASSERT(((unsigned long)bp->b_addr + bp->b_buffer_length - 1) &
>                       PAGE_MASK) ==
>                      (unsigned long)bp->b_addr & PAGE_MASK);

Good idea, it can definitely be made cleaner.

> although I'm not sure it's actually correct.  slub is a pretty
> aggressive in using high order pages and might give us back memory that
> goes across multiple pages.  Adding a loop here to assign multiple pages
> if needed seems safer.

Hmmm. I've been using SLUB and not seen any assert triggers. I put
the assert there because I was concerned about whether this could
occur, but the heap is backed by power-of-two sized slabs so I think
that slub doesn't trigger it.

I'll have a think about how to handle buffers that span multiple
pages cleanly - like you said a loop is probably the easiest way to
handle it.

> > +           bp->b_flags |= XBF_MAPPED|_XBF_KMEM;
> 
> Space around the | operator, please.
> 
> > +           bp->b_flags &= XBF_MAPPED|_XBF_KMEM|_XBF_PAGES;
> 
> here, too.

Will do. As you've noticed, the code is pretty rough still. :/

> Additional comments:
> 
>  - the _XBF_PAGE_CACHE and _XBF_PAGE_LOCKED flags are now unused and can
>    be removed.
>  - the comment describing xfs_buf_t around line 141 in xfs_buf.h is now
>    incorrect.  It's not overly useful so I'd suggest removing it
>    completely.
>  - the comments above the buffer locking functions in xfs_buf.c
>    (including the meta-comment) are now incorrect and should be
>    fixed/removed.
>  - the call to mark_page_accessed in xfs_buf_get is now superflous,
>    as it only has a meaning for pagecache pages.

I'll clean them up for the next version. Thanks for the initial
sanity check, Christoph.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>