
Re: _pagebuf_lookup_pages() allocation flags

To: Marcelo Tosatti <marcelo@xxxxxxxxxxxxxxxx>
Subject: Re: _pagebuf_lookup_pages() allocation flags
From: Steve Lord <lord@xxxxxxx>
Date: Thu, 22 Feb 2001 09:49:36 -0600
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: Message from Marcelo Tosatti <marcelo@conectiva.com.br> of "Fri, 16 Feb 2001 18:11:17 -0200." <Pine.LNX.4.21.0102161755110.769-100000@freak.distro.conectiva>
Sender: owner-linux-xfs@xxxxxxxxxxx
> 
> Hi,
> 
> I noticed that _pagebuf_lookup_pages() may use two different allocation
> flags to allocate invalid pages depending on PBF_MAPPABLE flag:
> 
>         /* For pagebufs where we want to map an address, do not use
>          * highmem pages - so that we do not need to use kmap resources
>          * to access the data.
>          */
> 
>         if (flags & PBF_MAPPABLE) {
>                 gfp_mask = GFP_BUFFER;
>         } else {
>                 gfp_mask = GFP_HIGHUSER;
>         }
> 
> 
> My question is whether the caller may hold some fs lock only when it sets
> PBF_MAPPABLE? (That's why GFP_BUFFER was used, I suppose.)
> 
> If callers which do not set PBF_MAPPABLE may have locks which are used on
> the ->writepage() codepath, it may be a problem (deadlock).
> 
> I tried to track down the callers, but I want to read pagebuf code for now
> and the whole XFS code :)
> 
> 

The background here is that most metadata requests for a buffer in xfs use the
PBF_MAPPABLE flag. It is intended to mean that we will be manipulating the
buffer contents directly from kernel space, so we must have non-highmem pages.
There are actually a couple of cases where we also remap the pages to be
contiguous in kernel space - this would be nice to get rid of, and it is on my
todo list.

File data buffers are usually requested with the xfs iolock held on the
inode. This does not cause any problems for page_launder coming back into
the filesystem and asking us to remove other pages from the cache. Metadata
buffers are often requested with filesystem locks held: sometimes locks on
other metadata buffers, other times the xfs ilock on an inode. All of these
can potentially cause deadlocks if we get a request to flush a delalloc
page out to disk, as the conversion from delalloc to real disk space may
use the same locks.

We did originally use GFP_KERNEL, but deadlocks arose, which is why it
is GFP_BUFFER now. Other kernel changes may have removed this requirement
since then, but I doubt it.
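
To make the flag choice concrete, here is a minimal userspace sketch, not
the pagebuf code itself. The X_ bit values and the helper names
(pick_gfp_mask, may_reenter_fs, alloc_is_deadlock_safe) are made up for
illustration. The sketch assumes that a GFP_BUFFER allocation will not
start I/O back into the filesystem, while GFP_KERNEL and GFP_HIGHUSER
allocations may.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, for illustration only; the real
 * definitions live in the kernel headers and differ from these. */
#define X_GFP_WAIT     0x01u  /* allocator may sleep */
#define X_GFP_IO       0x02u  /* allocator may start I/O to free memory */
#define X_GFP_HIGHMEM  0x04u  /* highmem pages are acceptable */

#define X_GFP_BUFFER   (X_GFP_WAIT)
#define X_GFP_KERNEL   (X_GFP_WAIT | X_GFP_IO)
#define X_GFP_HIGHUSER (X_GFP_WAIT | X_GFP_IO | X_GFP_HIGHMEM)

#define X_PBF_MAPPABLE 0x01u  /* stand-in for PBF_MAPPABLE */

/* Mirrors the branch quoted from _pagebuf_lookup_pages(). */
unsigned int pick_gfp_mask(unsigned int pb_flags)
{
    return (pb_flags & X_PBF_MAPPABLE) ? X_GFP_BUFFER : X_GFP_HIGHUSER;
}

/* True if reclaim triggered by this allocation could call back into
 * the filesystem (for example through ->writepage()). */
bool may_reenter_fs(unsigned int gfp_mask)
{
    return (gfp_mask & X_GFP_IO) != 0;
}

/* A caller holding a lock that the writeback path also takes is only
 * safe if the allocation cannot re-enter the filesystem. */
bool alloc_is_deadlock_safe(unsigned int gfp_mask, bool holds_writeback_lock)
{
    assert((gfp_mask & X_GFP_WAIT) != 0); /* sketch covers sleeping allocations only */
    return !(holds_writeback_lock && may_reenter_fs(gfp_mask));
}
```

In this model a metadata caller holding the ilock is safe with the
GFP_BUFFER stand-in and unsafe with the GFP_KERNEL one, which matches
the deadlocks described above.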

The reason this works in Irix is that the memory reclaim path there will
recognize delalloc data and hand it off to xfs to convert and flush to disk;
it does not expect it to become clean immediately. Irix also recognizes
metadata objects which are not in a flushable state and skips them in this
path. Note also that Irix does these operations on buffers (which can be
variable size), not on individual pages of memory.
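
A rough sketch of that reclaim behaviour might look like the following.
This is not Irix code: the struct, the enum, and the function name
reclaim_scan are all invented for illustration, and the real path deals
with far more state.

```c
#include <stddef.h>

/* Hypothetical buffer states, invented for this sketch. */
enum buf_state {
    BUF_CLEAN,        /* can be freed immediately */
    BUF_DELALLOC,     /* data with no disk space allocated yet */
    BUF_META_PINNED   /* metadata that cannot be flushed right now */
};

struct buf {
    enum buf_state state;
    size_t         len;        /* buffers may be variable size */
    int            handed_off; /* set once passed to the filesystem */
};

/* Walk the buffers and return the number of bytes freed immediately.
 * Delalloc buffers are handed to the filesystem to convert and flush
 * later; unflushable metadata is skipped. */
size_t reclaim_scan(struct buf *bufs, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        switch (bufs[i].state) {
        case BUF_CLEAN:
            freed += bufs[i].len;
            break;
        case BUF_DELALLOC:
            bufs[i].handed_off = 1; /* not expected to become clean now */
            break;
        case BUF_META_PINNED:
            break;                  /* skip, try again on a later pass */
        }
    }
    return freed;
}
```

The point of the sketch is that the scan never blocks waiting for a
delalloc or pinned buffer to become clean, so it cannot get stuck behind
a lock the caller already holds.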

Steve