On Tue, May 27, 2014 at 08:26:53AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> Upon memory pressure, kswapd calls xfs_vm_writepage() from
> shrink_page_list(). This can result in delayed allocation occurring
> and that gets deferred to the allocation workqueue.
> The allocation then runs outside kswapd context, which means if it
> needs memory (and it does to demand page metadata from disk) it can
> block in shrink_inactive_list() waiting for IO congestion. These
> blocking waits are normally avoided in kswapd context, so under
> memory pressure writeback from kswapd can be arbitrarily delayed by
> memory reclaim.
> To avoid this, pass the kswapd context to the allocation being done
> by the workqueue, so that memory reclaim understands correctly that
> the work is being done for kswapd and therefore it is not blocked
> and does not delay memory reclaim.
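
The approach described above can be modelled in userspace roughly as follows. This is only a sketch and not the patch itself: struct task, struct alloc_work, TASK_KSWAPD, queue_alloc() and run_alloc() are hypothetical names invented for illustration. It shows the submitter's kswapd state being captured when the work is queued, then applied to the worker for the duration of the deferred allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TASK_KSWAPD 0x1u	/* hypothetical stand-in for a per-task kswapd flag */

/* Hypothetical model of a task: only the flags word matters here. */
struct task {
	unsigned int flags;
};

/* Hypothetical work item handed from the submitter to the worker. */
struct alloc_work {
	bool for_kswapd;	/* captured from the submitter at queue time */
	bool reclaim_may_block;	/* what "reclaim" decided while the work ran */
};

/* Record whether the submitting task is kswapd before deferring the work. */
static void queue_alloc(const struct task *submitter, struct alloc_work *w)
{
	assert(submitter != NULL && w != NULL);
	w->for_kswapd = (submitter->flags & TASK_KSWAPD) != 0;
}

/*
 * Run the deferred work in the worker's context. The worker temporarily
 * takes on the submitter's kswapd state so that anything consulting the
 * task flags (modelled by reclaim_may_block) treats it like kswapd.
 */
static void run_alloc(struct task *worker, struct alloc_work *w)
{
	unsigned int saved = worker->flags;

	if (w->for_kswapd)
		worker->flags |= TASK_KSWAPD;

	/* Model: reclaim only blocks on congestion for non-kswapd tasks. */
	w->reclaim_may_block = !(worker->flags & TASK_KSWAPD);

	worker->flags = saved;	/* restore the worker's own state */
}
```

Saving and restoring the worker's flags around the work matters in this model because the same worker may later run items queued by tasks that are not kswapd.
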
> To avoid issues with int->char conversion of flag fields (as noticed
> in v1 of this patch) convert the flag fields in the struct
> xfs_bmalloca to bool types. pahole indicates these variables are
> still single byte variables, so no extra space is consumed by this change.
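
For anyone wondering why the type matters, here is a small standalone illustration of the int->char hazard. It assumes the problem arises when a masked flag word is stored directly into the field; FLAG_HIGH and both structs are hypothetical and are not taken from the patch. A bit above the low eight is lost when narrowed to char, whereas conversion to bool yields true for any non-zero value.

```c
#include <stdbool.h>

#define FLAG_HIGH 0x100	/* hypothetical flag bit that does not fit in 8 bits */

struct args_char {
	char flag;	/* narrowing store: high bits are dropped */
};

struct args_bool {
	bool flag;	/* any non-zero value converts to true */
};

/* Returns non-zero if the flag survived being stored in a char field. */
static int flag_set_via_char(int flags)
{
	struct args_char a;

	/*
	 * Out-of-range conversion to a signed char is implementation-defined
	 * in C; common compilers keep the low 8 bits, so 0x100 becomes 0.
	 */
	a.flag = (char)(flags & FLAG_HIGH);
	return a.flag != 0;
}

/* Returns non-zero if the flag survived being stored in a bool field. */
static int flag_set_via_bool(int flags)
{
	struct args_bool a;

	a.flag = (flags & FLAG_HIGH) != 0;
	return a.flag;
}
```

With FLAG_HIGH set in the input, the char variant reports the flag as clear on common compilers while the bool variant reports it as set. On common ABIs sizeof(bool) is 1, which is consistent with the pahole observation quoted above.
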
Reviewed-by: Christoph Hellwig <hch@xxxxxx>