On Tue, Aug 16, 2011 at 10:06:52PM +0800, Wu Fengguang wrote:
> I tend to agree with the whole patchset except for this one.
> The worry comes from the fact that there are always the very possible
> unevenly distribution of dirty pages throughout the LRU lists.
It is pages under writeback, not dirty pages, that determine whether
throttling is considered. The distinction is important. I agree with you
that if it were dirty pages, throttling would be considered too regularly.
> patch works on local information and may unnecessarily throttle page
> reclaim when running into small spans of dirty pages.
It's also calling wait_iff_congested(), not congestion_wait(), and that
check takes both BDI congestion and zone congestion into account:
	/*
	 * If there is no congestion, or heavy congestion is not being
	 * encountered in the current zone, yield if necessary instead
	 * of sleeping on the congestion queue
	 */
	if (atomic_read(&nr_bdi_congested[sync]) == 0 ||
So global information is being taken into account.
> One possible scheme of global throttling is to first tag the skipped
> page with PG_reclaim (as you already do). And to throttle page reclaim
> only when running into pages with both PG_dirty and PG_reclaim set,
It's PG_writeback that is looked at, not PG_dirty.
> which means we have cycled through the _whole_ LRU list (which is the
> global and adaptive feedback we want) and run into that dirty page for
> the second time.
This potentially results in more scanning from kswapd before it starts
throttling which could consume a lot of CPU. If pages under writeback
are reaching the end of the LRU, it's already the case that kswapd is
scanning faster than pages can be cleaned. Even then, it only really
throttles if the zone or a BDI is congested.
Taking that into consideration, do you still think there is a big
advantage to having writeback pages take another lap around the LRU
that justifies the expected increase in CPU usage?