On Sat, Oct 29, 2011 at 04:39:44AM +0800, Wu Fengguang wrote:
> [restore CC list]
> > > I'm trying to understand where the performance gain comes from.
> > >
> > > I noticed that in all cases, before/after patchset, nr_vmscan_write are
> > > all zero.
> > >
> > > nr_vmscan_immediate_reclaim is significantly reduced though:
> > That's a good thing, it means we burn less CPU time on skipping
> > through dirty pages on the LRU.
> > Until a certain priority level, the dirty pages encountered on the LRU
> > list are marked PageReclaim and put back on the list, this is the
> > nr_vmscan_immediate_reclaim number. And only below that priority, we
> > actually ask the FS to write them, which is nr_vmscan_write.
> Yes, it is.
> > I suspect this is where the performance improvement comes from: we
> > find clean pages for reclaim much faster.
> That explains how it could reduce CPU overheads. However the dd's are
> throttled anyway, so I still don't understand how the speedup of dd page
> allocations improve the _IO_ performance.

They are throttled in balance_dirty_pages() when there are too many
dirty pages. But they are also 'throttled' in direct reclaim when
there are too many clean + dirty pages. Wild guess: speeding up
direct reclaim allows dirty pages to be generated faster and the
writer can better saturate the BDI?

Not all filesystems ignore all VM writepage requests, either. xfs,
for example, ignores only those from direct reclaim but honors
requests from kswapd. ext4 honors writepage whenever it pleases. On
those, I can imagine the reduced writepage interference helping. But
that cannot be the only reason, as btrfs ignores writepage from
reclaim in general and still sees improvement.
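
To make the split between the two counters concrete, here is a toy
model in C of the behaviour described in the quoted text above. It is
a sketch only, not the kernel's reclaim code: struct toy_page, struct
counters, scan_list() and the WRITE_BELOW_PRIORITY cutoff are names
and values made up for illustration. The real cutoff and conditions
depend on the kernel version and on whether kswapd or direct reclaim
is doing the scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model, NOT kernel code. Priority counts down as reclaim gets
 * more desperate; the cutoff below is a hypothetical value chosen
 * only for this example.
 */
#define WRITE_BELOW_PRIORITY 10

struct toy_page {
	bool dirty;
};

struct counters {
	unsigned long nr_reclaimed;                /* clean pages freed */
	unsigned long nr_vmscan_immediate_reclaim; /* dirty, tagged, put back */
	unsigned long nr_vmscan_write;             /* dirty, FS asked to write */
};

/* Walk a list of pages once at the given scan priority. */
void scan_list(const struct toy_page *pages, size_t n, int priority,
	       struct counters *c)
{
	assert(pages != NULL && c != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].dirty) {
			/* Clean page: reclaimable right away. */
			c->nr_reclaimed++;
			continue;
		}

		if (priority >= WRITE_BELOW_PRIORITY) {
			/*
			 * Not desperate yet: mark the page for reclaim
			 * once writeback completes and put it back.
			 */
			c->nr_vmscan_immediate_reclaim++;
		} else {
			/* Below the cutoff: ask the FS to write it. */
			c->nr_vmscan_write++;
		}
	}
}
```

Scanning a list at a high priority value only bumps
nr_vmscan_immediate_reclaim for its dirty pages; scanning the same
list below the cutoff bumps nr_vmscan_write instead. With all-zero
nr_vmscan_write, as in the numbers above, every dirty page seen went
down the first path.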