On 06/15/2012 02:30 PM, Dave Chinner wrote:
On Fri, Jun 15, 2012 at 01:25:26PM +0200, Bernd Schubert wrote:
On 06/15/2012 02:16 AM, Dave Chinner wrote:
Oh, I just noticed you might be using CFQ (it's the default in
dmesg). Don't - CFQ is highly unsuited for hardware RAID - it's
heuristically tuned to work well on single SATA drives. Use deadline,
or preferably for hardware RAID, noop.
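
For reference, the scheduler in use can be read from
/sys/block/<dev>/queue/scheduler, where the active one is shown in
brackets (e.g. "noop deadline [cfq]"); writing a scheduler name to
that file switches it. A minimal sketch of picking the active name
out of such a line follows. The helper name active_scheduler is
hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the active scheduler name (the bracketed token) out of a line in
 * the format of /sys/block/<dev>/queue/scheduler, for example
 * "noop deadline [cfq]". Returns 0 on success, -1 if no bracketed name
 * is found or it does not fit into `out`. */
static int active_scheduler(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);

    const char *lb = strchr(line, '[');
    if (lb == NULL)
        return -1;

    const char *rb = strchr(lb + 1, ']');
    if (rb == NULL)
        return -1;

    size_t n = (size_t)(rb - (lb + 1));
    if (n == 0 || n >= outlen)
        return -1;

    memcpy(out, lb + 1, n);
    out[n] = '\0';
    return 0;
}
```

A caller would read the sysfs file into a buffer first and pass that
buffer in as `line`.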
I'm not sure if noop is really a good recommendation even with hw
raid, especially if the request queue size is high. This week I
did some benchmarks with a high rq write size (triggered with
sync_file_range(..., SYNC_FILE_RANGE_WRITE) ) and with noop
concurrent reads then stalled almost entirely.
With deadline the read/write balance was much better, although writes
were still preferred (with sync_file_range() and without). I
always thought deadline prefers reads and I hope I find some time
later on to investigate further what was going on.
The test was on a NetApp E5400 hw raid, so a rather high-end hw raid.
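
The write pattern mentioned above can be sketched roughly as follows:
write a chunk, then call sync_file_range() with SYNC_FILE_RANGE_WRITE
to start writeback for exactly that range without waiting for it to
complete. The helper name write_and_kick is hypothetical, and the
chunk size is left to the caller. This is an illustration under those
assumptions, not the benchmark code that was actually used.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes from `buf` at file offset `off`, then ask the kernel
 * to start writeback of that byte range. SYNC_FILE_RANGE_WRITE only
 * initiates the write-out; it does not wait for completion.
 * Returns 0 on success, -1 on error with errno set. */
static int write_and_kick(int fd, const char *buf, size_t len, off_t off)
{
    assert(buf != NULL);

    size_t done = 0;
    while (done < len) {
        ssize_t n = pwrite(fd, buf + done, len - done, off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing, retry */
            return -1;
        }
        done += (size_t)n;
    }

    /* Submit the dirty pages of this range for writeback right away. */
    return sync_file_range(fd, off, (off_t)len, SYNC_FILE_RANGE_WRITE);
}
```

Calling this in a loop with consecutive offsets keeps the amount of
dirty page cache per file small, since each chunk is submitted as
soon as it has been written.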
Sounds like a case of the IO scheduler queue and/or CTQ being too
Hmm, yes, probably. With a small request queue and the usage of
sync_file_range(..., SYNC_FILE_RANGE_WRITE) we only have a small page
cache buffer. And sync_file_range is required to get perfect IO sizes as
given by max_sectors_kb. Without sync_file_range, IOs have more or less
random sizes and are very rarely aligned to the raid-stripe-size (and
yes, mkfs.xfs options are correctly set). That is another issue I need
to find time to investigate.
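
To make "aligned to the raid-stripe-size" concrete, here is a small
sketch of the check one could apply to each IO seen in a trace. It
assumes an IO counts as aligned when both its start offset and its
length are whole multiples of the stripe size. The function name and
the 1 MiB stripe size in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if an IO starting at byte `offset` with length `len` begins on
 * a stripe boundary and covers a whole number of stripes, else 0.
 * All values are in bytes; `stripe_bytes` must be non-zero. */
static int is_full_stripe_io(uint64_t offset, uint64_t len,
                             uint64_t stripe_bytes)
{
    assert(stripe_bytes > 0);
    return len > 0 &&
           offset % stripe_bytes == 0 &&
           len % stripe_bytes == 0;
}
```

With a hypothetical 1 MiB stripe, a 1 MiB write at offset 0 passes,
while the same write at offset 4096 does not, even though its size
is right.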