To: Sonny Rao <sonny@xxxxxxxxxxx>
Subject: Re: Advice sought on how to lock multiple pages in ->prepare_write and ->writepage
From: Bryan Henderson <hbryan@xxxxxxxxxx>
Date: Mon, 31 Jan 2005 17:32:31 -0800
Cc: Anton Altaparmakov <aia21@xxxxxxxxx>, Andrew Morton <akpm@xxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, linux-fsdevel-owner@xxxxxxxxxxxxxxx, linux-xfs@xxxxxxxxxxx, viro@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
In-reply-to: <20050201001053.GB11044@kevlar.burdell.org>
Sender: linux-xfs-bounce@xxxxxxxxxxx

Thanks for the numbers, though there are enough variables here that it's 
hard to draw any firm conclusions.

When I've seen comparisons like these in the past, it has turned out to 
be one of two things:

1) The system with the smaller I/Os (I/O = unit seen by the device) spent 
more CPU time per megabyte in the code path that starts I/O, so that it 
started less I/O.  The small I/Os are a consequence of the lower 
throughput, not a cause.  You can often rule this out just by looking at 
CPU utilization.

2) The system with the smaller I/Os had a window tuning problem: it was 
waiting for previous I/O to complete before starting more, with queues 
not full, and thus starting less I/O.  Some devices, with good 
intentions, suck the Linux queue dry, one tiny I/O at a time, and then 
perform miserably processing those tiny I/Os.  Properly tuned, the device 
would buffer fewer I/Os, letting the queues build inside Linux and thus 
causing Linux to send larger I/Os.

People have written ugly queue-plugging algorithms to try to defeat this 
queue sucking by withholding I/O from a device willing to take it.  Others 
defeat it by withholding I/O from a willing Linux block layer, instead 
saving up I/O and submitting it in large bios.
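
To make that last approach concrete, here is a rough sketch (mine, not 
from this thread) of what saving up pages and submitting them as one 
large bio can look like, written against roughly the 2.6-era block 
layer.  The function name build_large_write() is invented for 
illustration, and bio field names and the submit_bio() signature have 
shifted between kernel versions, so treat this as a sketch rather than a 
drop-in implementation:

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/fs.h>
#include <linux/mm.h>

/* Illustrative only: gather already-prepared pages into a single large
 * bio instead of handing them to the block layer one page at a time. */
static void build_large_write(struct block_device *bdev, sector_t sector,
                              struct page **pages, int nr_pages)
{
        int max_vecs = nr_pages < BIO_MAX_PAGES ? nr_pages : BIO_MAX_PAGES;
        struct bio *bio = bio_alloc(GFP_NOIO, max_vecs);
        int i;

        bio->bi_bdev   = bdev;
        bio->bi_sector = sector;   /* bi_iter.bi_sector on later kernels */

        for (i = 0; i < nr_pages; i++) {
                /* bio_add_page() returns the number of bytes it added; it
                 * returns 0 once the bio is as large as the device and
                 * queue limits allow, at which point a real caller would
                 * submit this bio and start filling the next one. */
                if (bio_add_page(bio, pages[i], PAGE_SIZE, 0) != PAGE_SIZE)
                        break;
        }

        /* A real caller would also set bio->bi_end_io and bi_private so
         * it can see the completion and clean up; omitted here. */
        submit_bio(WRITE, bio);
}

The payoff is that one large request reaches the elevator, so the device 
sees big I/Os without depending on the request queue to merge hundreds 
of page-sized pieces back together.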

>Ext3 (writeback mode)
>
>Device:    rrqm/s   wrqm/s    r/s    w/s  rsec/s     wsec/s  rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
>sdc          0.00 21095.60  21.00 244.40  168.00  170723.20  84.00  85361.60    643.90     11.15  42.15   3.45  91.60
>
>We see 21k merges per second going on, and an average request size of 
>only 643 sectors where the device can handle up to 1MB (2048 sectors).
>
>Here is iostat from the same test w/ JFS instead:
>
>Device:    rrqm/s   wrqm/s    r/s    w/s  rsec/s     wsec/s  rkB/s      wkB/s  avgrq-sz  avgqu-sz    await  svctm  %util
>sdc          0.00  1110.58   0.00  97.80    0.00  201821.96   0.00  100910.98   2063.53    117.09  1054.11  10.21  99.84
>
>So, in this case I think it is making a difference: 1k merges and a big
>difference in throughput, though there could be other issues.
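
(A back-of-the-envelope reading of those numbers, assuming the usual 
512-byte sectors in iostat's avgrq-sz: ext3's 643.90-sector average 
request is only about 320KB, while JFS's 2063.53 sectors is right at the 
2048-sector (1MB) limit.  And ext3's 21095.60 wrqm/s against 244.40 w/s 
works out to roughly 86 merges per write that actually reaches the 
device, versus about 11 for JFS's 1110.58 wrqm/s over 97.80 w/s; so ext3 
is feeding the elevator much smaller pieces and leaning on merging to 
recover.)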

--
Bryan Henderson                          IBM Almaden Research Center
San Jose CA                              Filesystems

