On 09/25/12 17:01, Dave Chinner wrote:
On Tue, Sep 25, 2012 at 10:14:16AM -0500, Mark Tinguely wrote:
As a bonus, consolidating the loops into one worker actually gives a slight
performance improvement.
Can you quantify it?
I was comparing the bonnie and iozone benchmark outputs. I will see
if someone can enlighten me on how to quantify those numbers.
Don't bother. Those are two of the worst offenders in the "useless
benchmarks for regression testing" category. Yeah, they *look* like
they give decent numbers, but I've wasted so much time looking at
results from these benchmarks only to find they do basic things wrong
and give numbers that vary simply because you've made a change that
increases or decreases the CPU cache footprint of a code path.
e.g. IOZone uses the same memory buffer as the source/destination of
all its IO, and does not touch the contents of it at all. Hence for
small IO, the buffer stays resident in the CPU caches and gives
unrealistically high throughput results. Worse is the fact that CPU
cache residency of the buffer can change according to the kernel
code path taken, so you can get massive changes in throughput just
by changing the layout of the code without changing any logic....
IOZone can be useful if you know exactly what you are doing and
are using it to test a specific code path with a specific set of
configurations. e.g. comparing ext3/4/xfs/btrfs on the same kernel
and storage is fine. However, the moment you start using it to
compare different kernels, it's a total crap shoot....
Does anyone have a good benchmark XFS should use to share performance
results? A number we can agree shows a series does not degrade the
filesystem?
lies, damn lies, statistics and then filesystem benchmarks?! :)
I guess I don't understand what you mean by "loop on
The problem I see above is this:
  thread 1                          worker 1                worker 2..max

  loops here calling
  xfs_bmapi_alloc()

  first loop it takes the lock
                                    <returns with AGF
                                     locked in transaction>
                                                            blocks on AGF lock

  one of the next times through
  the above loop it cannot get
  a worker. deadlock here.
  (this does not need a worker,
  and since it is in the same
  transaction all locks on the
  AGF buffer are recursive
  locks: no wait here)

  <deadlock as no more workers
   available>

I saved the xfs_bmalloca and xfs_alloc_arg when allocating a buffer
to verify the paths.