On 11/25/2013 8:52 PM, Dave Chinner wrote:
> sunit/swidth is in filesystem blocks, not sectors. Hence
> sunit is 1MB, swidth = 2MB. While it's not quite correct
> (su=512k,sw=1m), it's not actually a problem...
Well that's what I thought as well, and I was puzzled by the 8 blocks
value for the log sunit. So I double checked before posting, and 'man
mkfs.xfs' told me:
    This is used to specify the stripe unit for a RAID device
    or a logical volume. The value has to be specified in
    512-byte block units.
So apparently the units of 'sunit' are different depending on which XFS
tool one is using. That's a bit confusing. And 'man xfs_info'
(xfs_growfs) doesn't tell us that sunit is given in filesystem blocks.
I'm using xfsprogs 3.1.4, so maybe the man pages have been corrected since.
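To make the unit difference concrete, here is a rough sketch of the arithmetic. It assumes a 4KB filesystem block size and uses the 512-byte unit from the mkfs.xfs man page; the helper names are made up for illustration and are not part of any XFS tool.

```python
# Sketch only: unit names and helpers are illustrative, not from xfsprogs.
SECTOR_BYTES = 512       # unit the mkfs.xfs man page gives for sunit
FS_BLOCK_BYTES = 4096    # assumed filesystem block size (4KB)


def to_bytes(value, unit_bytes):
    """Convert a count of units (sectors or fs blocks) to bytes."""
    return value * unit_bytes


def from_bytes(nbytes, unit_bytes):
    """Convert bytes to a whole number of units; reject partial units."""
    if nbytes % unit_bytes:
        raise ValueError("size is not a multiple of the unit")
    return nbytes // unit_bytes


# A log sunit reported as 8: read as fs blocks vs. misread as sectors.
log_sunit = 8
print(to_bytes(log_sunit, FS_BLOCK_BYTES) // 1024)  # 32 (KB, as fs blocks)
print(to_bytes(log_sunit, SECTOR_BYTES) // 1024)    # 4  (KB, as 512B sectors)

# A 1MB stripe unit expressed in each unit.
one_mb = 1024 * 1024
print(from_bytes(one_mb, FS_BLOCK_BYTES))  # 256 fs blocks
print(from_bytes(one_mb, SECTOR_BYTES))    # 2048 sectors
```

So the same number is off by a factor of 8 depending on which unit one assumes, which is exactly the 32k vs. 4k confusion with the log sunit.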
> Well, mkfs.xfs just uses what it gets from the kernel, so it
> might have been told the wrong thing by MD itself. However, you can
> modify sunit/swidth by mount options, so you can't directly trust
> what is reported from xfs_info to be what mkfs actually set
> Again, lsunit is in filesystem blocks, so it is 32k, not 4k. And
> yes, the default lsunit when the sunit > 256k is 32k. So, nothing
> wrong there, either.
So where should I have looked to confirm that the sunit reported by xfs_info
is in fs block (4KB) multiples, not the 512B multiples of mkfs.xfs?
> The usual: "iostat -x -d -m 5" output while the test is running.
> Also, you are using buffered IO, so changing it to use direct IO
> will tell us exactly what the disks are doing when IO is issued.
> blktrace is your friend here....
It'll be interesting to see where this troubleshooting leads. Buffered
single stream write speed is ~6x slower than read w/RAID10. That makes
me wonder if the controller and drive write caches have been disabled.
That could explain this.