On 11/11/2013 06:55 PM, Carlos Maiolino wrote:
> On Mon, Nov 11, 2013 at 03:53:14PM -0200, Carlos Maiolino wrote:
>> On Mon, Nov 11, 2013 at 06:25:13PM +0100, Bernd Schubert wrote:
>>> Hi all,
>>>
>>> for streaming writes onto a raid6 the current round-robin AG
>>> selection does not seem to be optimal. Writing 4 files from 4
>>> threads into a single directory we get 900 MB/s, writing 4 files in
>>> 4 different directories we only get 700 MB/s (12 disks with hw
>>> megaraid-sas). The current round-robin scheme seems to be optimized
>>> for linear raid0? With small AGs one could also argue that choosing
>>> AGs which are not far away from each other (with respect to the number
>>> of blocks) also adds more parallel disk access for small and medium
>>> sized files.
>>>
>>> Any objections against a patch to improve the AG selection?
>>>
>>
>> I wouldn't say it is optimized specifically for raid 0 environments, but I
>> lack some knowledge on this choice. The main reason for the round-robin,
>> IIRC, was to avoid lock contention in a single AG, spreading different files
>> across the whole disk, and also making it possible to allocate them
>> contiguously on the disk.
>>
> Lock contention in the inode and block B-trees, for example, improving
> parallelism in the filesystem, but of course this might not be the optimal
> behavior for all environments. That's why XFS has a long list of mkfs/mount
> tuning options :-)
>
Agreed, more locks help to avoid that.
>> But I'm not sure what kind of optimization you have in mind, and I believe
>> other engineers will also need some extra information about what
>> optimization you have in mind, what kind of tests you're doing (direct I/O,
>> buffered, pre-allocation), etc. You'll also need to post filesystem
>> configuration details like FS alignment (su, sw options), etc.
One of my colleagues benchmarked this on one of our fast systems and another
colleague currently needs this system for other tests, so I don't have the
exact parameters. However, it was for sure formatted with options like these:
mkfs.xfs -d su=256k,sw=10 -l version=2,su=256k -isize=512 /dev/sdX
and mounted with these options:
mount -onoatime,nodiratime,largeio,inode64,swalloc,allocsize=131072k,nobarrier /dev/sdX <mountpoint>
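As a rough sanity check of that geometry, here is a minimal sketch of the arithmetic. It assumes the 12-disk RAID6 leaves 10 data disks (which is what sw=10 suggests); the helper names are made up for illustration and are not part of any XFS tool.

```python
def parse_size_kib(text):
    """Parse a size such as '256k' or '128m' into KiB (k/m/g suffixes only)."""
    units = {"k": 1, "m": 1024, "g": 1024 * 1024}
    text = text.strip().lower()
    return int(text[:-1]) * units[text[-1]]


def stripe_geometry(su, sw, total_disks, parity_disks=2):
    """Derive the full-stripe size from the stripe unit (su) and stripe width (sw).

    parity_disks=2 is the RAID6 assumption; total_disks comes from the
    '12 disks' mentioned in the original report.
    """
    data_disks = total_disks - parity_disks
    return {
        "data_disks": data_disks,
        "stripe_width_kib": parse_size_kib(su) * sw,
        "sw_matches_data_disks": sw == data_disks,
    }


geom = stripe_geometry("256k", 10, total_disks=12)
alloc_kib = parse_size_kib("131072k")  # the allocsize mount option

print(geom)
print("allocsize is a whole number of full stripes:",
      alloc_kib % geom["stripe_width_kib"] == 0)
```

With these values the full stripe is 2560 KiB, and the 128 MiB allocsize is not a whole multiple of it. Whether that matters for the numbers above is not something this sketch can tell.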
>>
>> For different write patterns, you might also want to take a look at the
>> rotor_step procfs option, and some other options dedicated to streaming
>> writes, that might help you in this case.
Thanks, I didn't know about that knob, I'm going to look into it.
According to the comments it's for inode32 only, but I need
to read the xfs_alloc code first to see what it actually
does.
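For reference, a minimal sketch of how the current value of that knob could be inspected. It assumes the knob is exposed as the sysctl fs.xfs.rotorstep under /proc/sys/fs/xfs/rotorstep, as described in the kernel's XFS documentation; the function name is hypothetical.

```python
from pathlib import Path

# Assumed procfs location of the fs.xfs.rotorstep sysctl.
ROTORSTEP = Path("/proc/sys/fs/xfs/rotorstep")


def read_rotorstep(path=ROTORSTEP):
    """Return the current rotorstep value as an int, or None if it cannot be read.

    None covers a missing file (XFS module not loaded, non-Linux system)
    as well as unexpected, non-numeric content.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


print("fs.xfs.rotorstep =", read_rotorstep())
```

Changing the value requires root and is a system-wide setting, so this sketch only reads it.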
Thanks,
Bernd