
Re: ag selection

To: xfs@xxxxxxxxxxx
Subject: Re: ag selection
From: Bernd Schubert <bernd.schubert@xxxxxxxxxxxxxxxxxx>
Date: Mon, 11 Nov 2013 19:23:51 +0100
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20131111175550.GB16643@xxxxxxxxxxxxxxxxxx>
References: <l5r3tf$m0j$1@xxxxxxxxxxxxx> <20131111175313.GA16643@xxxxxxxxxxxxxxxxxx> <20131111175550.GB16643@xxxxxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.0
On 11/11/2013 06:55 PM, Carlos Maiolino wrote:
> On Mon, Nov 11, 2013 at 03:53:14PM -0200, Carlos Maiolino wrote:
>> On Mon, Nov 11, 2013 at 06:25:13PM +0100, Bernd Schubert wrote:
>>> Hi all,
>>> for streaming writes onto a raid6 the current round-robin ag
>>> selection does not seem to be optimal. Writing 4 files from 4
>>> threads into a single directory we get 900 MB/s, writing 4 files in
>>> 4 different directories we only get 700 MB/s (12 disks with hw
>>> megaraid-sas). The current round-robin scheme seems to be optimized
>>> for linear raid0? With small AGs one could also argue, that choosing
>>> AGs which are not far away from each other (in respect to the number
>>> of blocks) also adds more parallel disk access for small and medium
>>> sized files.
>>> Any objections against a patch to improve the AG selection?
>> I wouldn't say it is optimized specifically for raid 0 environments, but I
>> lack some knowledge on this choice. The main reason for the round-robin, IIRC,
>> was to avoid lock contention in a single AG, spreading different files along the
>> whole disk, and also making it possible to allocate them contiguously along the
>> disk.
> Lock contention in inodes and blocks B-Trees for example, improving parallelism
> in the filesystem, but of course this might not be the optimal behavior for all

Agreed, more locks help to avoid that.

> environments. That's why XFS has a long list of tuning mkfs/mount options :-)
>> But I'm not sure what kind of optimization you have in mind, and I believe
>> other engineers will also need some extra information about what optimization
>> you have in mind, what kind of tests you're doing (Direct I/O, buffered,
>> pre-allocation), etc. You'll also need to post filesystem configurations like
>> FS alignment (su, sw options), etc.

One of my colleagues benchmarked this on one of our fast systems and another 
colleague currently needs this system for other tests, so I don't have the 
exact parameters. However, it was for sure formatted with options like these:

mkfs.xfs -d su=256k,sw=10 -l version=2,su=256k -isize=512 /dev/sdX

and mounted with these options:

mount -onoatime,nodiratime,largeio,inode64,swalloc,allocsize=131072k,nobarrier /dev/sdX <mountpoint>
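
As a sanity check on those numbers, here is a minimal sketch of the stripe arithmetic, assuming the 12-disk RAID6 reserves two disks' worth of parity so that 10 disks carry data. The variable names are made up for illustration; only su, sw and allocsize are taken from the commands above.

```shell
#!/bin/sh
# Assumption: 12 disks in RAID6, i.e. 2 parity disks and 10 data disks.
disks=12
parity=2

su_kib=256                       # su=256k from the mkfs.xfs line
sw=$((disks - parity))           # number of data disks, should match sw=10
stripe_kib=$((su_kib * sw))      # full stripe width in KiB

allocsize_kib=131072             # allocsize=131072k from the mount line
full_stripes=$((allocsize_kib / stripe_kib))
remainder_kib=$((allocsize_kib % stripe_kib))

echo "sw=${sw} full_stripe=${stripe_kib}KiB"
echo "allocsize covers ${full_stripes} full stripes, remainder ${remainder_kib}KiB"
```

If the real array uses a different chunk size or disk count, only the first three assignments need to change.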

>> For different write patterns, you might also want to take a look at the
>> rotor_step procfs option, and some other options dedicated to streaming writes,
>> that might help you in this case.

Thanks, I didn't know about that knob, I'm going to look into it. 
According to the comments it's for inode32 only, but I need 
to read the xfs_alloc code first to see what it actually 

