To: Chris Murphy <lists@xxxxxxxxxxxxxxxxx>
Subject: Re: relationship of nested stripe sizes, was: Question regarding XFS on LVM over hardware RAID.
From: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Date: Mon, 03 Feb 2014 04:50:54 -0600
Cc: xfs <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <98961D3F-769D-44A9-98A8-FC7867893138@xxxxxxxxxxxxxxxxx>
References: <7A732267-B34F-4286-9B49-3AF8767C0B89@xxxxxxxxxxxxxxxxx> <52ED4143.6090303@xxxxxxxxxxxxxxxxx> <EDBD7355-F1EC-4773-9138-CA864CB2E84B@xxxxxxxxxxxxxxxxx> <52ED6AAF.6030703@xxxxxxxxxxxxxxxxx> <98961D3F-769D-44A9-98A8-FC7867893138@xxxxxxxxxxxxxxxxx>
Reply-to: stan@xxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
On 2/2/2014 12:09 PM, Chris Murphy wrote:
> On Feb 1, 2014, at 2:44 PM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
> wrote:
>> On 2/1/2014 2:55 PM, Chris Murphy wrote:
>>> On Feb 1, 2014, at 11:47 AM, Stan Hoeppner
>>> <stan@xxxxxxxxxxxxxxxxx> wrote:
>>>> On 1/31/2014 12:35 AM, Chris Murphy wrote:
>>>>> Hopefully this is an acceptable way to avoid thread jacking
>>>>> by renaming the subject…
>>>>> On Jan 30, 2014, at 10:58 PM, Stan Hoeppner 
>>>>> <stan@xxxxxxxxxxxxxxxxx> wrote:
>>>>>> RAID60 is a nested RAID level just like RAID10 and RAID50.
>>>>>> It is a stripe, or RAID0, across multiple primary array
>>>>>> types, RAID6 in this case.  The stripe width of each
>>>>>> 'inner' RAID6 becomes the stripe unit of the 'outer' RAID0
>>>>>> array:
>>>>>> RAID6 geometry   128KB * 12 = 1536KB
>>>>>> RAID0 geometry  1536KB *  3 = 4608KB
>>>>> My question is on this particular point. If this were
>>>>> hardware raid6, but I wanted to then stripe using md raid0,
>>>>> using the numbers above would I choose a raid0 chunk size of
>>>>> 1536KB? How critical is this value for, e.g. only large
>>>>> streaming read/write workloads? If it were smaller, say 256KB
>>>>> or even 32KB, would there be a significant performance
>>>>> consequence?
>>>> You say 'if it were smaller...256/32KB'.  What is "it" 
>>>> referencing?
>>> it = chunk size for md raid0.
>>> So chunk size 128KB * 12 disks, hardware raid6. Chunk size 32KB
>>> [1] striping the raid6's with md raid0.
>> Frankly, I don't know whether you're pulling my chain, or really
>> don't understand the concept of nested striping.  I'll assume the
>> latter.
> The former would be inappropriate, and the latter is more plausible
> anyway, so this is the better assumption.
>> When nesting stripes, the chunk size of the outer stripe is
>> -always- equal to the stripe width of each inner striped array, as
>> I clearly demonstrated earlier:
> Except when it's hardware raid6, and software raid0, and the user
> doesn't know they need to specify the chunk size in this manner. And
> instead they use the mdadm default. What you're saying makes complete
> sense, but I don't think this is widespread knowledge or well
> documented anywhere that regular end users would know this by and
> large.

This is not widespread knowledge and is not well documented.  And by
definition "regular end users" are not creating RAID60 arrays.  In the
not too distant future there will be a little more information available
about proper geometry setup for RAID60.
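
For reference, here's a minimal sketch of that setup using the geometry
from this thread.  The device names (/dev/sda, /dev/sdb, /dev/sdc for
the three hardware RAID6 LUNs, /dev/md0 for the outer array) are only
placeholders:

  # Outer RAID0 across the three hardware RAID6 LUNs.  The chunk must
  # equal the inner RAID6 stripe width: 128KB * 12 = 1536KB.
  mdadm --create /dev/md0 --level=0 --raid-devices=3 --chunk=1536 \
        /dev/sda /dev/sdb /dev/sdc

  # Align XFS to the outer array: su = outer chunk (1536KB),
  # sw = number of RAID6 members (3), i.e. a 4608KB stripe width.
  mkfs.xfs -d su=1536k,sw=3 /dev/md0

mkfs.xfs will normally pick the geometry up from md on its own, but
spelling it out makes the intent explicit.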

>> 3 RAID6 arrays 
>> RAID6  geometry       128KB * 12 = 1536KB
>> RAID60 geometry      1536KB *  3 = 4608KB
>> mdadm allows you enough rope to hang yourself in this situation
>> because it doesn't know the geometry of the underlying hardware
>> arrays, and has no code to do sanity checking even if it did.  Thus
>> it can't save you from yourself.
> That's right, and this is the exact scenario I'm suggesting.
> Depending on version, mdadm has two possible default chunk sizes,
> either 64KB or 512KB.
> How bad is the resulting performance hit? Would a 64KB chunk be
> equally bad as a 512KB chunk? Or is this only quantifiable with
> testing (i.e. it could be a negligible performance hit, or it could
> be huge)?

>> RAID HBA and SAN controller firmware simply won't allow this.
>> They configure the RAID60 chunk size automatically equal to the
>> RAID6 stripe width.  If some vendor's firmware allows one to
>> manually enter the RAID60 chunk size with a value different from
>> the RAID6 stripe width, stay away from that vendor.
> I understand that, but the scenario and question I'm posing is for
> multiple hardware raid6's striped with md raid0. The use case is
> enclosures with raid6 but not raid60, so the enclosures are striped
> using software raid. I'm trying to understand the consequence
> magnitude when choosing an md raid0 chunk size other than the correct
> one. Is this a 5% performance hit, or a 30% performance hit?

You must answer other questions before you can answer those above,
because this is entirely workload dependent, as everything always is.
It also depends on whether and how you aligned XFS.  The wrong outer
stripe chunk size may badly hurt performance, or it may not affect it
much at all.  It depends on what and how you're writing, and on the
demands of the workload.
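
If you want to put a number on it for your own case, a rough sketch
(the mount point and the fio job parameters below are only
placeholders): first confirm what alignment XFS actually ended up
with, then run the same streaming job against arrays built with each
candidate chunk size and compare.

  # sunit/swidth are reported in filesystem blocks; with 4KiB blocks,
  # su=1536k shows up as sunit=384 and swidth=1152.
  xfs_info /mnt/test

  # Large sequential write in full-stripe-sized chunks, O_DIRECT to
  # keep the page cache out of the numbers.
  fio --name=stream --directory=/mnt/test --rw=write --bs=4608k \
      --direct=1 --ioengine=libaio --iodepth=4 --size=16g

Whether the hit is 5% or 30% for your workload falls out of that
comparison, not out of the geometry alone.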

