Re: swidth in RAID

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: swidth in RAID
From: aurfalien <aurfalien@xxxxxxxxx>
Date: Sun, 30 Jun 2013 18:54:31 -0700
Cc: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
On Jun 30, 2013, at 6:38 PM, Dave Chinner wrote:

> On Sun, Jun 30, 2013 at 04:42:06PM -0500, Stan Hoeppner wrote:
>> On 6/30/2013 1:43 PM, aurfalien wrote:
>> 
>>> I understand swidth should = #data disks.
>> 
>> No.  "swidth" is a value specifying the number of 512 byte blocks
>> in the data stripe.
>> 
>> "sw" is #data disks.
>> 
>>> And the docs say for RAID 6 of 8 disks, that means 6.
>>> 
>>> But parity is distributed and you actually have 8 disks/spindles working 
>>> for you and a bit of parity on each.
>>> 
>>> So shouldn't swidth equal disks in raid when it's concerning distributed
>>> parity raid?
>> 
>> No.  Let's try visual aids.
>> 
>> Set 8 coffee cups (disk drives) on a table.  Grab a bag of m&m's.
>> Separate 24 blues (data) and 8 reds (parity).
>> 
>> Drop a blue m&m in cups 1-6 and a red into 7-8.  You just wrote one RAID
>> stripe.  Now drop a blue into cups 3-8 and a red in 1-2.  Your second
>> write, this time rotating two cups (drives) to the right.  Now drop
>> blues into 5-2 and reds into 3-4.  You've written your third stripe,
>> rotating by two cups (disks) again.
>> 
>> This is pretty much how RAID6 works.  Each time we wrote we dropped 8
>> m&m's into 8 cups, 6 blue (data chunks) and 2 red (parity chunks).
>> Every RAID stripe you write will be constructed of 6 blues and 2 reds.
> 
> Right, that's how they are constructed, but not all RAID distributes
> parity across different disks in the array. Some are symmetric, some
> are asymmetric, some rotate right, some rotate left, and some use
> statistical algorithms to give an overall distribution without being
> able to predict where a specific parity block might lie within a
> stripe...
> 
> And at the other end of the scale, isochronous RAID arrays tend to
> have dedicated parity disks so that data read and write behaviour is
> deterministic and therefore predictable from a high level....
> 
> So, assuming that a RAID5/6 device has a specific data layout (be it
> distributed or fixed) at the filesystem level is just a bad idea. We
> simply don't know. Even if we did, the only thing we can optimise is
> the thing that is common between all RAID5/6 devices - writing full
> stripe widths is the most optimal method of writing to them....

Am I interpreting this correctly?

Is 16 disks sw=16 regardless of parity, since the one thing common to all
of these arrays is the number of disks?  Or should I assume one parity disk
as the lowest common denominator, which would mean sw=15?
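
To make the two candidate answers concrete, here is a rough sketch of the
arithmetic as I understand it from Stan's definitions quoted above (sw is
the number of data disks, swidth is counted in 512 byte blocks).  The
64 KiB chunk size and the function name stripe_geometry are my own
assumptions for illustration, not values from any real array in this thread.

```python
SECTOR = 512  # sunit/swidth are counted in 512-byte blocks


def stripe_geometry(total_disks, parity_disks, chunk_bytes):
    """Return (sw, sunit, swidth) for a parity RAID set.

    sw     = number of data disks (total minus parity)
    sunit  = one chunk, in 512-byte blocks
    swidth = one full data stripe, in 512-byte blocks
    """
    if not 0 <= parity_disks < total_disks:
        raise ValueError("need at least one data disk")
    if chunk_bytes <= 0 or chunk_bytes % SECTOR:
        raise ValueError("chunk size must be a positive multiple of 512")
    sw = total_disks - parity_disks
    sunit = chunk_bytes // SECTOR
    swidth = sunit * sw
    return sw, sunit, swidth


# Hypothetical 16-disk array with an assumed 64 KiB chunk.
for label, parity in (("RAID5", 1), ("RAID6", 2)):
    sw, sunit, swidth = stripe_geometry(16, parity, 64 * 1024)
    print(f"{label}: sw={sw} sunit={sunit} swidth={swidth}")
```

Under those assumptions the 16-disk case comes out as sw=15 for single
parity and sw=14 for double parity, never sw=16, because the parity chunks
in each stripe hold no file data no matter which disk they land on.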

Peter brought this up:

The main goal is trying to reduce the probability of
read-modify-write.

Which I think of as "don't oversubscribe".
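
A small sketch of how I picture that, in a deliberately simplified model:
a write only avoids read-modify-write for the part that covers whole data
stripes, and anything left over at the head or tail touches a partial
stripe.  The function name split_write, the 64 KiB chunk and the 6 data
disks are assumptions of mine, and a real controller with a write cache
may well coalesce partial writes, so treat this as an illustration only.

```python
def split_write(offset, length, chunk_bytes, data_disks):
    """Split a write into (full_stripe_bytes, partial_stripe_bytes).

    Offsets and lengths are in bytes, measured from the start of the
    array's data space. Only bytes that cover whole data stripes count
    as full-stripe; the rest would need a parity read-modify-write in
    this simplified model.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    stripe = chunk_bytes * data_disks
    end = offset + length
    first_boundary = -(-offset // stripe) * stripe  # round offset up
    last_boundary = (end // stripe) * stripe        # round end down
    full = max(0, last_boundary - first_boundary)
    return full, length - full


# Assumed geometry: 8-disk RAID6 (6 data disks), 64 KiB chunk.
CHUNK = 64 * 1024
DATA_DISKS = 6
STRIPE = CHUNK * DATA_DISKS

print(split_write(0, STRIPE, CHUNK, DATA_DISKS))      # aligned full stripe
print(split_write(0, CHUNK, CHUNK, DATA_DISKS))       # single chunk
print(split_write(CHUNK, STRIPE, CHUNK, DATA_DISKS))  # stripe-sized, misaligned
```

The third case is the one that bothers me: the write is exactly one stripe
long, but because it starts one chunk in, it straddles two stripes and
none of it is a full-stripe write.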

- aurf
