xfs hardware RAID alignment over linear lvm
Stewart Webb
stew at messeduphare.co.uk
Mon Sep 30 03:48:58 CDT 2013
Ok,
Thanks Stan
Much appreciated
On 28 September 2013 15:54, Stan Hoeppner <stan at hardwarefreak.com> wrote:
> On 9/27/2013 8:29 AM, Stewart Webb wrote:
> > Hi Stan,
> >
> > Apologies for not directly answering -
>
> No problem, sorry for the late reply.
>
> > I was aiming to fill gaps in my knowledge with answers I could not find
> > in the xfs.org wiki.
>
> Hopefully this is occurring. :)
>
> > My workload for the storage is mainly reads of single large files
> > (ranging from 20GB to 100GB each).
> > These reads are mainly linear (video playback, although not always, as
> > the end user may be jumping to different points in the video).
> > There are concurrent reads required, estimated at 2 to 8; any more
> > would be a bonus.
>
> This is the type of workload Dave described previously that should
> exhibit an increase in read performance if the files are written with
> alignment, especially with concurrent readers, which you describe as
> 2-8, maybe more.  How many more is largely dictated by whether you're
> aligned.  I.e. with alignment your odds of successfully serving
> additional readers are much greater.
>
> Thus, if you need to stitch arrays together with LVM concatenation,
> you'd definitely benefit from making the geometry of all arrays
> identical, and aligning the filesystem to that geometry. I.e. same
> number of disks, same RAID level, same RAID stripe unit (data per
> non-parity disk), and same stripe width (# of non-parity disks).
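>
> Purely as an illustrative sketch (the device names, VG/LV names, and the
> 12-drive/512k geometry below are assumptions, not taken from this
> thread), three identically-built RAID6 arrays could be concatenated and
> aligned like so:
>
>   pvcreate /dev/sda /dev/sdb /dev/sdc
>   vgcreate vg_media /dev/sda /dev/sdb /dev/sdc
>   # default allocation gives a linear (concatenated) LV
>   lvcreate -n lv_media -l 100%FREE vg_media
>   # 12 drives RAID6 -> 10 data disks, 512k strip per disk
>   mkfs.xfs -d su=512k,sw=10 /dev/vg_media/lv_media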
>
> > The challenge here is that the reads need to be "real-time"
> > operations, as a person is interacting with them, and each read
> > operation has to consistently deliver low latency and sustain speeds
> > of over 50Mb/s.
> >
> > Disk write speeds are not *as* important for me, as these files are
> > copied into place before they are required (in this case using rsync
> > or scp) and those operations do not need the same "real-time"
> > interaction.
> >
> >
> > On 27 September 2013 14:09, Stan Hoeppner <stan at hardwarefreak.com> wrote:
> >
> >> On 9/27/2013 7:23 AM, Stewart Webb wrote:
> >>>> Right, and it does so not only to improve write performance, but to
> >>>> also maximise sequential read performance of the data that is
> >>>> written, especially when multiple files are being read
> >>>> simultaneously and IO latency is important to keep low (e.g.
> >>>> realtime video ingest and playout).
> >>>
> >>> So does this mean that I should avoid having RAID devices with a
> >>> differing number of spindles (or non-parity disks) if I would like
> >>> to use linear LVM concatenation?  Or is there a best practice when
> >>> this is unavoidable?
> >>
> >> Above, Dave was correcting my oversight, not necessarily informing you,
> >> per se.  It seems clear from your follow-up question that you didn't
> >> really grasp what he was saying.  Let's back up a little bit.
> >>
> >> What you need to concentrate on right now is the following, which we
> >> stated previously in the thread but which you did not reply to:
> >>
> >>>>>> What really makes a difference as to whether alignment will be of
> >>>>>> benefit to you, and how often, is your workload.  So at this point,
> >>>>>> you need to describe the primary workload(s) of your systems we're
> >>>>>> discussing.
> >>>>>
> >>>>> Yup, my thoughts exactly...
> >>
> >> This means you need to describe in detail how you are writing your
> >> files, and how you are reading them back. I.e. what application are you
> >> using, what does it do, etc. You stated IIRC that your workload is 80%
> >> read. What types of files is it reading? Small, large? Is it reading
> >> multiple files in parallel? How are these files originally written
> >> before being read? Etc, etc.
> >>
> >> You may not understand why this is relevant, but it is the only thing
> >> that is relevant, at this point. Spindles, RAID level, alignment, no
> >> alignment...none of this matters if it doesn't match up with how your
> >> application(s) do their IO.
> >>
> >> Rule #1 of storage architecture: Always build your storage stack (i.e.
> >> disks, controller, driver, filesystem, etc) to fit the workload(s), not
> >> the other way around.
> >>
> >>>
> >>> On 27 September 2013 02:10, Stan Hoeppner <stan at hardwarefreak.com> wrote:
> >>>
> >>>> On 9/26/2013 4:58 PM, Dave Chinner wrote:
> >>>>> On Thu, Sep 26, 2013 at 04:22:30AM -0500, Stan Hoeppner wrote:
> >>>>>> On 9/26/2013 3:55 AM, Stewart Webb wrote:
> >>>>>>> Thanks for all this info Stan and Dave,
> >>>>>>>
> >>>>>>>> "Stripe size" is a synonym of XFS sw, which is su * #disks. This
> is
> >>>> the
> >>>>>>>> amount of data written across the full RAID stripe (excluding
> >> parity).
> >>>>>>>
> >>>>>>> The reason I stated "Stripe size" is because in this instance I
> >>>>>>> have 3ware RAID controllers, which refer to this value as "Stripe"
> >>>>>>> in their tw_cli software (god bless manufacturers renaming
> >>>>>>> everything).
> >>>>>>>
> >>>>>>> I do, however, have a follow-on question:
> >>>>>>> On other systems, I have similar hardware:
> >>>>>>> 3x RAID controllers
> >>>>>>> 1 of them has 10 disks as RAID 6 that I would like to add to a
> >>>>>>> logical volume
> >>>>>>> 2 of them have 12 disks as RAID 6 that I would like to add to the
> >>>>>>> same logical volume
> >>>>>>>
> >>>>>>> All have the same "Stripe" or "Strip Size" of 512 KB.
> >>>>>>>
> >>>>>>> So if I were going to make 3 separate XFS filesystems, I would do
> >>>>>>> the following:
> >>>>>>> mkfs.xfs -d su=512k,sw=8 /dev/sda
> >>>>>>> mkfs.xfs -d su=512k,sw=10 /dev/sdb
> >>>>>>> mkfs.xfs -d su=512k,sw=10 /dev/sdc
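> >>>>>>>
> >>>>>>> (For reference, and assuming sda is the 10-disk array and sdb/sdc
> >>>>>>> are the 12-disk arrays, RAID 6 loses two disks to parity, so the
> >>>>>>> full data stripe works out to:
> >>>>>>> 10 disks - 2 parity = 8 data disks:  512k * 8  = 4096k per stripe
> >>>>>>> 12 disks - 2 parity = 10 data disks: 512k * 10 = 5120k per stripe)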
> >>>>>>>
> >>>>>>> I assume, if I were going to bring them all into 1 logical volume,
> >>>>>>> it would be best to have the sw value set to a value that divides
> >>>>>>> evenly into both 8 and 10 - in this case, 2?
> >>>>>>
> >>>>>> No.  In this case you do NOT stripe align XFS to the storage, because
> >>>>>> it's impossible--the RAID stripes are dissimilar.  In this case you
> >>>>>> use the default 4KB write out, as if this is a single disk drive.
> >>>>>>
> >>>>>> As Dave stated, if you format a concatenated device with XFS and you
> >>>>>> desire to align XFS, then all constituent arrays must have the same
> >>>>>> geometry.
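> >>>>>>
> >>>>>> As an illustrative sketch only (the VG/LV name and mount point
> >>>>>> below are placeholders, not from this thread), formatting the
> >>>>>> mixed-geometry concatenated volume with the defaults is simply:
> >>>>>>
> >>>>>>   # no su/sw given; XFS falls back to its defaults, and the
> >>>>>>   # geometry summary mkfs prints should show sunit=0, swidth=0
> >>>>>>   mkfs.xfs /dev/vg_media/lv_mixed
> >>>>>>   # once mounted, xfs_info on the mount point reports the same
> >>>>>>   xfs_info /srv/media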
> >>>>>>
> >>>>>> Two things to be aware of here:
> >>>>>>
> >>>>>> 1.  With a decent hardware write caching RAID controller, having XFS
> >>>>>> aligned to the RAID geometry is a small optimization WRT overall
> >>>>>> write performance, because the controller is going to be doing the
> >>>>>> optimizing of final writeback to the drives.
> >>>>>>
> >>>>>> 2.  Alignment does not affect read performance.
> >>>>>
> >>>>> Ah, but it does...
> >>>>>
> >>>>>> 3. XFS only performs aligned writes during allocation.
> >>>>>
> >>>>> Right, and it does so not only to improve write performance, but to
> >>>>> also maximise sequential read performance of the data that is
> >>>>> written, especially when multiple files are being read
> >>>>> simultaneously and IO latency is important to keep low (e.g.
> >>>>> realtime video ingest and playout).
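> >>>>>
> >>>>> (A hedged aside: one way to sanity-check this on a live system, with
> >>>>> the mount point and file path below being placeholders, is to confirm
> >>>>> the filesystem's stripe geometry and then look at how a large file's
> >>>>> extents were actually laid out:
> >>>>>
> >>>>>   xfs_info /srv/media                  # shows the sunit/swidth in use
> >>>>>   xfs_bmap -v /srv/media/somefile.mxf  # extent map of one file
> >>>>>
> >>>>> Large, mostly contiguous extents are what keep those sequential reads
> >>>>> streaming.)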
> >>>>
> >>>> Absolutely correct, as Dave always is. As my workloads are mostly
> >>>> random, as are those of others I consult in other fora, I sometimes
> >>>> forget the [multi]streaming case. Which is not good, as many folks
> >>>> choose XFS specifically for [multi]streaming workloads. My remarks to
> >>>> this audience should always reflect that.  Apologies for my oversight
> >>>> on this occasion.
> >>>>
> >>>>>> What really makes a difference as to whether alignment will be of
> >>>>>> benefit to you, and how often, is your workload.  So at this point,
> >>>>>> you need to describe the primary workload(s) of your systems we're
> >>>>>> discussing.
> >>>>>
> >>>>> Yup, my thoughts exactly...
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Dave.
> >>>>>
> >>>>
> >>>> --
> >>>> Stan
>
>
--
Stewart Webb