
Re: Using xfs_growfs on SSD raid-10

To: stan@xxxxxxxxxxxxxxxxx
Subject: Re: Using xfs_growfs on SSD raid-10
From: Alexey Zilber <alexeyzilber@xxxxxxxxx>
Date: Thu, 10 Jan 2013 11:50:42 +0800
Cc: xfs@xxxxxxxxxxx
In-reply-to: <50EE33BC.8010403@xxxxxxxxxxxxxxxxx>
References: <CAGdvdE3VnYKg8OXFZ-0eALuhK=Qdt-Apj0uwrB8Yfs=4Uun3UA@xxxxxxxxxxxxxx> <50EE33BC.8010403@xxxxxxxxxxxxxxxxx>
Hi Stan,

  Please see in-line:

On Thu, Jan 10, 2013 at 11:21 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
On 1/9/2013 7:23 PM, Alexey Zilber wrote:
> Hi All,
>   I've read the FAQ on 'How to calculate the correct sunit,swidth values
> for optimal performance' when setting up xfs on a RAID.  Thing is, I'm
> using LVM, and with the colo company we use, the easiest thing I've found,
> when adding more space, is to have another RAID added to the system, then
> I'll just pvcreate, expand the vgroup over it, lvextend and xfs_growfs and
> I'm done.  That is probably sub-optimal on an SSD raid.
> Here's the example situation.  I start off with a 6 (400GB) raid-10.  It's
> got 1M stripe sizes.  So I would start with pvcreate --dataalignment 1M
> /dev/sdb
> after all the lvm stuff I would do: mkfs.xfs -L mysql -d su=1m,sw=3
> /dev/mapper/db-mysql
> (so the above reflects the 3 active drives, and 1m stripe. So far so good?)
> Now, I need more space. We have a second raid-10 added, that's 4 (400gb)
> drives. So I do the same pvcreate --dataalignment 1M /dev/sdc
> then vgextend and lvextend, and finally; with xfs_growfs, there's no way to
> specify, change su/sw values.  So how do I do this?  I'd rather not use
> mount options, but is that the only way, and would that work?
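
For reference, the sequence described above can be sketched as a dry run. This is only an illustration: each command is printed, not executed. The volume group name "db", the logical volume name "mysql", and the mount point /var/lib/mysql are assumptions; only /dev/sdb, /dev/sdc and /dev/mapper/db-mysql appear in the message itself.

```shell
#!/bin/sh
# Dry-run sketch of the create-then-grow sequence. Nothing is executed;
# run() only prints the command it is given.
# Assumed names (not in the original message): VG "db", LV "mysql",
# mount point /var/lib/mysql.
run() { echo "$*"; }

plan() {
    # Initial 6-drive RAID-10: 3 data spindles, 1M alignment
    run pvcreate --dataalignment 1M /dev/sdb
    run vgcreate db /dev/sdb
    run lvcreate -l 100%FREE -n mysql db
    run mkfs.xfs -L mysql -d su=1m,sw=3 /dev/mapper/db-mysql
    # Later: add the 4-drive RAID-10 and grow the filesystem.
    # xfs_growfs takes the mount point and has no su/sw option.
    run pvcreate --dataalignment 1M /dev/sdc
    run vgextend db /dev/sdc
    run lvextend -l +100%FREE /dev/db/mysql
    run xfs_growfs /var/lib/mysql
}

plan
```

Replacing the body of run() with "$@" would execute the commands for real, which should only be done after checking every device and volume name against the actual system.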

It's now impossible to align the second array.  You have a couple of

Only the sw=3 is no longer valid, correct?  There's no way to add sw=5?

1.  Mount with "noalign", but that only affects data, not journal writes

Is "noalign" the default when no sw/su option is specified at all?

2.  Forget LVM and use the old tried and true UNIX method of expansion:
 create a new XFS on the new array and simply mount it at a suitable
place in the tree.

Not a possible solution.  The space is for a database and must be contiguous.

3.  Add 2 SSDs to the new array and rebuild it as a 6 drive RAID10 to
match the current array.  This would be the obvious and preferred path,

How is this the obvious and preferred path when I still can't modify the sw value?  Same problem.  Data loss or reformatting is not the preferred path; it defeats the purpose of using LVM.  Also, the potential for data loss when enlarging the RAID array is huge.

assuming you actually mean 1MB STRIP above, not 1MB stripe.  If you

The stripe size is 1MB.

actually mean 1MB hardware RAID stripe, then the controller would have
most likely made a 768KB stripe with 256KB strip, as 1MB isn't divisible
by 3.  Thus you've told LVM to ship 1MB writes to a device expecting
256KB writes.  In that case you've already horribly misaligned your LVM
volumes to the hardware stripe, and everything is FUBAR already.  You
probably want to verify all of your strip/stripe configuration before
moving forward.
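
The 768KB and 256KB figures above follow from simple arithmetic, sketched below. The rounding rule (the strip is the largest power of two that still fits the requested stripe across the data spindles) is the assumption behind the quoted claim, not something verified against any particular controller.

```shell
#!/bin/sh
# Arithmetic behind the quoted 768KB stripe / 256KB strip figures.
# Assumption: the controller rounds the strip down to a power of two
# so that strip * data_spindles does not exceed the requested stripe.
requested_stripe_kb=1024   # "1MB stripe" as configured
data_spindles=3            # 6-drive RAID-10 has 3 data spindles

# Find the largest power-of-two strip that still fits
strip_kb=1
while [ $((strip_kb * 2 * data_spindles)) -le "$requested_stripe_kb" ]; do
    strip_kb=$((strip_kb * 2))
done
stripe_kb=$((strip_kb * data_spindles))

echo "strip=${strip_kb}KB stripe=${stripe_kb}KB"
# prints: strip=256KB stripe=768KB
```

If the 1MB value is instead the per-drive strip, the full stripe is 3MB and su=1m,sw=3 matches it, which is why the thread turns on which of the two the controller setting means.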

I don't believe you're correct here.  The SSD erase block size for the drives we're using is 1MB.  Why does being divisible by 3 matter?  Because of the number of drives?  Nowhere online have I seen anything about a 768KB stripe with a 256KB strip.  All the performance info I've seen points to 1MB being the fastest.  I'm sure that wouldn't be the case if the controller had to deal with two stripes.

So essentially, my take-away here is that xfs_growfs doesn't work properly when adding more logical RAID drives?  What kind of performance hit am I looking at if sw is wrong?  How about this: if I know that the maximum number of drives I can add is, say, 20 in a RAID-10, can I format with sw=10 (even though sw should be 3) in the eventual expectation of expanding it?  What would be the downside of doing that?
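
To make the sw=3 versus sw=10 question concrete, here is a small sketch that converts the su/sw values into the 512-byte units used by the sunit/swidth mount options. It only shows the geometry XFS would be told about; it says nothing about the resulting performance, and the helper name geometry is made up for illustration.

```shell
#!/bin/sh
# Convert mkfs.xfs su/sw values into the 512-byte-sector units used by
# the sunit/swidth mount options. The sw values are the ones discussed
# in this message; "geometry" is a hypothetical helper name.
su_bytes=$((1024 * 1024))      # su=1m
sunit=$((su_bytes / 512))      # stripe unit in 512-byte sectors

geometry() {
    # $1 = sw, the number of data spindles XFS is told about
    echo "sw=$1 sunit=$sunit swidth=$((sunit * $1)) full_stripe_kb=$((su_bytes * $1 / 1024))"
}

geometry 3    # matches the current 6-drive RAID-10
geometry 10   # the hypothetical 20-drive RAID-10
```

With sw=10 the filesystem would assume a 10MB full stripe while the existing array provides a 3MB one.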


