
Re: Help with XFS in VMs on VMFS

To: xfs@xxxxxxxxxxx
Subject: Re: Help with XFS in VMs on VMFS
From: Jan Perci <jperci@xxxxxxxxx>
Date: Thu, 28 Mar 2013 23:30:01 -0400
Thank you for your responses.  Since this list is for XFS, I do not wish to stray too far off topic into VMs.  But I will provide more context.

A key factor is the need for >2TB file systems that can be snapshotted and reverted quickly.  We have other FC arrays attached to compute nodes without this requirement, and they run XFS directly on the FC logical volumes, which are made accessible to native nodes and to VM nodes via RDM.

Our FC arrays do not have native snapshot features, so we must use a software layer whether that is Linux LVM, ESXi, or something else.  And because of our unique usage patterns and constraints, we have settled on VMware over other virtualization technologies.  We are using ESXi (free version) but can upgrade to ESX if necessary.  However, the upgrade wouldn't fix the 2TB snapshot limit.

We are certainly not in the true HPC realm, but we do have about 20 physical compute nodes that do both random and sequential I/O.  An example query might identify a 10-500GB data set made up of 100-500KB files.  Some work sets are processor bound, with disk I/O accounting for less than 5%.  Others, however, spend about 50% on disk I/O, so improving performance would be helpful - again in the context of the snapshot requirement.
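To put the data set sizes above in terms of file counts, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 GB = 10^9 bytes) and uniform file sizes, neither of which is stated in the post; the function name is purely illustrative.

```python
# Rough file-count estimate for a 10-500GB data set of 100-500KB files.
# ASSUMPTIONS: decimal units and uniform file sizes (not stated in the post).
GB = 10**9
KB = 10**3


def file_count(dataset_bytes: int, file_bytes: int) -> int:
    """Number of files of one size needed to fill the data set, rounded up."""
    return -(-dataset_bytes // file_bytes)  # ceiling division


fewest = file_count(10 * GB, 500 * KB)   # smallest data set, largest files
most = file_count(500 * GB, 100 * KB)    # largest data set, smallest files

print(f"{fewest:,} to {most:,} files per query")
```

Under those assumptions a single query touches somewhere between tens of thousands and a few million files, so per-file overhead may matter about as much as raw streaming throughput.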

Point well understood about the risks of striping multiple 2TB VMDK files together.  But because of the constraints, it's either 2TB VMDKs or 2TB RDMs in virtual compatibility mode, and they both seem about equally risky.  Do you have better suggestions?

Back to XFS: in this context, is there any benefit in tuning parameters to get better performance, or will the poor performance of the VMDKs overshadow any gains, so that tuning isn't worthwhile?
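One tuning concern raised in the quoted discussion below is alignment. As a minimal sketch of what that means in practice: a volume is aligned to an underlying block or stripe unit when its starting byte offset is an exact multiple of that unit. The 1 MiB unit here follows the "something like 1MB" recollection in the quoted text and is unverified; the start sectors, the 64 KiB stripe unit, and the four data disks are hypothetical values chosen only for illustration.

```python
# Alignment check sketch. ALL numbers are illustrative assumptions, not
# measurements from the setup described in this thread.
KiB = 1024
MiB = 1024 * KiB
SECTOR_BYTES = 512  # assumed logical sector size


def is_aligned(start_sector: int, unit_bytes: int,
               sector_bytes: int = SECTOR_BYTES) -> bool:
    """True if the volume's starting byte offset is a multiple of unit_bytes."""
    return (start_sector * sector_bytes) % unit_bytes == 0


def stripe_width_bytes(stripe_unit_bytes: int, data_disks: int) -> int:
    """Size of one full stripe: the stripe unit times the number of data disks."""
    return stripe_unit_bytes * data_disks


# Hypothetical partition start sectors: a legacy DOS-style start and a 1 MiB start.
for start in (63, 2048):
    state = "aligned" if is_aligned(start, 1 * MiB) else "NOT aligned"
    print(f"start sector {start}: {state} to 1 MiB")

# Hypothetical array geometry: 64 KiB stripe unit across 4 data disks.
print(stripe_width_bytes(64 * KiB, 4) // KiB, "KiB per full stripe")
```

The stripe unit and data-disk count are the same two quantities that mkfs.xfs accepts as su and sw, so a check like this only tells you whether alignment is achievable at all; whether it helps on top of a VMDK is exactly the open question.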

Jan.


On Thu, Mar 28, 2013 at 8:56 PM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
On 3/28/2013 4:45 PM, Ralf Gross wrote:
> Stan Hoeppner wrote:

> Snapshots are possible with RDM in virtual compatibility mode, not
> physical mode (> 2 TB).

So 2TB is the kicker here.  I haven't used ESX since 3.x, and none of
our RDMs back then were close to 2TB.  IIRC our largest was 500GB.

>> VMFS volumes are not intended for high performance IO.  Unless things
>> have changed recently, VMware has always recommended housing only OS
>> images and the like in VMDKs, not user data.  They've always recommended
>> using RDMs for everything else.  IIRC VMDKs have a huge block (sector)
>> size, something like 1MB.  That's going to make XFS alignment difficult,
>> if not impossible.
>
> I can't remember ever finding this recommendation on a VMware
> page.
>
> http://blogs.vmware.com/vsphere/2013/01/vsphere-5-1-vmdk-versus-rdm.html

If you drill down through that you find this:
http://www.vmware.com/files/pdf/performance_char_vmfs_rdm.pdf

RDMs have better large sequential performance, and lower CPU burn than
VMDKs.  The OP mentioned "compute node" in his post, which suggests an
HPC application workload, which suggests large sequential IO.

Also note that VMware is Microsoft centric so they always run their
tests using an MS Server guest.  Also note they always test with tiny
volumes, in this case 20GB.  NTFS isn't going to have any trouble at
this size, but at say 20TB it probably will and these published results
would likely be quite different at that scale.  XFS performance
characteristics on a 2TB or 20TB or ?? TB volume will likely be
substantially different than NTFS.  Their tests show 5-8% lower CPU burn
for RDM vs VMDK.  Not a huge difference, but again they're testing only
20GB.

>> I cannot stress emphatically enough that you should not stitch 2TB VMDKs
>> together and use them in the manner you described.  This is a recipe for
>> disaster.  Find another solution.
>
> I'm seeing more and more requests for VMs with large disks lately in my
> env. Right now the max. is ~2 TB. I'm also thinking about where to go:
> more than 2 TB is only possible with pRDMs, which can't be snapshotted. You
> have to use the snapshot features of your storage array.

And more and more folks are using midrange FC/iSCSI arrays that don't
have snapshot features, others are using DAS with RAID HBAs, in both
cases forcing them to rely on ESX snapshots.  Sounds like VMware needs
to bump this artificial 2TB limit quite a bit higher.

--
Stan
