Help with XFS in VMs on VMFS

Jan Perci jperci at gmail.com
Thu Mar 28 22:30:01 CDT 2013


Thank you for your responses.  Since this list is for XFS, I do not wish to
go too far off topic into VMs.  But I will provide more context.

A key factor is the need for >2TB file systems that can be snapshotted and
reverted quickly.  We have other FC arrays attached to compute nodes
without this requirement, and they have XFS directly on the FC logical
volumes made accessible to native nodes and VM nodes via RDM.

Our FC arrays do not have native snapshot features, so we must use a
software layer whether that is Linux LVM, ESXi, or something else.  And
because of our unique usage patterns and constraints, we have settled on
VMware over other virtualization technologies.  We are using ESXi (free
version) but can upgrade to ESX if necessary.  However, the upgrade
wouldn't fix the 2TB snapshot limit.

We are certainly not in the true HPC realm, but we do have about 20
physical compute nodes that do both random and sequential I/O.  An example
query might identify a 10-500GB data set made up of 100-500KB files.
Some work sets are processor bound with disk I/O accounting for less than
5% of run time.  However, others spend about 50% of their time on disk I/O,
so improving
performance would be helpful - again in the context of the snapshot
requirement.
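
To put the example query above in terms of file counts, here is a rough
sketch of the arithmetic.  It assumes decimal units (1 KB = 1000 bytes) and
uniformly sized files, neither of which is exact for our data; the helper
name file_count is made up for illustration.

```python
# Rough file-count estimate for a 10-500GB data set of 100-500KB files.
# Assumption: decimal units and uniformly sized files.
KB = 1000
GB = 1000 ** 3


def file_count(dataset_bytes, file_bytes):
    """Return how many files of file_bytes fit in dataset_bytes (rounded down)."""
    return dataset_bytes // file_bytes


fewest = file_count(10 * GB, 500 * KB)   # smallest data set, largest files
most = file_count(500 * GB, 100 * KB)    # largest data set, smallest files
print(fewest, most)  # 20000 5000000
```

So a single query can touch anywhere from tens of thousands to a few
million small files.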

Point well understood about the risks of striping multiple 2TB VMDK files
together.  But because of the constraints, it's either 2TB VMDKs or 2TB
RDMs in virtual compatibility mode, and they both seem about equally
risky.  Do you have better suggestions?

Back to XFS: in this context, is there any benefit in tuning some
parameters to get better performance, or will any gain be so overshadowed
by the poor performance of the VMDKs that tuning isn't worthwhile?
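
One thing I can at least check on my side is whether a guest partition
starts on a boundary of the underlying block size.  The sketch below is only
illustrative: the 1MB unit is the figure recalled from memory in the quoted
reply below, not a value I have verified, and the 512-byte sector size and
the name is_aligned are my own assumptions.

```python
# Check whether a partition's starting sector falls on a block boundary.
# Assumptions: 512-byte sectors and a 1 MiB alignment unit (unverified).
SECTOR_BYTES = 512
MIB = 1024 * 1024


def is_aligned(start_sector, unit_bytes=MIB, sector_bytes=SECTOR_BYTES):
    """Return True if the partition's byte offset is a multiple of unit_bytes."""
    return (start_sector * sector_bytes) % unit_bytes == 0


print(is_aligned(2048))  # True: 2048 * 512 bytes is exactly 1 MiB
print(is_aligned(63))    # False: the legacy 63-sector offset is 32256 bytes
```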

Jan.


On Thu, Mar 28, 2013 at 8:56 PM, Stan Hoeppner <stan at hardwarefreak.com> wrote:

> On 3/28/2013 4:45 PM, Ralf Gross wrote:
> > Stan Hoeppner schrieb:
>
> > Snapshots are possible with RDM in virtual compatibility mode, not
> > physical mode (> 2 TB).
>
> So 2TB is the kicker here.  I haven't used ESX since 3.x, and none of
> our RDMs back then were close to 2TB.  IIRC our largest was 500GB.
>
> >> VMFS volumes are not intended for high performance IO.  Unless things
> >> have changed recently, VMware has always recommended housing only OS
> >> images and the like in VMDKs, not user data.  They've always recommended
> >> using RDMs for everything else.  IIRC VMDKs have a huge block (sector)
> >> size, something like 1MB.  That's going to make XFS alignment difficult,
> >> if not impossible.
> >
> > I can't remember that I've ever found this recommendation on a vmware
> > page.
> >
> > http://blogs.vmware.com/vsphere/2013/01/vsphere-5-1-vmdk-versus-rdm.html
>
> If you drill down through that you find this:
> http://www.vmware.com/files/pdf/performance_char_vmfs_rdm.pdf
>
> RDMs have better large sequential performance, and lower CPU burn than
> VMDKs.  The OP mentioned "compute node" in his post, which suggests an
> HPC application workload, which suggests large sequential IO.
>
> Also note that VMware is Microsoft centric so they always run their
> tests using an MS Server guest.  Also note they always test with tiny
> volumes, in this case 20GB.  NTFS isn't going to have any trouble at
> this size, but at say 20TB it probably will and these published results
> would likely be quite different at that scale.  XFS performance
> characteristics on a 2TB or 20TB or ?? TB volume will likely be
> substantially different than NTFS.  Their tests show 5-8% lower CPU burn
> for RDM vs VMDK.  Not a huge difference, but again they're testing only
> 20GB.
>
> >> I cannot stress emphatically enough that you should not stitch 2TB VMDKs
> >> together and use them in the manner you described.  This is a recipe for
> >> disaster.  Find another solution.
> >
> > I'm seeing more and more requests for VMs with large disks lately in my
> > env. Right now the max. is ~2 TB. I'm also thinking about where to go;
> > more than 2 TB is only possible with pRDMs, which can't be snapshotted. You
> > have to use the snapshot features of your storage array.
>
> And more and more folks are using midrange FC/iSCSI arrays that don't
> have snapshot features, others are using DAS with RAID HBAs, in both
> cases forcing them to rely on ESX snapshots.  Sounds like VMware needs
> to bump this artificial 2TB limit quite a bit higher.
>
> --
> Stan