
Re: XFS on top of LVM span in AWS. Stripe or are AG's good enough?

To: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Subject: Re: XFS on top of LVM span in AWS. Stripe or are AG's good enough?
From: Jeff Gibson <jgibson@xxxxxxxxxxxxxxx>
Date: Tue, 16 Aug 2016 17:05:11 +0000
In-reply-to: <20160816005931.GD19025@dastard>
References: <CY1PR04MB20100FE6C3039717BD27825AAC120@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>,<20160816005931.GD19025@dastard>
>On Mon, Aug 15, 2016 at 11:36:14PM +0000, Jeff Gibson wrote:
>> So I'm creating an LVM volume with 8 AWS EBS disks that are
>> spanned (linear) per Redhat's documentation for Gluster
>> (https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Deployment_Guide_for_Public_Cloud/ch02s03.html#Provisioning_Storage_for_Three-way_Replication_Volumes).
>> 
>> 2 questions-
>> 
>> 1.  Will XFS's Allocation Groups essentially stripe the data for
>> me
>
>No. XFS does not stripe data. It does, however, *distribute* data
>across different AGs according to locality policy (e.g. inode32 vs
>inode64), so it uses all the AGs as the directory structure grows.
Poor wording on my part.  By "essentially stripe" I mean distributing data 
across all of the EBS subvolumes rather than filling one EBS subvolume at a 
time.  I do plan on using inode64.
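For what it's worth, the setup I have in mind looks roughly like this (the 
volume path and the agcount/inode-size values are just placeholders from my 
own notes, not recommendations):

    # one AG per underlying EBS volume's worth of space, as an example only
    mkfs.xfs -d agcount=8 -i size=512 /dev/vg_bricks/lv_brick
    # inode64 so new inodes and their data get spread across all the AGs
    mount -o inode64 /dev/vg_bricks/lv_brick /mnt/brick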

>> or should I stripe the underlying volumes with LVM?
>
>No, you're using EBS. Forget anything you know about storage layout
>and geometry, because EBS has no guaranteed physical layout you can
>optimise for.
Right.  However, there could still be some gains from striping because of the 
IOPS limits on single EBS volumes - that is, the combined IOPS of all the 
volumes striped together can be higher than the IOPS available from any single 
volume.
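Concretely, the two layouts I'm comparing look roughly like this (the EBS 
device names and stripe size are just examples from my instance):

    vgcreate vg_bricks /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi \
                       /dev/xvdj /dev/xvdk /dev/xvdl /dev/xvdm
    # linear span, per the Red Hat Gluster guide
    lvcreate -l 100%FREE -n lv_brick vg_bricks
    # vs. striped across all 8 PVs to aggregate the per-volume IOPS caps
    lvcreate -l 100%FREE -i 8 -I 256 -n lv_brick vg_bricks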

>> I'm not
>> worried as much about data integrity with a stripe/span since
>> Gluster is doing the redundancy work.
>> 
>> 2.  AWS volumes sometimes have inconsistent performance.  If I
>> understand things correctly, AG's run in parallel.
>
>Define "run". AGs can allocate/free blocks in parallel.
By "run" I meant reading/writing data to/from the AGs.

>If IO does
>not require allocation, then AGs play no part in the IO path.
Can you explain this a bit, please?  From my understanding, data is written to 
and read from space inside of AGs, so I don't see how they couldn't be part of 
the IO path.  Or do you simply mean that reads just use inodes and don't care 
about the AGs?
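(For my own understanding I've been poking at this with xfs_bmap - the file 
path below is just an example from my test box:

    xfs_bmap -v /mnt/brick/somefile

The -v output includes an AG column for each extent, so I can at least see 
which AGs an already-allocated file's blocks live in.)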

>> In a
>> non-striped volume, if some of the AGs are temporarily slower to
>> respond than others due to one of the underlying volumes being
>> slow, will XFS prefer the quicker responding AGs
>
>No, it does not.
>
>> or is I/O always
>> evenly distributed?
>
>No, it is not.
>
>> If XFS prefers the more responsive AG's it
>> seems to me that it would be better NOT to stripe the underlying
>> disk since all AG's that are distributed in a stripe will
>> continuously hit all component volumes, including the slow volume
>> (unless if XFS compensates for this?)
>
>I think you have the wrong idea about what allocation groups do.
I'm reading the XFS File System Structure doc on xfs.org.  It says, "XFS 
filesystems are divided into a number of equally sized chunks called Allocation 
Groups. Each AG can almost be thought of as an individual filesystem." so 
that's where most of my assumptions are coming from.
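(For context, I've been checking the AG layout of my test filesystem with 
xfs_info - the mount point is just my example:

    xfs_info /mnt/brick

The meta-data line there reports the agcount and agsize the filesystem was 
made with.)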

>They are for maintaining allocation concurrency and locality of
>related objects on disk - they have no influence on where IO is
>directed based on IO load or response time.
I understand that XFS uses locality to try to place files in the same AG as 
their parent directory.  Are there other cases?
I get that it's probably not measuring the responsiveness of each AG.  I guess 
what I'm trying to ask is: will XFS *indirectly* compensate if one subvolume 
is busier?  For example, if writes to a "slow" subvolume and its resident AGs 
take longer to complete, will XFS tend to prefer other, less-busy AGs for new 
writes (locality aside)?  What is the basic algorithm for determining where 
new data is written?  In load-balancer terms, does it round-robin, pick the 
least busy, etc.?
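(If it helps frame the question, this is roughly how I've been watching where 
new files land - directory and file names are made up, and conv=fsync is just 
to force the delayed allocation out before looking at the map:

    for d in dir1 dir2 dir3; do
        mkdir -p /mnt/brick/$d
        dd if=/dev/zero of=/mnt/brick/$d/f bs=1M count=16 conv=fsync
        xfs_bmap -v /mnt/brick/$d/f
    done

That shows me which AG each new file's extents end up in, but not why that AG 
was chosen when one of the underlying volumes is busy.)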

Thank you very much!
JG
    