
Re: Question regarding XFS on LVM over hardware RAID.

To: stan <stan@xxxxxxxxxxxxxxxxx>
Subject: Re: Question regarding XFS on LVM over hardware RAID.
From: "C. Morgan Hamill" <chamill@xxxxxxxxxxxx>
Date: Mon, 03 Feb 2014 11:07:34 -0500
Cc: xfs <xfs@xxxxxxxxxxx>
In-reply-to: <52ED61C9.8060504@xxxxxxxxxxxxxxxxx>
References: <1391005406-sup-1881@xxxxxxxxxxxxxxx> <52E91923.4070706@xxxxxxxxxxx> <1391022066-sup-5863@xxxxxxxxxxxxxxx> <52E99504.4030902@xxxxxxxxxxxxxxxxx> <1391090527-sup-4664@xxxxxxxxxxxxxxx> <20140130202819.GO2212@dastard> <52EB3B96.7000103@xxxxxxxxxxxxxxxxx> <1391202273-sup-9265@xxxxxxxxxxxxxxx> <52ED61C9.8060504@xxxxxxxxxxxxxxxxx>
User-agent: Sup/git
Excerpts from Stan Hoeppner's message of 2014-02-01 16:06:17 -0500:
> Yes, that's one of the beauties of LVM.  However, there are other
> reasons you may not want to do this.  For example, if you have allocated
> space from two different JBOD or SAN units to a single LVM volume, and
> you lack multipath connections, if you have a cable, switch, HBA, or
> other failure disconnecting one LUN that will wreak havoc on your
> mounted XFS filesystem.  If you have multipath and the storage device
> disappears due to some other failure such as backplane,  UPS, etc, you
> have the same problem.

Very true; I gather this would only take out any volumes which at least
partially rest on the failed device, however?  As in, I don't lose the
whole volume group, correct?

> This isn't a deal breaker.  There are many large XFS filesystems in
> production that span multiple storage arrays.  You just need to be
> mindful of your architecture at all times, and it needs to be
> documented.  Scenario:  XFS unmounts due to an IO error.  You're not yet
> aware an entire chassis is offline.  You can't remount the filesystem so
> you start a destructive xfs_repair thinking that will fix the problem.
> Doing so will wreck your filesystem and you'll likely lose access to all
> the files on the offline chassis, with no ability to get it back short
> of some magic and a full restore from tape or D2D backup server.  We had
> a case similar to this reported a couple of years ago.

Oh God, that sounds terrible.  My sysadmininess is wondering why the
chassis wasn't monitored, but hindsight, etc. etc. ;-)

> If the logical sector size reported by your RAID controller is 512
> bytes, then "--dataalignment=9216s" should start your data section on a
> RAID60 stripe boundary after the metadata section.

I see that 9216s == 4608k/512b, but I'm missing something: is the
default metadata size guaranteed to be less than a single stripe, or is
there more to it?

Oh, wait, I think I just got it: '--dataalignment' will take care to
start on some multiple of 9216 sectors, regardless of the size of the
metadata section. Doy.
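
To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the 512-byte logical sector and the 4608 KiB stripe width quoted in this thread; the function names and the sample metadata sizes are hypothetical, chosen only to show the rounding, and are not values reported by LVM.

```python
SECTOR_BYTES = 512   # logical sector size quoted in the thread
STRIPE_KIB = 4608    # full stripe width quoted in the thread


def stripe_sectors(stripe_kib, sector_bytes=SECTOR_BYTES):
    """Convert a stripe width in KiB to a count of logical sectors."""
    return stripe_kib * 1024 // sector_bytes


def aligned_data_start(metadata_sectors, alignment_sectors):
    """Round the end of the metadata area up to the next alignment multiple."""
    return -(-metadata_sectors // alignment_sectors) * alignment_sectors


align = stripe_sectors(STRIPE_KIB)
print(align)                             # 9216

# Hypothetical metadata sizes (in sectors), for illustration only.
print(aligned_data_start(2048, align))   # 9216
print(aligned_data_start(10000, align))  # 18432
```

The second call shows the point: even if the metadata area were larger than one stripe, the data start would still land on a multiple of 9216 sectors.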

> The PhysicalExtentSize should probably also match the 4608KB stripe
> width, but this is apparently not possible.  PhysicalExtentSize must be
> a power of 2 value.  I don't know if or how this will affect XFS aligned
> write out.  You'll need to consult with someone more knowledgeable of LVM.

Makes sense.  If it would have an impact, then I'd probably just end up
going with RAID 0 on top of 2 or 4 RAID 6 groups, since it looks like the
math would work out there.
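
A rough Python sketch of the mismatch being discussed: it checks that 4608 KiB is not a power of two and works out how often physical extent boundaries would coincide with stripe boundaries. The 4096 KiB extent size is an assumption (LVM's usual default), as is the premise that the data area starts on a stripe boundary with extents laid out contiguously. It says nothing about whether XFS aligned writes are actually affected, which remains the open question.

```python
from math import gcd

STRIPE_KIB = 4608  # stripe width quoted in the thread
PE_KIB = 4096      # assumed extent size (LVM's usual 4 MiB default)


def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def realign_period(pe_kib, stripe_kib):
    """Return (extents, stripes) spanned before the two boundaries coincide again."""
    lcm = pe_kib * stripe_kib // gcd(pe_kib, stripe_kib)
    return lcm // pe_kib, lcm // stripe_kib


print(is_power_of_two(STRIPE_KIB))          # False
print(realign_period(PE_KIB, STRIPE_KIB))   # (9, 8)
```

Under these assumptions, only every 9th extent would begin on a stripe boundary (9 extents of 4096 KiB cover the same 36864 KiB as 8 stripes of 4608 KiB).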

> You bet.

Honestly, this is the most helpful and straightforward I've ever found
any project's mailing list, so kudos++.

Morgan Hamill
