
Re: Performance problem - reads slower than writes

To: xfs@xxxxxxxxxxx
Subject: Re: Performance problem - reads slower than writes
From: Joe Landman <landman@xxxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 04 Feb 2012 15:44:25 -0500
In-reply-to: <20120204200417.GA3362@xxxxxxxx>
Organization: Scalable Informatics
References: <20120130220019.GA45782@xxxxxxxx> <20120131020508.GF9090@dastard> <20120131103126.GA46170@xxxxxxxx> <20120131145205.GA6607@xxxxxxxxxxxxx> <20120203115434.GA649@xxxxxxxx> <4F2C38BE.2010002@xxxxxxxxxxxxxxxxx> <20120203221015.GA2675@xxxxxxxx> <4F2D016C.9020406@xxxxxxxxxxxxxxxxx> <20120204112436.GA3167@xxxxxxxx> <4F2D2953.2020906@xxxxxxxxxxxxxxxxx> <20120204200417.GA3362@xxxxxxxx>
Reply-to: landman@xxxxxxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:9.0) Gecko/20111229 Thunderbird/9.0
On 02/04/2012 03:04 PM, Brian Candler wrote:
On Sat, Feb 04, 2012 at 06:49:23AM -0600, Stan Hoeppner wrote:

[...]

Sure it can. A gluster volume consists of "bricks". Each brick is served by
a glusterd process listening on a different TCP port. Those bricks can be on
the same server or on different servers.

I seem to remember that the Gluster folks abandoned this model (using their code versus MD raid) on single servers due to performance issues. We did play with this a few times, and the performance wasn't that good. Basically limited by single disk seek/write speed.


Even if what you describe can be done with Gluster, the performance will
likely be significantly less than a properly set up mdraid or hardware
raid.  Again, if it can be done, I'd test it head-to-head against RAID.

I'd expect similar throughput but higher latency. Given that I'm using low
RPM drives which already have high latency, I'm hoping the additional
latency will be insignificant.  Anyway, I'll know more once I've done the
measurements.

My recollection is that this wasn't the case. Performance was suboptimal in all cases we tried.

I did this with the 3.0.x and the 2.x series of Gluster. Usually atop xfs of some flavor.


I've never been a fan of parity RAID, let alone double parity RAID.

I'm with you on that one.

RAID's entire purpose in life is to give an administrator time to run in and change a disk. RAID isn't a backup, or even a guarantee of data retention. Many do treat it this way though, to their (and their data's) peril.

The attractions of gluster are:
- being able to scale a volume across many nodes, transparently to
   the clients

This does work, though rebalance time is as much a function of the seek time and bandwidth of the slowest link as of anything else. So if you have 20 drives, and you do a rebalance to add 5 more, it's going to be slow for a while.
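
To put a rough number on "slow for a while", here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration: the 40 TB volume size, the 100 MB/s slowest link, and above all the idea that an ideal rebalance moves only the share of data that belongs on the new drives. A real rebalance may move more than that and is also bound by seeks, so treat the result as an optimistic lower bound.

```python
def rebalance_estimate_hours(total_data_tb, old_drives, new_drives, slowest_link_mb_s):
    """Optimistic lower bound on rebalance time, in hours.

    Assumes (hypothetically) that only the fraction of data that should
    end up on the new drives is moved, and that the slowest link runs at
    its full streaming bandwidth the whole time (no seek penalty).
    """
    # Share of the data that belongs on the newly added drives.
    fraction_moved = new_drives / (old_drives + new_drives)
    # Decimal units: 1 TB = 1,000,000 MB.
    data_moved_mb = total_data_tb * 1_000_000 * fraction_moved
    seconds = data_moved_mb / slowest_link_mb_s
    return seconds / 3600


# Hypothetical numbers: 40 TB spread over 20 drives, 5 drives added,
# slowest link sustaining 100 MB/s.
hours = rebalance_estimate_hours(
    total_data_tb=40, old_drives=20, new_drives=5, slowest_link_mb_s=100
)
print(f"at least {hours:.1f} hours")
```

Halve the sustained bandwidth (which seek-heavy workloads on slow drives can easily do) and the estimate doubles.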

- being able to take a whole node out of service, while clients
   automatically flip over to the other


I hate to put it like this, but this is true for various definitions of the word "automatically". You need to make sure that your definitions line up with the reality of "automatic".

If a brick goes away, and you have a file on this brick you want to overwrite, it doesn't (unless you have a mirror) flip over to another unit "automatically" or otherwise.

RAID in this case can protect you from some of these issues (single disk failure issues, being replaced by RAID issues), but unless you are building mirror pairs of bricks on separate units, this magical "automatic" isn't quite so.

Moreover, last I checked, Gluster made no guarantees as to the ordering of the layout for mirrors. So if you have more than one brick per node, and build mirror pairs with the "replicate" option, you have to check the actual hashing to make sure it did what you expect. Or build up the mirror pairs more carefully.
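
One way to sanity-check a planned layout before creating the volume is sketched below in Python. It rests on an assumption that you should verify against the Gluster version you actually run: that bricks are grouped into mirror sets in the order they are listed, two consecutive bricks per set for a two-way replicate. The function names and the `host:/path` brick strings are hypothetical; this only inspects a list you supply and does not talk to Gluster at all.

```python
def replica_sets(bricks, replica=2):
    """Group an ordered brick list into sets of `replica` consecutive bricks.

    Assumption (not guaranteed by this sketch): the volume pairs bricks
    in listing order.
    """
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def same_host_sets(bricks, replica=2):
    """Return the replica sets in which two or more copies share a host."""
    bad = []
    for group in replica_sets(bricks, replica):
        # Brick strings are assumed to look like "host:/path".
        hosts = [brick.split(":", 1)[0] for brick in group]
        if len(set(hosts)) < len(hosts):
            bad.append(group)
    return bad


# Hypothetical layouts with two bricks per node.
risky = ["node1:/data/a", "node1:/data/b", "node2:/data/a", "node2:/data/b"]
safe = ["node1:/data/a", "node2:/data/a", "node1:/data/b", "node2:/data/b"]

print(same_host_sets(risky))  # both mirror pairs live on a single node
print(same_host_sets(safe))   # no pair shares a node
```

If the first list comes back non-empty, losing one node takes out both copies of some files, which is exactly the case where "automatic" failover does not happen.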

At this point, it sounds like there is a gluster side of this discussion that I'd recommend you take to the gluster list. There is an xfs portion as well which is fine here.

Disclosure: we build/sell/support gluster (and other) based systems atop xfs based RAID units (both hardware and software RAID; 1,10,6,60,...) so we have inherent biases. Your mileage may vary. See your doctor if your re-balance exceeds 4 hours.

Regards,

Brian.

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@xxxxxxxxxxxxxxxxxxxxxxx
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
