
Re: mkfs options for a 16x hw raid5 and xfs (mostly large files)

To: "Ralf Gross" <Ralf-Lists@xxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx
Subject: Re: mkfs options for a 16x hw raid5 and xfs (mostly large files)
From: "Bryan J Smith" <b.j.smith@xxxxxxxx>
Date: Tue, 25 Sep 2007 14:08:52 +0000
Importance: Normal
In-reply-to: <20070925134955.GB20499@xxxxxxxxxxxxxxxxxxxxxxxxx>
References: <20070923093841.GH19983@xxxxxxxxxxxxxxxxxxxxxxxxx> <20070924173155.GI19983@xxxxxxxxxxxxxxxxxxxxxxxxx> <Pine.LNX.4.64.0709241400370.12025@xxxxxxxxxxxxxxxx> <20070924203958.GA4082@xxxxxxxxxxxxxxxxxxxxxxxxx> <Pine.LNX.4.64.0709241642110.19847@xxxxxxxxxxxxxxxx> <20070924213358.GB4082@xxxxxxxxxxxxxxxxxxxxxxxxx> <Pine.LNX.4.64.0709241736370.19847@xxxxxxxxxxxxxxxx> <20070924215223.GC4082@xxxxxxxxxxxxxxxxxxxxxxxxx> <20070925123501.GA20499@xxxxxxxxxxxxxxxxxxxxxxxxx> <20070925125733.GA20873@xxxxxxxxxxxx> <20070925134955.GB20499@xxxxxxxxxxxxxxxxxxxxxxxxx>
Reply-to: b.j.smith@xxxxxxxx
Sender: xfs-bounce@xxxxxxxxxxx
Sensitivity: Normal
Use multiple cards on multiple PCI-X/PCIe channels, each with
its own RAID-5 (or 6) volume, and then stripe RAID-0 (via OS LVM)
across the volumes.

Depending on your network service and application, you can use
either hardware or software for the RAID-5 (or 6).
If the service is heavily read-oriented, software RAID works great,
because reading RAID-5 is essentially reading RAID-0 (minus one disc).
But always use the OS RAID (e.g., an LVM stripe) to stripe RAID-0
across all the volumes, assuming there is no OS volume-size limit
(of course ;).
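
As a minimal sketch of that layout, assuming two hardware RAID-5 arrays
exported as /dev/sdb and /dev/sdc (hypothetical device names) and a
256 KiB LVM stripe:

    # pool both arrays into one volume group
    pvcreate /dev/sdb /dev/sdc
    vgcreate vg_media /dev/sdb /dev/sdc
    # stripe the logical volume RAID-0 across both arrays
    lvcreate -i 2 -I 256 -l 100%FREE -n lv_media vg_media
    # one option: align XFS to the LVM stripe
    # (su = stripe unit, sw = number of stripe units per full stripe)
    mkfs.xfs -d su=256k,sw=2 /dev/vg_media/lv_media

The device names and stripe parameters are only placeholders; tune
them to the real array geometry.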

Software RAID is extremely fast at XORs; that's not the problem.
The problem is how the data streams through the PC's inefficient
I/O interconnect. PCs have gotten much better, but the load still
detracts from other I/O that services may contend with.

Software RAID-5 writes are, essentially, "programmed I/O."
Every single commit has to have its parity block computed
by the CPU, which is difficult to benchmark because the
bottleneck is not the CPU but the LOAD-XOR-STORE traffic across
the interconnect.
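
For example (a deliberately simplified count, ignoring caching and
full-stripe writes): a small write that modifies one chunk of a RAID-5
set has to compute

    new_parity = old_parity XOR old_data XOR new_data

which means two reads (old data, old parity) plus two writes (new data,
new parity), so each logical write crosses the interconnect roughly
four times.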

An IOP is designed with ASIC peripherals to do that in-line and in
real time. In fact, by the very nature of the IOP driver, the operation
is synchronous from the OS's standpoint, unlike the software RAID
optimizations done by the OS.


--  
Bryan J Smith - mailto:b.j.smith@xxxxxxxx  
http://thebs413.blogspot.com  
Sent via BlackBerry from T-Mobile  
    

-----Original Message-----
From: Ralf Gross <Ralf-Lists@xxxxxxxxxxxx>

Date: Tue, 25 Sep 2007 15:49:56 
To: linux-xfs@xxxxxxxxxxx
Subject: Re: mkfs options for a 16x hw raid5 and xfs (mostly large files)


KELEMEN Peter schrieb:
> * Ralf Gross (ralf-lists@xxxxxxxxxxxx) [20070925 14:35]:
> 
> > There is a second RAID device attached to the server (24x
> > RAID5). The numbers I get from this device are a bit worse than
> > the 16x RAID 5 numbers (150MB/s read with dd).
> 
> You are expecting 24 spindles to align whenever you have a write
> request, which has to be 23*chunksize bytes in order to avoid RMW.
> Additionally, your array is so big that you're very likely to hit
> another error while rebuilding.  Chop up your monster RAID5 array
> into smaller arrays and stripe across them.  Even better, consider
> RAID10.

RAID10 is not an option; we need 60+ TB at the moment, mostly for
large video files. Basically, the read/write performance we get with
the 16x RAID 5 is sufficient for our needs. The 24x RAID 5 is only a
test device. The volumes that will be used in the future are the
16x/15x RAIDs (a 48-disk shelf with 3 volumes).

I'm just wondering how people get 400+ MB/s with HW-RAID 5.

Ralf



