On Tue, Mar 06, 2007 at 02:20:21PM -0800, J.H. wrote:
> Good afternoon,
>
> HPA mentioned that you guys might be able to give us some suggestions on
> implementing xfs on the kernel.org machines.
>
> Specifically I know was asked:
> <sandeen> what sort of storage is behind it?
> <sandeen> local disk, or raid, or ?
> <sandeen> knowing what's behind it (ide/scsi? geometry?
> battery-backed cache?) might help w/ the recommendations
>
> The storage that will be used is 300gb u320 10Krpm scsi drives, in total
> 21 drives (No obvious geometry is available but the model of the drives
> is a Compaq BD300884C2). These will be connected to the system via an
> HP Smart Array 6400 w/ 192mb of cache and a bbu (so we have
> battery-backed cache on the controller).
>
> Our current intention is to run hardware raid 6 across 3 subsets of the
> drives (due to controller limitations) which will give us three raid 6
> arrays of 7 drives each (total of 21 drives then).

Ok, so three volumes of about 1.5TB each, with a random write I/O
capability of about 200 iop/s, then. What chunk size are you using for
the hardware RAID?

Is the server running a 64-bit kernel?

> Beyond that - I'm
> open to suggestions. I had thought of doing software raid0 across the
> resulting three arrays and running xfs over that, though if there is a
> compelling reason to do things a different way I'm open to hearing about
> it.

Given that it's only a small number of volumes, a raid0 stripe across
this is probably the easiest way to get some level of load balancing
across the three volumes.
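
As a rough sketch of what I mean (the device names and chunk size here
are purely illustrative - Smart Array luns normally show up as
/dev/cciss/c0dX under the cciss driver, so substitute whatever your
three RAID6 luns are actually called):

  # stripe the three hardware RAID6 luns into a single md raid0 device;
  # --chunk is in KiB, so 256 means a 256KB md chunk
  mdadm --create /dev/md0 --level=0 --raid-devices=3 --chunk=256 \
        /dev/cciss/c0d1 /dev/cciss/c0d2 /dev/cciss/c0d3

Picking an md chunk that is a multiple of the hardware RAID stripe
width means large sequential writes can still be turned into
full-stripe writes on the RAID6 luns underneath.
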
More importantly for performance, the write cache is battery backed
so you can safely use the "nobarrier" mount option to turn off write
barriers.

If you have a 64-bit server, using the inode64 mount option would be
good to keep locality between inodes and their data as the filesystem
fills up.
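
At mount time that looks something like this (/dev/md0 and the mount
point are just placeholders from the sketch above):

  # battery-backed write cache -> barriers off; 64-bit kernel -> inode64
  mount -t xfs -o nobarrier,inode64 /dev/md0 /srv/mirror
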
Because it's a fairly metadata-intensive filesystem, using version 2
logs with a stripe unit of <insert raid chunk size here> when making
the filesystem and using large log buffers (mount option
"logbsize=256k") will help make metadata ops go faster.
To top it all off, don't partition the luns; just use the whole raw
lun, so that block zero of the filesystem is known to align with the
first block of the hardware RAID lun. And of course you should use
the sunit/swidth mkfs options for data device alignment so that XFS
can align allocations to the underlying hardware correctly....
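
Pulling it all together, a hypothetical mkfs run against the whole,
unpartitioned md device - again assuming a 256KB hardware chunk and
5 data disks per 7-drive RAID6 lun (7 drives minus 2 parity) - might
look like:

  # su/sw are the byte-based equivalents of sunit/swidth:
  # su = hardware RAID chunk size, sw = data disks per lun
  mkfs.xfs -d su=256k,sw=5 \
           -l version=2,su=256k \
           /dev/md0

Adjust su/sw to match the real geometry; the point is simply that XFS
knows where the stripe boundaries are so allocation and writeback can
line up with them.
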
HTH.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group