On 5/2/2011 10:47 AM, Paul Anderson wrote:
md apparently does not support barriers, so we are badly exposed in
that manner, I know. As a test, I disabled write cache on all drives,
performance dropped by 30% or so, but since md is apparently the
problem, barriers still didn't work.
Ideally, I'd firstly be able to find informed opinions about how I can
improve this arrangement - we are mildly flexible on RAID controllers,
I'm not familiar enough with the md driver to address the barrier issue.
Try the mdadm mailing list. However...
You should be able to solve the barrier issue, and get additional
advantages, by simply swapping out the LSI 9200-8E's with the 9285-8E
w/cache battery. The 9285 has a dual core 800MHz PowerPC (vs single
core 533MHz on the 9280) and 1GB of cache. Configure 3x15 drive
hardware RAID6 arrays per controller, then stitch the resulting 9 arrays
together with mdraid or LVM striping or concatenation. I'd test both
under your normal multistreaming workload to see which works best.
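
For a rough idea of what that layout yields, here is a back-of-the-envelope sketch. It assumes the 2TB drives and three controllers per system implied elsewhere in this message (9 arrays at 3 per controller), counts decimal terabytes, and ignores filesystem overhead and any hot spares. The function names are made up for illustration.

```python
# Rough usable-capacity arithmetic for the proposed layout.
# Assumptions: 2 TB drives, three controllers per system, no hot spares.
DRIVE_TB = 2
DRIVES_PER_ARRAY = 15
ARRAYS_PER_CONTROLLER = 3
CONTROLLERS_PER_SYSTEM = 3
RAID6_PARITY_DRIVES = 2  # RAID6 gives up two drives' worth of space per array


def usable_tb_per_array():
    """Data capacity of one 15-drive RAID6 array, in decimal TB."""
    return (DRIVES_PER_ARRAY - RAID6_PARITY_DRIVES) * DRIVE_TB


def arrays_per_system():
    """Number of hardware arrays to stitch together per system."""
    return ARRAYS_PER_CONTROLLER * CONTROLLERS_PER_SYSTEM


def usable_tb_per_system():
    """Capacity of the stripe or concatenation built on top of the arrays."""
    return arrays_per_system() * usable_tb_per_array()


if __name__ == "__main__":
    print("TB per array: ", usable_tb_per_array())
    print("arrays/system:", arrays_per_system())
    print("TB per system:", usable_tb_per_system())
```

Whether you stripe or concatenate on top, the usable space is the same; the choice only changes how blocks are laid out across the 9 arrays.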
A multilevel stripe will show better performance with an artificial
single stream test such as dd, but under your operational multiple
stream workload, concatenation may have similar performance, while at
the same time giving you additional capability, especially if done with
LVM instead of an mdraid linear array. Using LVM concatenation enables
snapshots and the ability to grow and shrink the volume, neither of
which you can do with striping (RAID 0).
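
To illustrate the difference, here is a toy model of how a logical block address maps onto the underlying arrays in each scheme. This is only a sketch, not how md or LVM are actually implemented, and the chunk size and array size below are arbitrary values picked for the example.

```python
# Toy address mapping: stripe (RAID 0 style) vs concatenation (linear).
# Block numbers, chunk size and array size are hypothetical units.

def stripe_map(block, n_arrays, chunk_blocks):
    """Consecutive chunks rotate round-robin across the arrays."""
    chunk, within = divmod(block, chunk_blocks)
    row, array = divmod(chunk, n_arrays)
    return array, row * chunk_blocks + within


def concat_map(block, array_blocks):
    """Array 0 is filled completely, then array 1, and so on."""
    array, offset = divmod(block, array_blocks)
    return array, offset


def arrays_touched_striped(first, count, n_arrays, chunk_blocks):
    """Set of arrays hit by one sequential run on a stripe."""
    return {stripe_map(b, n_arrays, chunk_blocks)[0]
            for b in range(first, first + count)}


def arrays_touched_concat(first, count, array_blocks):
    """Set of arrays hit by one sequential run on a concatenation."""
    return {concat_map(b, array_blocks)[0]
            for b in range(first, first + count)}


if __name__ == "__main__":
    # One sequential stream of 36 blocks over 9 arrays.
    print(sorted(arrays_touched_striped(0, 36, n_arrays=9, chunk_blocks=4)))
    print(sorted(arrays_touched_concat(0, 36, array_blocks=1000)))
```

In this model a single sequential stream on the stripe is spread over all 9 arrays, while on the concatenation it stays on one array. That is why dd favors the stripe, and why concatenation can still keep up when many streams land on different arrays at once.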
The 9285-8E will be pricier than the 9280-8E but it's well worth the
extra dollars, given the low overall cost percentage of the HBAs vs
total system cost. You'll get better performance and the data safety
you're looking for. Just make sure that in addition to BBWC on the HBAs
you have good UPS units backing the servers and SC847 chassis.
very flexible on versions of Linux, etc, and can try other OS's as a
last resort (but the leading contender here would be "something"
running ZFS, and though I love ZFS, it really didn't seem to work well
for our needs).
Supermicro products are usually pretty decent. However, "DIY" arrays
built from an inexpensive tier 2/3 vendor drive box/backplane/expander
and off the shelf drives, whose firmware may not all match, can often be
a recipe for problems that are difficult to troubleshoot. Your problems
may not be caused by a kernel issue at all. The kernel may simply be
showing the symptoms but not the cause.
You've ordered, if my math is correct, 675 'enterprise class' 2TB SATA
drives, 45 per chassis, 135 per system, 5 systems. Did you
specify/verify with the vendor that all drives must be of the same
manufacturing lot and have matching firmware? When building huge
storage subsystems it is critical that all drives behave the same, which
usually means identical firmware.
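
If you want to check the firmware yourself, something along these lines might help. It is a minimal sketch that assumes you have already captured the `smartctl -i` text for each drive and that the output contains the "Firmware Version:" line smartctl prints for ATA drives. The helper names, device names and firmware strings are made up for illustration, and it says nothing about manufacturing lot.

```python
import re
from collections import defaultdict

# Matches the "Firmware Version:" line of `smartctl -i` output (ATA drives).
FIRMWARE_RE = re.compile(r"^Firmware Version:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def firmware_of(smartctl_info_text):
    """Return the firmware string from one drive's output, or None if absent."""
    match = FIRMWARE_RE.search(smartctl_info_text)
    return match.group(1) if match else None


def group_by_firmware(info_by_device):
    """Map each firmware string to the list of devices reporting it."""
    groups = defaultdict(list)
    for device, text in info_by_device.items():
        groups[firmware_of(text)].append(device)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical captured output; real text would come from smartctl.
    sample = {
        "/dev/sda": "Device Model: EXAMPLE-2TB\nFirmware Version: FW01\n",
        "/dev/sdb": "Device Model: EXAMPLE-2TB\nFirmware Version: FW02\n",
        "/dev/sdc": "Device Model: EXAMPLE-2TB\nFirmware Version: FW01\n",
    }
    for firmware, devices in group_by_firmware(sample).items():
        print(firmware, len(devices), devices)
```

More than one group in the result means mixed firmware, which is worth raising with the vendor before chasing kernel bugs.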
Secondly, I welcome suggestions about which version of the linux
kernel you'd prefer to hear bug reports about, as well as what kinds
of output is most useful (we're getting all chassis set up with serial
console so we can do kgdb and also full kernel panic output results).
Others are better qualified to answer this. I'm just the lowly hardware
guy on the list. ;)