Re: allocsize mount option

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: allocsize mount option
From: Gim Leong Chin <chingimleong@xxxxxxxxxxxx>
Date: Fri, 15 Jan 2010 11:08:22 +0800 (SGT)
Cc: Eric Sandeen <sandeen@xxxxxxxxxxx>, xfs@xxxxxxxxxxx
Hi Dave,

Thank you for the advice!

I have run dd tests with direct I/O, writing the same 20 GB files.  The results 
are an eye-opener!  I used bs=1GB, count=2.

Repeated single-instance runs gave 830 and 800 MB/s, compared to between 100 
and 300 MB/s for buffered writes.

Two instances gave an aggregate of 304 MB/s; six instances gave an aggregate 
of 587 MB/s.

The system drive (/home, RAID 1) gave 130 MB/s, compared to 51 MB/s buffered.

So the problem is with the buffered writes.

> You'd have to get all the fixes from 2.6.30 to 2.6.32, and the
> backport would be very difficult to get right. Better would be
> just to upgrade the kernel to 2.6.32 ;)

If I change the kernel, I would have no support from Novell.  I will try my 
luck at convincing them.

> > > I'd suggest that you might need to look at increasing the
> > > maximum IO size for the block device
> > > (/sys/block/sdb/queue/max_sectors_kb), maybe the request queue
> > > depth as well to get larger IOs to be pushed to the raid
> > > controller. if you can, at least get it to the stripe width
> > > of 1536k....
> >
> > Could you give a good reference for performance tuning of these
> > parameters?  I am at a total loss here.
>
> Welcome to the black art of storage subsystem tuning ;)
> I'm not sure there is a good reference for tuning the block device
> parameters - most of what I know was handed down by word of mouth
> from gurus on high mountains.
> The overriding principle, though, is to try to ensure that the
> stripe width sized IOs can be issued right through the IO stack to
> the hardware, and that those IOs are correctly aligned to the
> stripes. You've got the filesystem configuration and layout part
> correct, now it's just tuning the block layer to pass the IO's
> through.

Can I confirm that /sys/block/sdb/queue/max_sectors_kb should be set to the 
stripe width of 1536 kB?

Which parameter is "request queue depth"?  What should be the value?
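
For reference, here is a minimal sketch of how the current block layer 
settings could be read and compared against the 1536 kB stripe width.  It 
assumes the usual Linux sysfs layout under /sys/block/<dev>/queue.  The 
function names are my own, and treating nr_requests as the "request queue 
depth" is my assumption, not something confirmed in this thread.

```python
from pathlib import Path

STRIPE_WIDTH_KB = 1536  # stripe width quoted in this thread


def read_queue_params(dev, sysfs_root="/sys/block"):
    """Return the integer queue settings found for dev; missing files are skipped."""
    queue = Path(sysfs_root) / dev / "queue"
    params = {}
    # nr_requests is assumed here to be the "request queue depth".
    for name in ("max_sectors_kb", "max_hw_sectors_kb", "nr_requests"):
        f = queue / name
        if f.is_file():
            params[name] = int(f.read_text().strip())
    return params


def can_issue_full_stripe(params, stripe_kb=STRIPE_WIDTH_KB):
    """True if max_sectors_kb is at least one full stripe width."""
    return params.get("max_sectors_kb", 0) >= stripe_kb


if __name__ == "__main__":
    p = read_queue_params("sdb")
    print(p, can_issue_full_stripe(p))
```

As far as I understand, max_sectors_kb cannot be raised above 
max_hw_sectors_kb, so the sketch reads both values.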

> FWIW, your tests are not timing how long it takes for all the
> data to hit the disk, only how long it takes to get into cache.

Thank you!  I do know that XFS buffers writes extensively.  The drive LEDs 
remain lit long after the OS says the writes are completed.  Also, some of the 
timings are physically impossible.
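
To illustrate the difference between timing a write into cache and timing it 
to disk, here is a small sketch that times a buffered write with and without a 
final fsync.  The function name, block size and file path are arbitrary 
choices of mine for illustration, and this is not the dd test I actually ran.

```python
import os
import time


def timed_write(path, total_bytes, block_bytes=1 << 20, sync=True):
    """Write total_bytes of zeros to path and return the elapsed seconds.

    With sync=True the timer also covers fsync(), so the figure includes
    the time for the data to be flushed out of the page cache.
    """
    block = b"\0" * block_bytes
    start = time.perf_counter()
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            chunk = block[:min(block_bytes, remaining)]
            f.write(chunk)
            remaining -= len(chunk)
        if sync:
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to flush the file to disk
    return time.perf_counter() - start


if __name__ == "__main__":
    size = 256 << 20  # 256 MiB, small enough for a quick check
    for sync in (False, True):
        secs = timed_write("/tmp/timed_write.bin", size, sync=sync)
        print(f"sync={sync}: {size / secs / 1e6:.0f} MB/s")
```

On a machine with plenty of free memory I would expect the sync=False figure 
to be much higher than the sync=True one.  GNU dd's conv=fsync option 
similarly makes dd flush the output file before it finishes.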

> That sounds wrong - it sounds like NCQ is not functioning properly
> as with NCQ enabled, disabling the drive cache should not impact
> throughput at all....

I do not remember clearly whether NCQ is available on that motherboard (it 
runs 32-bit Ubuntu), but I do remember seeing queue depth in the kernel.  I 
will check it out next week.

But what I read is that NCQ hurts single write performance.  That is also what 
I found with another Areca SATA RAID in Windows XP.

What I found with all the drives we tested was that disabling the cache badly 
hurt sequential write performance (no file system, write data directly to 
designated LBA).

> I'd suggest trying to find another distributor that will bring them
> in for you. Putting that many drives in a single chassis is almost
> certainly going to cause vibration problems, especially if you get
> all the disk heads moving in close synchronisation (which is what
> happens when you get all your IO sizing and alignment right).

I am working on changing to the WD Caviar RE4 drives.  Not sure if I can pull 
it off.

Chin Gim Leong
