
Re: External log size limitations

To: Andrew Klaassen <ak@xxxxxxxxxxx>
Subject: Re: External log size limitations
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 17 Feb 2011 11:32:33 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4D5C1D77.1060000@xxxxxxxxxxx>
References: <4D5C1D77.1060000@xxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Wed, Feb 16, 2011 at 01:54:47PM -0500, Andrew Klaassen wrote:
> Hi all,
> I found this document:
> http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=bks&fname=/SGI_Admin/LX_XFS_AG/ch02.html
> ...which says that an XFS external log is limited to 128MB.
> Is there any way to make that larger?

The limit is just under 2GB now - that document is a couple of years
out of date - so if you are running on anything more recent than a
~2.6.27 kernel, 2GB logs should work fine.
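
For reference, the log size is fixed at mkfs time. A minimal sketch of
creating a filesystem with a near-2GB external log (the device names
here are hypothetical - substitute your own data and log devices):

```shell
# Create an XFS filesystem with an external log on a separate device.
# /dev/sdb1 is the (hypothetical) SSD log partition, /dev/sda1 the
# data device. size= must stay under the ~2GB log limit, so 2000m
# leaves a little headroom.
mkfs.xfs -l logdev=/dev/sdb1,size=2000m /dev/sda1
```

The filesystem must then always be mounted with the matching
logdev= mount option.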

> Goal:  I'd love to try putting the external log on an SSD that could
> sustain two or three minutes of steady full-throttle writing.  128MB
> gives me less than a second worth of writes before my write speed
> slows down to the underlying storage speed.

Data write speed or metadata write speed? What sort of write
patterns? Also, don't forget that data is not logged so increasing
the log size won't change the speed of data writeback.

As it is, 2GB is still not enough to prevent metadata writeback
for minutes if that is what you are trying to do.  Even if you use
the new delaylog mount option - which reduces log traffic by an
order of magnitude for most non-synchronous workloads - log write
rates can be upwards of 30MB/s under concurrent metadata-intensive
workloads.
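
A mount using delaylog together with an external log might look like
this (device names and mount point are hypothetical; on kernels of
this era delaylog has to be enabled explicitly):

```shell
# Mount with the external log device and delayed logging enabled.
# delaylog cuts log traffic for most non-synchronous workloads.
mount -o logdev=/dev/sdb1,delaylog /dev/sda1 /mnt/scratch
```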

> I'll do lots of benchmarks before rolling it out to make sure it
> actually does help, of course.  I just want to know if it's
> possible, and how to do it if it is.

If you want a log larger than 2GB, then there are a lot of code
changes needed in both kernel and userspace, as the log arithmetic is
all done via 32 bit integers and a lot of it is byte based.

As it is, there are significant scaling issues with logs of even 2GB
in size - log replay can take tens of minutes when a log full of
inode changes has to be replayed; filling a 2GB log means you'll
probably have tens of gigabytes of dirty metadata in memory, so
responding to memory shortages can cause IO storms and severe
interactivity problems; etc.

In general, I'm finding that a log size of around 512MB w/ delaylog
gives the best tradeoff between scalability, performance, memory
usage and relatively sane recovery times...
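
To check what log size a filesystem actually ended up with, xfs_info
reports the log geometry (the mount point here is hypothetical):

```shell
# Report filesystem geometry; the "log =external" line shows the
# log block size and block count (blocks * bsize = log size in bytes).
xfs_info /mnt/scratch
```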


Dave Chinner
