On Wed, 2003-06-11 at 16:28, Andi Kleen wrote:
> On Wed, Jun 11, 2003 at 03:33:14PM -0500, Steve Lord wrote:
> > On Wed, 2003-06-11 at 04:35, Andi Kleen wrote:
> > > A long standing problem in XFS is that in the default configuration
> > > metadata performance is not that great because it does not use enough
> > > log buffers. There are FAQs around on how to fix this, but it would be
> > > better if the kernel did the right thing by default.
> > >
> > > The main problem is probably that XFS still uses the defaults from the
> > > early '90s, which are probably not that good anymore for today's
> > > machines.
> > >
> > > This patch changes the logbufs= default based on the available memory.
> > > If you have 128MB or less it uses 3 logbufs (normally 96K per file
> > > system).
> > > For 400MB or less it uses 5 (160K).
> > > For anything bigger it uses 8 (256K).
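The sizing rule described above could be sketched roughly as below. This is a
userspace illustration only: the real patch would live in the XFS mount path,
read the machine's memory size itself, and the function name here is
hypothetical. The thresholds (128MB, 400MB) and counts (3, 5, 8) are taken
from the mail; the per-filesystem sizes assume 32K per log buffer.

```c
#include <assert.h>

/* Hypothetical sketch of the heuristic from the patch description.
 * mem_mb is the machine's memory in megabytes. */
static int xfs_default_logbufs(unsigned long mem_mb)
{
	if (mem_mb <= 128)
		return 3;	/* small box: ~96K per file system */
	if (mem_mb <= 400)
		return 5;	/* mid-size box: ~160K per file system */
	return 8;		/* current maximum: ~256K per file system */
}
```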
> > Hi Andi,
> > Just wondering why you picked odd numbers?
> Usual handwaving. Of course I should have picked powers of two just to
> make it look more scientific, but the range was a bit too small for that ;)
> 3 was the old default, which seems OK for small systems. A small system
> is arbitrarily defined as <= 128MB memory.
2 was the default, but OK.
> 8 is the current maximum (is there a reason for that btw? could it be simply
It could be.
> I wanted to get 8 on my 512MB test box. And usually the memory counting
> variable has some loss, so I chose 400MB as the boundary.
> 5 is somewhere between 3 and 8 for the boxes in between.
> > >
> > > It is still a kind of bandaid. I think the better solution would be to
> > > dynamically allocate new log buffers as needed until some limit
> > > (and block if the memory cannot be allocated). This should not be that
> > > bad because vmalloc/vfree are not that expensive anymore and with some
> > > luck you can even get a physically contiguous buffer (e.g. on a 16k
> > > page size
> of course
> > > ia64 system)
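The dynamic-allocation idea quoted above could be sketched as below. This is a
userspace illustration, with malloc() standing in for the kernel's vmalloc();
the pool layout, cap, and buffer size are assumptions for the sketch, not XFS
code. Where the sketch returns NULL at the cap, the kernel version would
block until a buffer frees up, as the mail suggests.

```c
#include <stdlib.h>

#define LOGBUF_SIZE	(32 * 1024)	/* illustrative buffer size */
#define LOGBUF_MAX	8		/* illustrative hard cap */

struct logbuf_pool {
	void *buf[LOGBUF_MAX];
	int   count;
};

/* Allocate one more log buffer on demand, up to the cap.
 * Returns the new buffer, or NULL once the limit is reached
 * (the kernel version would sleep here rather than fail). */
static void *logbuf_grow(struct logbuf_pool *p)
{
	void *b;

	if (p->count >= LOGBUF_MAX)
		return NULL;
	b = malloc(LOGBUF_SIZE);	/* vmalloc() in the kernel */
	if (!b)
		return NULL;
	p->buf[p->count++] = b;
	return b;
}
```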
> > Interesting idea, one issue is that during recovery, the maximum amount
> > of outstanding I/O there might have been (i.e. number of iclog buffers)
> > is a factor in how much work there is to do. Adding new ones dynamically
> Hmm, I thought that recovery work was bounded by the on disk log size? How do
> the pending buffers come into play? They look more likely to make
> you lose a bit more data in case of a crash, but your new sync daemon
> with a timer should take care of that (it will still be much better
> with this than it ever was before)
> Processing 10MB (new minimum with the mkfs patch for 1GB+) or even 64MB
> at mount shouldn't be a big issue on a modern box.
Without going and looking at the code again, I think the issue is that
we have to do an extra scan over MAX_ICLOG_SIZE*MAX_ICLOG_NUM log space
looking at each 512 byte header and ensuring that it all made it out to
disk. There is also code to zero forwards from the head of the log
found during recovery to write over any out of order writes which might
have made it out to disk during the last mount. If the zeroing is not
done then it is possible that we could crash again, and during recovery
this time recognize that chunk of log from mount -2 as being from
mount -1 and replay it incorrectly.
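The scan-and-zero step described above can be modelled as below. This is a toy
model only: the log is an array of 512-byte blocks, each beginning with a
cycle number, and the names and layout are illustrative, not XFS's real
on-disk format. Recovery finds the head (where the cycle number drops) and
zeroes forward so stale blocks from an earlier mount cannot be mistaken for
live log data after a second crash.

```c
#include <string.h>

#define BLKSZ 512

struct log_blk {
	unsigned cycle;				/* stand-in for the record header */
	char pad[BLKSZ - sizeof(unsigned)];
};

/* Head = first block whose cycle number is lower than its
 * predecessor's, i.e. where the last mount stopped writing. */
static int find_head(const struct log_blk *log, int nblks)
{
	for (int i = 1; i < nblks; i++)
		if (log[i].cycle < log[i - 1].cycle)
			return i;
	return 0;
}

/* Zero `span` blocks forward of the head so out-of-order writes
 * from mount -2 cannot later be replayed as if from mount -1. */
static void zero_ahead(struct log_blk *log, int nblks, int head, int span)
{
	for (int i = 0; i < span; i++)
		memset(&log[(head + i) % nblks], 0, BLKSZ);
}
```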
> > might be possible, but there is this 'interesting' state machine on the
> > log buffers to deal with there.
> I have not really looked into the state machine yet. Are you saying
> it has some scalability problems with more buffers, or are the data structures
> just nasty enough that adding more buffers dynamically could be difficult?
Not scalability, the log buffers are really just a pipeline between
transactions and the on disk log, making it longer is always possible.
It's just a matter of working out where it is safe to insert one (and
delete one for that matter) whilst the state machine is active.
Take a look at xlog_state_do_callback, not as bad as xfs mkfs ;-)
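The "where is it safe to insert" question above could be modelled as below.
The iclog buffers form a ring, each in some pipeline state; a new one can
only be spliced in at a link where neither neighbour has I/O in flight. The
states and the safety rule here are assumptions for illustration, not the
real xlog state machine.

```c
#include <stddef.h>

/* Simplified pipeline states; the real xlog machine has more. */
enum ic_state { IC_ACTIVE, IC_SYNCING, IC_DONE };

struct iclog {
	enum ic_state state;
	struct iclog *next;	/* circular list of log buffers */
};

/* Splice nb into the ring after pos, but only if no I/O is in
 * flight across that link. Returns 0 on success, -1 if unsafe. */
static int iclog_insert_after(struct iclog *pos, struct iclog *nb)
{
	if (pos->state == IC_SYNCING || pos->next->state == IC_SYNCING)
		return -1;
	nb->next = pos->next;
	pos->next = nb;
	nb->state = IC_ACTIVE;
	return 0;
}
```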
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: lord@xxxxxxx