On Tue, Feb 25, 2014 at 06:15:25PM -0800, Christoph Hellwig wrote:
> Interesting. I had a prototype of this supporting just a single
> region a while ago, but never managed to get it to pass all recovery
> tests. That was before I separated the in-core from the on-disk
> log items, though.
Yeah, that makes it much easier to deal with. It's still a bit
messy though, and it still panics every so often under heavy load.
IOWs I still don't quite have the range-to-bitmap accounting
> > Hence if we just track a single region, it will almost always cover
> > the entire directory buffer - if we only modify a single entry in
> > the buffer, then that's a fairly large cost in terms of log space
> > and CPU overhead for random individual operations. If we decide that
> > we are going to use a single range, then we may as well just use the
> > dirty flag and log the entire buffer every time.
> Which might not be all that bad an idea given how much log bandwidth
> we have available. Definitely would be interesting to instrument
> and benchmark it vs the 4 regions version.
Well, for v4 filesystems under create workloads the increase is
definitely noticeable - I haven't got measurements to hand, though it
was something on the order of 30% or so for 256 byte inodes due to the
increase in size of the inode cluster buffers being logged. For
larger inodes that's going to be even worse. We don't have to worry
on v5 filesystems, though, so only btree blocks are the main
> Note that we probably should also introduce a log incompat feature to
> just log the range instead of converting it to the old bitmap for v5
Yup, that's easy enough to do, and should make the v5 code even
simpler and faster....
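
To make the single-range vs multiple-range cost concrete, here is a rough
sketch of the range-to-bitmap accounting discussed above. It is not the
code from the patch: the 128-byte chunk size, the 4096-byte buffer, and
every name in it (dirty_range, range_to_bitmap, logged_bytes_multi,
logged_bytes_single) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: 128-byte logging chunks, one 4096-byte buffer. */
#define CHUNK_SHIFT	7
#define CHUNK_SIZE	(1u << CHUNK_SHIFT)
#define BUF_SIZE	4096u

/* Hypothetical dirty range, inclusive byte offsets into the buffer. */
struct dirty_range {
	unsigned int	first;
	unsigned int	last;
};

/* Set one bit in *map for every chunk touched by [first, last]. */
static void
range_to_bitmap(uint32_t *map, unsigned int first, unsigned int last)
{
	unsigned int	c, lc;

	assert(map != NULL);
	if (first > last || first >= BUF_SIZE)
		return;
	if (last >= BUF_SIZE)
		last = BUF_SIZE - 1;
	lc = last >> CHUNK_SHIFT;
	for (c = first >> CHUNK_SHIFT; c <= lc; c++)
		*map |= (uint32_t)1 << c;
}

/* Bytes that would be logged for a given chunk bitmap. */
static unsigned int
bitmap_bytes(uint32_t map)
{
	unsigned int	n = 0;

	for (; map; map >>= 1)
		n += map & 1;
	return n * CHUNK_SIZE;
}

/* Track every range separately: only the touched chunks are logged. */
static unsigned int
logged_bytes_multi(const struct dirty_range *r, int nr)
{
	uint32_t	map = 0;
	int		i;

	for (i = 0; i < nr; i++)
		range_to_bitmap(&map, r[i].first, r[i].last);
	return bitmap_bytes(map);
}

/* Track one covering range: everything between min and max is logged. */
static unsigned int
logged_bytes_single(const struct dirty_range *r, int nr)
{
	unsigned int	first, last;
	uint32_t	map = 0;
	int		i;

	if (nr <= 0)
		return 0;
	first = r[0].first;
	last = r[0].last;
	for (i = 1; i < nr; i++) {
		if (r[i].first < first)
			first = r[i].first;
		if (r[i].last > last)
			last = r[i].last;
	}
	range_to_bitmap(&map, first, last);
	return bitmap_bytes(map);
}
```

With a header update at bytes 0-15 and an entry update at bytes 4000-4063,
the multi-range version marks two chunks (256 bytes), while the single
covering range marks the whole 4096-byte buffer, which is the "may as well
log the entire buffer" case described above.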