On Wed, 2003-11-19 at 20:34, Laurent Imbaud wrote:
> Hello,
>
> First, thank you for your excellent software and documentation.
>
> I have pored over the white paper and the docs, and read the man
> pages several times, and I think I understand well enough what the
> issues and possibilities are.
>
> But nowhere do I see a way of moving the log, say to an external
> device if disk seek is suspected to be a performance issue, short of
> backing up the data, doing a new mkfs.xfs, and copying back.
>
> I tried using the logdev option of mount in the hope that the location
> could be moved at that point but that was not accepted, given the xfs
> was created with an internal log.
>
> On a related issue, some idea of how big the log should be would be
> worth documenting more precisely. In one experiment, mkfs.xfs
> complained that the logdev partition was too big, stating a maximum
> of 12k blocks (forgive me if this is not entirely accurate, I did
> not write down the exact message).
>
> It would seem simple, convenient and desirable for administrators to
> move the location of the log at any time, even while the system is
> mounted. Solid state disks come cheap nowadays and would make great
> log devices, especially if one could point an xfs to log there after
> it has been in production for a while.
>
> I would appreciate very much hearing what you think. Hopefully, I have
> simply missed an obvious thing somewhere.
>
> Thank you.
>
No, you did not miss anything obvious. Getting from an internal to
an external log and back again is possible, but it is something of
an arcane art form.
First, the XFS superblock contains two bits of information about
the log: its block offset and its size. It does not indicate which
device the log is on.
Second, the log does contain some format information which
must match up for a log to be recognized as such.
Third, xfs_repair can overwrite a log and reformat it to the
same state as mkfs would have created.
Fourth, there is a maximum log size, 32768 filesystem blocks
I think - I am writing this without looking at the code...
So, when a filesystem is made with the default internal log,
the space is carved off one end of an allocation group near
the middle of the filesystem. This space is always going to be
log space; there is no code to free it and turn it back into
plain old disk space.
So, getting from the internal log to an external one:
First you should make sure you cleanly unmounted the filesystem.
Run xfs_db -r /dev/xxx; inside db, do:
    xfs_db> sb
    xfs_db> p
This will dump the superblock out; two fields in there represent
the start and length of the log in filesystem blocks: logstart is
the start offset and logblocks is the size. Save these numbers.
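For illustration, the relevant part of the dump might look like
this (the values here are made up, yours will differ):
    xfs_db> sb
    xfs_db> p
    ...
    blocksize = 4096
    ...
    logstart = 33554436
    logblocks = 4096
    ...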
Now pick your external log device and choose a starting offset
(mkfs for external logs always picks zero). For a solid state
device there is no reason not to put several logs on a single
device, but it is probably simplest to partition the device and
use a small partition for each log; that makes it easier to keep
track of things.
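Each such partition needs at least logblocks times the filesystem
block size (the blocksize field in the same superblock dump) worth
of space. A quick check from the shell, using the hypothetical
numbers above:
    # 4096 log blocks * 4096 byte blocks = 16 MB minimum
    echo $((4096 * 4096))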
Next you need to update the superblock with the new length
and offset for the log.
    xfs_db -x /dev/xxx
    xfs_db> sb 0
    xfs_db> p
    xfs_db> write logstart XXX
    xfs_db> write logblocks YYY
The -x gives you write access.
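Putting that together, a hypothetical session placing a 4096 block
log at the start of the external device:
    xfs_db -x /dev/xxx
    xfs_db> sb 0
    xfs_db> write logstart 0
    xfs_db> write logblocks 4096
    xfs_db> quit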
Now exit xfs_db and run this:
    xfs_repair -L -l /dev/logdev /dev/xxx
This will assume /dev/logdev is where the log lives and initialize
it, issuing a stern warning that you are trashing log data.
Now mount the filesystem (mount point yours to pick):
    mount -o logdev=/dev/logdev /dev/xxx /mnt/data
If I remembered all of that correctly, you should be up and
running with an external log.
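If you want the external log to stick across reboots, put the
option in /etc/fstab too, something like this (mount point is just
an example):
    /dev/xxx  /mnt/data  xfs  logdev=/dev/logdev  0 0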
Now for the big fat warning:
PRACTICE THIS ON A DEVICE YOU DO NOT CARE ABOUT FIRST!!!
You cannot grow the internal log despite what the growfs
man page might tell you. You could in theory use a variation
on the above to shrink it.
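An untested sketch of that shrink variation, keeping logstart where
it is and just reducing logblocks (the repair run reinitializes the
log in place):
    xfs_db -x /dev/xxx
    xfs_db> sb 0
    xfs_db> write logblocks 2048
    xfs_db> quit
    xfs_repair -L /dev/xxx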
You can get back to the internal log by reinserting the old
log values, running repair without the -l option and then
mounting without the external log option.
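A sketch of that reverse trip, using the logstart and logblocks
values you saved earlier (keeping -L on the repair run to
reinitialize the internal log area is my assumption):
    xfs_db -x /dev/xxx
    xfs_db> sb 0
    xfs_db> write logstart <saved logstart>
    xfs_db> write logblocks <saved logblocks>
    xfs_db> quit
    xfs_repair -L /dev/xxx
    mount /dev/xxx /mnt/data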
As for sizing the log, it always seems to be like asking how
long a piece of string is.
A bigger log means you can keep more metadata in flight before
you have to wait for metadata to be flushed before you can write
more log records. We refer to this mode of operation as tail
pushing - think of the log as being a small train on a circular
toy train track. When you are tail pushing, you have added
carriages until the engine is pushing against the caboose (I have
a three-year-old, can you tell ;-)
In more technical terms, the governing factor here is how large a
burst of metadata updates you want to be able to absorb without
the throughput slowdown caused by tail pushing.
In reality, you can always get into tail pushing mode if you
give the fs enough work to do, but for non-sustained metadata
ops, a larger log should help throughput.
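If you want to experiment, you can ask mkfs for a specific log
size up front; for example (the b suffix means filesystem blocks):
    mkfs.xfs -l size=32768b /dev/xxx
    mkfs.xfs -l logdev=/dev/logdev,size=16m /dev/xxx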
The downside of this is recovery time during mount: the larger
the log, the longer mount will take. So if you want really
fast recovery after a crash, you want a small log.
Phew, that's probably more answer than you were expecting.
Steve