hi,
Just thought I'd send a note regarding the current state of
mkfs and repair, in light of recent changes & problems with
those changes, so that everyone knows where things stand...
- mkfs and repair both link with a static library which
knows a lot about the internals of XFS, in particular
how to manipulate most of the on-disk data structures in
a manner very similar to the kernel XFS code (much of
the code is identical to what is in the kernel);
- this is either libsim (previously) or libxfs (currently)
- libxfs is very new (only days old) and has had some
teething problems with both mkfs and repair...
o the first major problem was that the device zeroing code
was buggy - this was corrected over the weekend. This
code is used by both mkfs and repair (phase 2) to
initialize the log to a known state (all zeros);
o the bug caused only parts of the log to be zeroed,
such that the head was OK initially, but after passing
a few records through the log, we'd eventually read
a non-zeroed part at a bad time (e.g. mount);
o this became more confusing because the same buggy zeroing
code could be called from repair to partly initialize
the log once more... seemingly "fixing" mkfs' mistake,
but not really (it reset the log the same way mkfs did);
o *these log zeroing problems seem to be resolved now*
o the second problem is only evident in repair, since it
uses a lot more of the library functionality than mkfs;
o this problem is going to be more difficult to correct
- it seems I've made a bad "optimisation" in the new
code which now does not play well with the transaction
mechanism in certain situations;
o I will have to pull in a lot more transaction code in
order to resolve this one, and it's going to take a bit of
time (several days I imagine, & I'll clearly need to do
a bunch more testing too);
o as a result of this, I have switched off the code which
performs any writes in xfs_repair, so that filesystems
will not be further damaged by using it in the interim.
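
For anyone curious what the log zeroing issue looks like in
practice, here is a minimal sketch of zeroing a byte range in
chunks, plus a read-back check. This is NOT the libxfs code:
zero_range(), range_is_zero() and ZERO_CHUNK are hypothetical
names made up for illustration, and the original bug may well
have been something different. It only shows the general shape
of the task, where advancing by the wrong amount or stopping
early leaves part of the range non-zeroed.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread/pwrite under strict compilers */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define ZERO_CHUNK 65536    /* illustrative buffer size, not from libxfs */

/*
 * Zero `length` bytes of `fd` starting at byte offset `start`.
 * Returns 0 on success, -1 on error with errno set.
 * Hypothetical helper, not the real libxfs routine.
 */
int zero_range(int fd, off_t start, off_t length)
{
    static const char zeros[ZERO_CHUNK];    /* static storage is zero-filled */
    off_t done = 0;

    if (start < 0 || length < 0) {
        errno = EINVAL;
        return -1;
    }
    while (done < length) {
        off_t left = length - done;
        size_t want = left < (off_t)ZERO_CHUNK ? (size_t)left : ZERO_CHUNK;
        ssize_t n = pwrite(fd, zeros, want, start + done);

        if (n < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted, retry this chunk */
            return -1;
        }
        if (n == 0) {
            errno = EIO;                    /* no progress, give up */
            return -1;
        }
        done += n;      /* advance by what was actually written */
    }
    return 0;
}

/*
 * Read the range back and check it.
 * Returns 1 if every byte is zero, 0 if a non-zero byte is found,
 * -1 on a read error or if the range extends past end of file.
 */
int range_is_zero(int fd, off_t start, off_t length)
{
    char buf[4096];
    off_t done = 0;

    while (done < length) {
        off_t left = length - done;
        size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
        ssize_t n = pread(fd, buf, want, start + done);
        ssize_t i;

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;                      /* hit EOF before the range ended */
        for (i = 0; i < n; i++)
            if (buf[i] != 0)
                return 0;
        done += n;
    }
    return 1;
}
```

A read-back check like range_is_zero() over the whole range
right after zeroing is the sort of test that would catch a
partly-zeroed log straight away, rather than at mount time.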
So, current state is: there are no known issues with mkfs,
and I'm working on fixing the write code in repair.
Note that it should still be possible to use the known-working
libsim-based mkfs/repair using the instructions I sent out
earlier (I have no idea what caused the core dump you saw,
Thomas - the sim code hasn't changed for a long time now & it
works fine for me... very strange).
cheers for now.
--
Nathan