Michael Nishimoto wrote:
I've just finished analyzing an xfs filesystem which won't recover.
An inconsistent log record has 332 log operations but the num_logop field
in the record header says 333 log operations. The result is that xfs recovery
complains with "bad clientid" because recovery eventually attempts to decode
garbage.
The log record really has 332 log ops (I counted!).
Looking through xlog_write(), I don't see any way that record_cnt can be bumped
without also writing out a log operation.
Does this issue ring a bell with anyone?
Michael
I had a bit of a look at bugs other than the snapshot one, but found
nothing really helpful.
I've seen a few "bad clientid" reports, but that, as you say, just reflects
that at some point we have crap in the log op header, which we
notice when doing recovery.
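
To illustrate why an op count that is one too high ends up as "bad clientid", here is a minimal sketch in Python rather than the real recovery code. The 12-byte op header layout (tid, len, clientid, flags, reserved), the set of valid client ids, and the helper names count_log_ops() and make_op() are assumptions for illustration only; check them against the actual on-disk definitions before relying on them.

```python
import struct

# Assumed big-endian op header layout:
# tid (u32), len (u32), clientid (u8), flags (u8), reserved (u16)
OP_HDR = struct.Struct(">IIBBH")

# Assumed valid client ids (transaction, volume, log)
VALID_CLIENTS = {0x69, 0x02, 0xAA}


def count_log_ops(body, claimed_ops):
    """Walk claimed_ops op headers; return (ops_decoded, error_or_None)."""
    off = 0
    for i in range(claimed_ops):
        if off + OP_HDR.size > len(body):
            return i, "ran off end of record"
        tid, length, clientid, flags, _ = OP_HDR.unpack_from(body, off)
        if clientid not in VALID_CLIENTS:
            return i, "bad clientid 0x%x" % clientid
        # Skip this header and its payload to reach the next op header
        off += OP_HDR.size + length
    return claimed_ops, None


def make_op(tid, payload, clientid=0x69, flags=0):
    """Build one synthetic log op (hypothetical helper for the demo)."""
    return OP_HDR.pack(tid, len(payload), clientid, flags, 0) + payload


# Three real ops followed by stale bytes, as might sit after the last op
body = b"".join(make_op(0x1234, b"\x00" * 8) for _ in range(3))
body += b"\xde\xad\xbe\xef" * 8

print(count_log_ops(body, 3))  # header count matches the real op count
print(count_log_ops(body, 4))  # header count is one too high
```

With the correct count the walk ends cleanly; with the count one too high, the extra iteration interprets whatever follows the last real op as an op header, and the clientid check is simply the first sanity test that happens to fail.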
I had one (pv#945899) where it seemed to have got the head of the log wrong:
using "xfs_logprint -d" you could see that the head didn't match the point
where the cycle numbers change.
Yours appears different.
I also had another one (pv#971596), but I didn't narrow it down to the
wrong number of log ops; maybe I wasn't looking carefully enough at the time.
Okay, for that one there were 2 bugs in one: one for bad clientid and
one for bad transaction. For the bad transaction,
there was something like a 2nd start op without an intervening commit op
for the tid. I moved on to something else before getting any further.
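
The "2nd start op without an intervening commit op" condition can be checked mechanically once the (tid, flags) pairs have been pulled out of a log dump. The following is a rough sketch only: the flag values and the function name find_unpaired_starts() are assumptions for illustration, not taken from the recovery code.

```python
# Assumed flag bits for start and commit ops
XLOG_START_TRANS = 0x01
XLOG_COMMIT_TRANS = 0x02


def find_unpaired_starts(ops):
    """ops: iterable of (tid, flags) in log order.

    Return a list of (index, tid) for every start op that arrives
    while the same tid is still open (no commit seen since its
    previous start).
    """
    open_tids = set()
    problems = []
    for i, (tid, flags) in enumerate(ops):
        if flags & XLOG_START_TRANS:
            if tid in open_tids:
                problems.append((i, tid))
            open_tids.add(tid)
        if flags & XLOG_COMMIT_TRANS:
            open_tids.discard(tid)
    return problems


# tid 7 is started twice before it is committed
bad = [(7, XLOG_START_TRANS), (7, 0), (7, XLOG_START_TRANS),
       (7, XLOG_COMMIT_TRANS)]
# tid 7 is started, committed, then legitimately started again
good = [(7, XLOG_START_TRANS), (7, XLOG_COMMIT_TRANS),
        (7, XLOG_START_TRANS), (7, XLOG_COMMIT_TRANS)]

print(find_unpaired_starts(bad))
print(find_unpaired_starts(good))
```

This only flags the symptom; it says nothing about whether the duplicate start was really written or is itself the result of decoding from the wrong offset.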
--Tim