
Re: xfsrestore fails: assertion failure in do_next_mark

To: unruh@xxxxxxx
Subject: Re: xfsrestore fails: assertion failure in do_next_mark
From: Timothy Shimmin <tes@xxxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 22 Feb 2002 16:51:15 +1100
Cc: Simon Matter <simon.matter@xxxxxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx, hlady@xxxxxxxxxxx
In-reply-to: <XFMail.20020131195802.unruh@acm.org>; from unruh@acm.org on Thu, Jan 31, 2002 at 07:58:02PM +0100
References: <3C58FAB9.D940E9B8@ch.sauter-bc.com> <XFMail.20020131195802.unruh@acm.org>
Sender: owner-linux-xfs@xxxxxxxxxxx
Hi guys,

On Thu, Jan 31, 2002 at 07:58:02PM +0100, unruh@xxxxxxx wrote:
> On 31-Jan-2002 Simon Matter wrote:
> 
> > IIRC I got the same error some months ago when trying to restore from a
> > DDS2 DAT drive. On the same system restoring from DLT was okay. I didn't
> > try again but I guess this bug is still not fixed :-(
> 
> After some reading of the mailing list archives, I added the following at 
> line 1481 in drive_scsitape.c (version 1.5):
> 
>    rechdrp->first_mark_offset =
>          INT_GET(rechdrp->first_mark_offset,ARCH_CONVERT);
> 
> 
> The patched xfsrestore can read the damaged backup, 
> but will probably be unable
> to read backups from newer versions of xfsdump. 
> 
> 
Thanks for the suggestion, Daniel.

Problem
-------
Yes, this problem was fixed in xfsdump:
    xfsdump-1.1.10 (10 December 2001)
            - fix xfsdump to endian convert all of the record header
              fields properly just prior to writing the header out
              (in particular first_mark_offset).
              This caused do_next_mark() assertion failures at some
              sites.

The problem was that the dump record headers were endian converted
too early. The "first_mark_offset" field was then overwritten after
the conversion, before the header was written out to tape.
Since xfsdump-1.1.10, the endian conversion is done just prior to
the header being written out.
So in old dumps, "first_mark_offset" is not endian converted on tape.
On restore it _will_ be endian converted, which will stuff it up.
(I haven't looked into why this problem didn't occur for everyone.)
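
To make the ordering problem concrete, here is a minimal sketch in C.
It is not the xfsdump source: the struct, the function names and the
64-bit width of the field are assumptions for illustration, and the
endian conversion is modelled as a plain byte swap (which is what it
amounts to on a little-endian host).

```c
#include <stdint.h>

/* Hypothetical stand-in for the tape record header; only one field matters here. */
struct rec_hdr_sketch {
    int64_t first_mark_offset;
};

/* Unconditional 64-bit byte swap, standing in for the endian conversion. */
int64_t swap64(int64_t v)
{
    uint64_t u = (uint64_t)v;
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (u & 0xff);
        u >>= 8;
    }
    return (int64_t)r;
}

/* Old order: convert first, then overwrite the field with an unconverted value. */
struct rec_hdr_sketch dump_old_order(int64_t mark)
{
    struct rec_hdr_sketch h = { -1 };
    h.first_mark_offset = swap64(h.first_mark_offset); /* converted too early */
    h.first_mark_offset = mark;                        /* conversion is lost  */
    return h;                                          /* what lands on tape  */
}

/* New order: set the field, then convert just before writing out. */
struct rec_hdr_sketch dump_new_order(int64_t mark)
{
    struct rec_hdr_sketch h = { -1 };
    h.first_mark_offset = mark;
    h.first_mark_offset = swap64(h.first_mark_offset);
    return h;
}

/* Restore always converts the field it reads back from tape. */
int64_t restore_read(struct rec_hdr_sketch on_tape)
{
    return swap64(on_tape.first_mark_offset);
}
```

With a mark offset of, say, 4096, restore_read(dump_new_order(4096))
gives back 4096, while restore_read(dump_old_order(4096)) gives back
a byte-swapped value that no longer matches the real offset.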

Restoring old bad dumps
-----------------------
So people who have trouble restoring old dumps because of the
first_mark_offset assertion failure
need to hack xfsrestore to NOT endian convert
first_mark_offset.
(Daniel's method would, I think, endian convert twice,
once in getrec() and then again after that,
and so achieve the same result.)

In drive_scsitape.c, tape records are read in by this call sequence...
getrec()->read_record()->record_hdr_validate()->xlate_rec_hdr()

So as an alternative to Daniel's suggestion,
one could modify xlate_rec_hdr() in arch_xlate.c
and take out the first_mark_offset translation:
  IXLATE(rh1, rh2, first_mark_offset); 
One could test the change by
running xfsrestore with -v4 and looking for messages of the form
"xlate_rec_hdr: pre-xlate\n" and "xlate_rec_hdr: post-xlate\n",
checking that what is coming from the tape looks sane and
that this field does not change after translation.
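
As a rough illustration of the effect of that change (again a sketch,
not the real arch_xlate.c: the cut-down struct, the placeholder field
and all function names here are made up), the translation step would
still swap the other fields but pass first_mark_offset through as is:

```c
#include <stdint.h>

/* Hypothetical, cut-down record header; the real one has more fields. */
struct hdr_sketch {
    uint64_t some_other_field;  /* placeholder for any field that is still translated */
    int64_t  first_mark_offset;
};

/* Plain 64-bit byte swap, standing in for the per-field translation. */
uint64_t bswap_u64(uint64_t u)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (u & 0xff);
        u >>= 8;
    }
    return r;
}

/* Translation with the first_mark_offset swap taken out. */
void xlate_hdr_sketch(const struct hdr_sketch *rh1, struct hdr_sketch *rh2)
{
    rh2->some_other_field  = bswap_u64(rh1->some_other_field);
    rh2->first_mark_offset = rh1->first_mark_offset; /* copied, not swapped */
}

/* The check to make by eye in the -v4 output: same value pre and post. */
int first_mark_unchanged(const struct hdr_sketch *pre,
                         const struct hdr_sketch *post)
{
    return pre->first_mark_offset == post->first_mark_offset;
}
```

The last function is only the programmatic version of comparing the
pre-xlate and post-xlate messages for this one field.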

(Note: the initial conversion of first_mark_offset in the
 old xfsdump was done on first_mark_offset's initial value
 of -1. If first_mark_offset was never set after that,
 it would still have a value of -1,
 and -1 byte-swapped is -1.)
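
The last point in the note is easy to check: -1 in two's complement
has every byte equal to 0xff, so reversing the byte order changes
nothing. A small sketch (the helper name and the 64-bit width are
assumptions, not taken from the xfsdump source):

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 64-bit value via its in-memory bytes. */
int64_t reverse_bytes64(int64_t v)
{
    unsigned char b[sizeof v];
    memcpy(b, &v, sizeof v);
    for (size_t i = 0; i < sizeof v / 2; i++) {
        unsigned char t = b[i];
        b[i] = b[sizeof v - 1 - i];
        b[sizeof v - 1 - i] = t;
    }
    memcpy(&v, b, sizeof v);
    return v;
}

/* A value survives conversion untouched only if swapping leaves it as is. */
int is_swap_invariant(int64_t v)
{
    return reverse_bytes64(v) == v;
}
```

Here is_swap_invariant(-1) and is_swap_invariant(0) are true, while
something like is_swap_invariant(4096) is false.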
 
------------------------------

Jason - let me know how you go.

--Tim

