| To: | linux-xfs@xxxxxxxxxxx |
|---|---|
| Subject: | Problem restoring destroyed, important XFS filesystem from tape backup using xfsrestore |
| From: | Jason Hlady <hlady@xxxxxxxxxxx> |
| Date: | Wed, 26 Jan 2005 16:36:51 -0600 |
| Sender: | linux-xfs-bounce@xxxxxxxxxxx |
Hi there,

I'm not even sure if this is the right mailing list for my problem, but I was hoping that someone here could at least point me to the right people who might help me get the data back off the tapes. I have graduate students facing the loss of months and years of work, and I would appreciate any help that could be sent my way.

I have had a catastrophic RAID failure on a 700GB XFS filesystem (OS = Mandrake 9.1 on IBM x340) and am trying to restore from tape. Backups have been running to my Ultrium LTO tape drive. All of the dumps that we have done in the past two years seem to have this same restore problem. We hadn't recently checked whether we could restore from the tapes, since we had successfully managed to dump and restore using xfsdump and xfsrestore years earlier :( .

I apologize in advance for the length/verbosity of this email. I just wanted to provide some information up front.

The version of xfsdump used to create those dumps was xfsdump-2.0.3-1mdk. The command used to generate the dump was

/sbin/xfsdump -f /dev/tape -l 0 -F -m -b 245760 -o -L label -M label2

We believe that the data was successfully dumped onto the tapes, as we have logs of the dumping process, i.e.

1.1 lode:/birl [Tape File: 1] 08:46:58 - 19:47:04
<summary> lode:/birl Level=0 TapeCart=2701 TapeFile=1 Completed=(Tue Jan 4 19:47:04 CST 2005)
<volstat>
/sbin/xfsdump: using minimum scsi tape (drive_minrmt) strategy
/sbin/xfsdump: version 3.0 - Running single-threaded
/sbin/xfsdump: WARNING: most recent level 0 dump was interrupted, but not resuming that dump since resume (-R) option not specified
/sbin/xfsdump: dump date: Tue Jan 4 08:47:00 2005
/sbin/xfsdump: session id: a8ff599e-4741-4db6-82bb-999a716ddeee
/sbin/xfsdump: session label: "2701"
/sbin/xfsdump: ino map phase 1: skipping (no subtrees specified)
/sbin/xfsdump: ino map phase 2: constructing initial dump list
/sbin/xfsdump: ino map phase 3: skipping (no pruning necessary)
/sbin/xfsdump: ino map phase 4: skipping (size estimated in phase 2)
/sbin/xfsdump: ino map phase 5: skipping (only one dump stream)
/sbin/xfsdump: ino map construction complete
/sbin/xfsdump: estimated dump size: 524121412864 bytes
/sbin/xfsdump: preparing drive
/sbin/xfsdump: WARNING: media may contain data. Overwrite option specified
/sbin/xfsdump: creating dump session media file 0 (media 0, file 0)
/sbin/xfsdump: dumping ino map
/sbin/xfsdump: dumping directories
/sbin/xfsdump: dumping non-directory files
/sbin/xfsdump: ending media file
/sbin/xfsdump: media file size 319242240 bytes
/sbin/xfsdump: creating dump session media file 1 (media 0, file 1)
/sbin/xfsdump: dumping ino map
/sbin/xfsdump: dumping directories
/sbin/xfsdump: dumping non-directory files
/sbin/xfsdump: ending media file
/sbin/xfsdump: media file size 317521920 bytes
/sbin/xfsdump: creating dump session media file 2 (media 0, file 2)
/sbin/xfsdump: dumping ino map
/sbin/xfsdump: dumping directories
/sbin/xfsdump: dumping non-directory files
/sbin/xfsdump: ending media file
/sbin/xfsdump: media file size 327106560 bytes
<SNIP of a bunch of media files>
/sbin/xfsdump: creating dump session media file 473 (media 0, file 473)
/sbin/xfsdump: dumping ino map
/sbin/xfsdump: dumping directories
/sbin/xfsdump: dumping non-directory files
/sbin/xfsdump: ending media file
/sbin/xfsdump: media file size 318996480 bytes
/sbin/xfsdump: creating dump session media file 474 (media 0, file 474)
/sbin/xfsdump: dumping ino map
/sbin/xfsdump: dumping directories
/sbin/xfsdump: dumping non-directory files
/sbin/xfsdump: tape media error on write operation
/sbin/xfsdump: no more data can be written to this tape
/sbin/xfsdump: ending media file
/sbin/xfsdump: media file size 79134720 bytes
/sbin/xfsdump: WARNING: media change decline will be treated as a request to stop using drive: can resume later
/sbin/xfsdump: dump size (non-dir files) : 129800399672 bytes
/sbin/xfsdump: NOTE: dump interrupted: 39604 seconds elapsed: may resume later using -R option
/sbin/xfsdump: Dump Status: INTERRUPT

When I try and restore from the tapes, however, using xfsrestore, I get significant errors. Specifically,

[root@snowbank]xfsrestore -i -f /dev/nst0 .
xfsrestore: using scsi tape (drive_scsitape) strategy
xfsrestore: version 3.0 - Running single-threaded
xfsrestore: searching media for dump
xfsrestore: preparing drive
xfsrestore: bad media file header at BOT indicates foreign or corrupted tape
xfsrestore: media object not useful
============================ change media dialog =============================
please change media in drive
1: media change declined (timeout in 3600 sec)
2: media changed (default)
 -> 1
media change aborted
--------------------------------- end dialog ---------------------------------

This occurs regardless of which of the several tapes from the last 8 months I try, making it quite unlikely that all of the tapes are corrupted.

This is the maximally verbose output from xfsrestore...

[root@snowbank restore]# xfsrestore -v5 -i -f /dev/nst0 .
xfsrestore: RLIMIT_AS org cur 0xffffffffffffffff max 0xffffffffffffffff
xfsrestore: RLIMIT_STACK org cur 0x800000 max 0xffffffffffffffff
xfsrestore: raising stack size soft limit from 0x800000 to 0x2000000
xfsrestore: RLIMIT_STACK new cur 0x2000000 max 0xffffffffffffffff
xfsrestore: RLIMIT_DATA org cur 0xffffffffffffffff max 0xffffffffffffffff
xfsrestore: RLIMIT_FSIZE org cur 0xffffffffffffffff max 0xffffffffffffffff
xfsrestore: RLIMIT_FSIZE now cur 0xffffffffffffffff max 0xffffffffffffffff
xfsrestore: RLIMIT_CPU cur 0xffffffffffffffff max 0xffffffffffffffff
xfsrestore: RLIMIT_CPU now cur 0xffffffffffffffff max 0xffffffffffffffff
xfsrestore: INTGENMAX == 2147483647 (0x7fffffff)
xfsrestore: UINTGENMAX == 4294967295 (0xffffffff)
xfsrestore: OFF64MAX == 9223372036854775807 (0x7fffffffffffffff)
xfsrestore: OFFMAX == -1 (0x7fffffff)
xfsrestore: SIZEMAX == 4294967295 (0xffffffff)
xfsrestore: INOMAX == 4294967295 (0xffffffff)
xfsrestore: TIMEMAX == 2147483647 (0x7fffffff)
xfsrestore: SIZE64MAX == 18446744073709551615 (0xffffffffffffffff)
xfsrestore: INO64MAX == 18446744073709551615 (0xffffffffffffffff)
xfsrestore: UINT64MAX == 18446744073709551615 (0xffffffffffffffff)
xfsrestore: INT64MAX == 9223372036854775807 (0x7fffffffffffffff)
xfsrestore: UINT32MAX == 4294967295 (0xffffffff)
xfsrestore: INT32MAX == 2147483647 (0x7fffffff)
xfsrestore: INT16MAX == 32767 (0x7fff)
xfsrestore: UINT16MAX == 65535 (0xffff)
xfsrestore: getpagesize( ) returns 4096
xfsrestore: parent pid is 3811
xfsrestore: using scsi tape (drive_scsitape) strategy
xfsrestore: tty fd: 0; terminal interrupt character: (03)
xfsrestore: version 3.0 - Running single-threaded
xfsrestore: sizeof( pers_desc_t ) == 328, pgsz == 4096, perssz == 20480
xfsrestore: restore destination path converted from . to /snowbank/users/hlady/restore
::::::::::: persistent inventory media file tree at initialization :::::::::::
session inventory unknown
...................... end persistent inventory display ......................
xfsrestore: drive op: init
xfsrestore: drive op: sync
xfsrestore: Media_create
xfsrestore: checking and validating command line dump id/label
xfsrestore: searching media for dump
xfsrestore: Media_mfile_next: purp==0 pos==0
xfsrestore: drive op: begin read
xfsrestore: preparing drive
xfsrestore: tape op: opening drive
xfsrestore: tape op: get status
xfsrestore: tape status = bot wprot onl
xfsrestore: tape op: get block size info
xfsrestore: max=1048576 cur=0
xfsrestore: variable block size tape drive at /dev/nst0
xfsrestore: tape op: get block size info
xfsrestore: max=1048576 cur=0
xfsrestore: recommended tape media file size set to 0x10000000 bytes
xfsrestore: recommended tape media mark separation set to 0x1000000 bytes
xfsrestore: determining tape record size: trying 1048576 (0x100000) bytes
xfsrestore: tape op: get status
xfsrestore: tape status = bot wprot onl
xfsrestore: tape positioned at BOT: doing redundant rewind
xfsrestore: tape op: rewind 0
xfsrestore: tape op: get status
xfsrestore: tape status = bot wprot onl
xfsrestore: tape op: reading 1048576 bytes
xfsrestore: tape op read of 1048576 bytes short: nread == 1024
xfsrestore: tape op: get status
xfsrestore: tape status = wprot onl
xfsrestore: nread > 0 and not EOD, not EOT, and not at a file mark on variable blocksize drive indicates correct blocksize found
xfsrestore: validating media file header
xfsrestore: validate_media_file_hdr
        gh_magic xFSdump0
        gh_version 33554432
        gh_checksum 4032098677
        gh_timestamp 704502593
        gh_ipaddr 14125681969463296000
        gh_hostname lode.usask.ca
        gh_dumplabel
xfsrestore: bad media file header checksum
xfsrestore: bad media file header at BOT indicates foreign or corrupted tape
xfsrestore: tape op: rewind 0
xfsrestore: tape op: get status
xfsrestore: tape status = bot wprot onl
xfsrestore: drive op: get device class
xfsrestore: media object not useful
xfsrestore: drive op: eject media
xfsrestore: tape op: closing drive
============================ change media dialog =============================
please change media in drive
1: media change declined (timeout in 3600 sec)
2: media changed (default)
 -> 1
media change aborted
--------------------------------- end dialog ---------------------------------

Note that lode.usask.ca is the name of the host.
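
One possible sanity check on the header fields printed in the verbose output: gh_magic and gh_hostname read fine as text, but gh_version (33554432, i.e. 0x02000000) and gh_timestamp look odd as plain integers. The Python sketch below simply reverses the byte order of those two 32-bit values to see whether they become more plausible. That byte order has anything to do with the checksum failure is purely an assumption here; nothing in the xfsrestore output says so, and the helper name swap32 is made up for illustration.

```python
import struct
from datetime import datetime, timezone


def swap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer (hypothetical helper)."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


# Values copied from the validate_media_file_hdr output in this email.
fields = {"gh_version": 33554432, "gh_timestamp": 704502593}

for name, raw in fields.items():
    print(f"{name}: as printed={raw} (0x{raw:08x}), byte-swapped={swap32(raw)}")

# Interpret the byte-swapped timestamp as seconds since the Unix epoch (assumption).
swapped_time = swap32(fields["gh_timestamp"])
print(datetime.fromtimestamp(swapped_time, tz=timezone.utc).isoformat())
```

With the values above, the swapped version comes out as 2 and the swapped timestamp falls in early January 2005, which is at least in the range of our recent dumps. That is only a hint that is worth raising with people who know the dump format, not a diagnosis.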

I think that -b 245760 (in my original dump parameters) is actually a completely incorrect blocksize which xfsdump was simply ignoring. If we specify that blocksize when restoring, the verbose output complains that the blocksize is wrong, which is consistent with xfsdump having ignored -b 245760. In the verbose output of "xfsrestore -i -f /dev/nst0 .", on the other hand, xfsrestore explicitly says that it has found the correct blocksize. Should we believe this (i.e. the line that says "xfsrestore: nread > 0 and not EOD, not EOT, and not at a file mark on variable blocksize drive indicates correct blocksize found" for the regular xfsrestore -i command)?

Can ANYONE offer any suggestions or assistance? Please let me know if I need to provide more/different data, or take this to a different mailing list/forum/whathaveyou.

Thank you in advance (and the graduate students thank you too!)

Jason

--------------
Jason Hlady, B. Sc., M. Sc. (Chem), Adv. Cert. (Comp. Sci.)
Programmer/Analyst (Bioinformatics/HPC Specialist)
U of Saskatchewan, Bioinformatics Research Laboratory (BIRL)
hlady@xxxxxxxxxxx (306) 966-2075