Hi!
Thanks to all who have helped -- finally I have my data back!!!!
Here is a short summary of our problems and how we were able to solve
them ("/dev/sdb" was the block device of our partition).
Symptoms:
1) "mount -t xfs /dev/sdb /mnt/home"
runs for hours, working its way over the disk...
2) "xfs_repair -L /dev/sdb"
version 2.0.3:
xfs_repair: xfs_log_recover.c:159: xlog_find_verify_log_record:
Assertion `start_blk != 0 || *last_blk != start_blk' failed.
Aborted (core dumped)
version 2.0.3 with assertions disabled:
same as 1): runs for hours
Reason:
Log corruption that xfs_repair couldn't handle.
Cure:
Use xfsprogs from CVS after May 15th 2002. This version handles
log corruption more robustly.
But our troubles continued. There was another...
Symptom:
3) after an (apparently) successful run of xfs_repair (sample output
http://slime.wu-wien.ac.at/xfs/repair.003.out) the filesystem was
still corrupt; another run of xfs_repair reported the same
corruptions...
Reason:
The internal log overlaps with other filesystem data (no idea how
this could happen!):
# xfs_db -r /dev/sdb
xfs_db: sb 0
xfs_db: p
magicnum = 0x58465342
blocksize = 4096
dblocks = 2000000
rblocks = 0
rextents = 0
uuid = 2604c50c-3a11-4d67-a80c-156580239afb
logstart = 4
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 16
agblocks = 1000
agcount = 2000
rbmblocks = 0
logblocks = 512
[...]
==> logstart=4, logblocks=512, rootino=128
xfs_db: convert fsblock 4 daddr
0x20 (32)
xfs_db: convert ino 128 daddr
0x40 (64)
Thus: zeroing the log destroys the root inode. Later on, xfs_repair
restores the root inode and corrupts the log...
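To make the overlap concrete, here is a small sketch of the arithmetic
behind the two "convert" results above. It assumes that xfs_db daddr
values count 512-byte sectors; the function names are only illustrative
and not part of any XFS tool.

```python
SECTOR_SIZE = 512      # assumption: a daddr unit is one 512-byte sector
BLOCKSIZE = 4096       # blocksize from the superblock dump
LOGBLOCKS = 512        # logblocks from the superblock dump
LOG_START_DADDR = 32   # result of "convert fsblock 4 daddr"
ROOT_INO_DADDR = 64    # result of "convert ino 128 daddr"


def log_extent(start_daddr, logblocks, blocksize, sector_size=SECTOR_SIZE):
    """Return (first, end) daddr of the log; end is exclusive."""
    sectors_per_block = blocksize // sector_size
    return start_daddr, start_daddr + logblocks * sectors_per_block


def overlaps(daddr, extent):
    """True if daddr lies inside the half-open extent."""
    first, end = extent
    return first <= daddr < end


extent = log_extent(LOG_START_DADDR, LOGBLOCKS, BLOCKSIZE)
root_in_log = overlaps(ROOT_INO_DADDR, extent)
print("log occupies daddr %d..%d" % (extent[0], extent[1] - 1))
print("root inode inside log: %s" % root_in_log)
```

With the values from our superblock, the log would cover daddr 32 up to
4127, so the root inode at daddr 64 lies well inside it.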
Cure:
Export the internal log to an external device. Steve Lord explained
how to do this:
http://marc.theaimsgroup.com/?l=linux-xfs&m=99535721116479&w=2
However, I had to add a slight modification (because a recent
xfs_repair is smarter and doesn't accept a change in the main
superblock only):
1. use xfsprogs >= 2.0 to avoid problems with endianness
2. gather data from xfs_db (see above)
blocksize = 4096
logblocks = 512
agcount = 2000
3. select a partition to become the new log and zero it:
dd if=/dev/zero of=/dev/NEWLOG bs=4096 count=512
4. reset the log offset to zero (= external log) using xfs_db
I did this for all agcount superblocks (setting only the main
superblock doesn't do it, since xfs_repair compares all
superblocks and takes the values from the majority):
# cat >xfs_db.py <<'EOF'
#!/usr/bin/env python
AGCOUNT = 2000                    # agcount value from xfs_db
for i in range(AGCOUNT):          # loop through all AGs
    print("sb %d" % (i,))         # select the AG's superblock
    print("write logstart 0")     # set logstart to 0
print("quit")                     # exit xfs_db
EOF
# chmod +x xfs_db.py
# ./xfs_db.py | xfs_db -x /dev/sdb >xfs_db.out
5. run xfs_repair with the external log (that way no data on the
filesystem are destroyed)
# xfs_repair -L -l/dev/NEWLOG /dev/sdb
6. mount the filesystem
# mount -t xfs -r -o logdev=/dev/NEWLOG /dev/sdb /mnt/home
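Before running xfs_repair in step 5 it may be worth checking that every
superblock really has logstart = 0 after step 4. The following is only a
sketch: it assumes that xfs_db's "print logstart" emits lines of the form
"logstart = N" (as in the dump above), and the helper names are
hypothetical.

```python
def verify_commands(agcount):
    """Build xfs_db input that prints logstart for every AG superblock."""
    lines = []
    for ag in range(agcount):
        lines.append("sb %d" % ag)        # select the AG's superblock
        lines.append("print logstart")    # show its logstart field
    lines.append("quit")
    return lines


def nonzero_logstarts(xfs_db_output):
    """Return every logstart value in the output that is not 0.

    Assumes lines of the form 'logstart = N'; other lines are ignored.
    """
    bad = []
    for line in xfs_db_output.splitlines():
        if "logstart =" in line:
            value = int(line.split("=", 1)[1].strip())
            if value != 0:
                bad.append(value)
    return bad


if __name__ == "__main__":
    AGCOUNT = 2000                        # agcount value from xfs_db
    print("\n".join(verify_commands(AGCOUNT)))
```

Saved under a name such as check_sb.py (hypothetical), the output could
be piped into "xfs_db -r /dev/sdb" and the result passed to
nonzero_logstarts(); an empty list would mean all superblocks agree.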
With this procedure I was able to rescue all my data. Thanks again for your help
and for bringing this *great* filesystem to us!
\wlang{}
--
Willi.Langenberger@xxxxxxxxxxxxx Fax: +43/1/31336/702
Zentrum fuer Informatikdienste, Wirtschaftsuniversitaet Wien, Austria