ADD 800297 - repair doesn't recover if primary sb is trashed

To: nathans@xxxxxxxxxxxx
Subject: ADD 800297 - repair doesn't recover if primary sb is trashed
From: pv@xxxxxxxxxxxxx (nathans@xxxxxxxxxxxx)
Date: Mon, 28 Aug 2000 17:14:00 -0700 (PDT)
Cc: linux-xfs@xxxxxxxxxxx
Reply-to: sgi.bugs.xfs@xxxxxxxxxxxxxxxxx
Sender: owner-linux-xfs@xxxxxxxxxxx
Webexec: webpvupdate,pvincident
Webpv: wobbly.melbourne.sgi.com
View Incident: 
http://co-op.engr.sgi.com/BugWorks/code/bwxquery.cgi?search=Search&wlong=1&view_type=Bug&wi=800297

 Status : open                         Priority : 3                         
 Assigned Engineer : nathans           Submitter : nathans                  
*Modified User : nathans              *Modified User Domain : engr          
*Description :
In writing some verification tests for xfs_repair, I've found that
a corrupted primary superblock is not currently recoverable on
Linux.

e.g.
sim/mkfs/mkfs_xfs /dev/foo
stress/src/devzero -b 1 -n 1 /dev/foo
sim/repair/xfs_repair /dev/foo

Phase 1 - find and verify superblock...

.....


==========================
ADDITIONAL INFORMATION (ADD)
From: nathans@engr (BugWorks)
Date: Aug 28 2000 05:14:00PM
==========================

OK, I'm close to having this sorted out, just need some input from
the gurus...

The situation at the moment is:
- libsim mkfs writes bad secondary superblocks
- libxfs mkfs writes good secondary superblocks
(as to why? - I don't know - I can only guess that the bflush
at the end of the old mkfs finds the buffers marked dirty but
not endian converted, and flushes them out, overwriting the
good stuff ... seems very odd though).

- repair does have an endian issue here after all... with a fix,
I get a nicely recovered fs with xfs_repair output like this...

(in gdb...)
run /dev/hda8
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
...found candidate secondary superblock...
verified secondary superblock...
writing modified primary superblock

Breakpoint 1, write_primary_sb (sbp=0x40209080, size=512) at sb.c:481
481             if (no_modify)
(gdb) p *sbp
$1 = {sb_magicnum = 1481003842, sb_blocksize = 4096, sb_dblocks = 38146, 
  sb_rblocks = 0, sb_rextents = 0, sb_uuid = {
    __u_bits = "x]o·´\205AÄ\231âK]ùòÀw"}, sb_logstart = 32772, 
  sb_rootino = 18446744073709551615, sb_rbmino = 18446744073709551615, 
  sb_rsumino = 18446744073709551615, sb_rextsize = 16, sb_agblocks = 4769, 
  sb_agcount = 8, sb_rbmblocks = 0, sb_logblocks = 1200, 
  sb_versionnum = 8324, sb_sectsize = 512, sb_inodesize = 256, 
  sb_inopblock = 16, sb_fname = "\000\000\000\000\000", 
  sb_fpack = "\000\000\000\000\000", sb_blocklog = 12 '\f', 
  sb_sectlog = 9 '\t', sb_inodelog = 8 '\b', sb_inopblog = 4 '\004', 
  sb_agblklog = 13 '\r', sb_rextslog = 0 '\000', sb_inprogress = 0 '\000', 
  sb_imax_pct = 25 '\031', sb_icount = 0, sb_ifree = 0, sb_fdblocks = 36914, 
  sb_frextents = 0, sb_uquotino = 0, sb_pquotino = 0, sb_qflags = 0, 
  sb_flags = 0 '\000', sb_shared_vn = 0 '\000', sb_inoalignmt = 2, 
  sb_unit = 0, sb_width = 0, sb_dirblklog = 0 '\000', 
  sb_dummy = "\000\000\000\000\000\000"}
(gdb) c
Continuing.
sb root inode value 18446744073709551615 inconsistent with calculated value 137438953600
resetting superblock root inode pointer to 137438953600
sb realtime bitmap inode 18446744073709551615 inconsistent with calculated value 137438953601
resetting superblock realtime bitmap ino pointer to 137438953601
sb realtime summary inode 18446744073709551615 inconsistent with calculated value 137438953602
resetting superblock realtime summary ino pointer to 137438953602
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ... 
        - traversal finished ... 
        - traversing all unattached subtrees ... 
        - traversals finished ... 
        - moving disconnected inodes to lost+found ... 
Phase 7 - verify and correct link counts...
Note - stripe unit (0) and width (0) fields have been reset.
Please set with mount -o sunit=<value>,swidth=<value>
done
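
To illustrate the kind of endian problem described above: XFS stores its
superblock big-endian on disk, and sb_magicnum in the gdb dump (1481003842)
is 0x58465342, i.e. the bytes "XFSB". The following is a minimal sketch, not
the actual repair fix; get_be32, swab32 and sb_magic_ok are hypothetical
helper names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* "XFSB"; equals 1481003842 decimal, the sb_magicnum shown in the gdb dump */
#define XFS_SB_MAGIC 0x58465342u

/* Read a 32-bit big-endian value from raw on-disk bytes.
 * Works the same on little- and big-endian hosts. */
static uint32_t get_be32(const unsigned char *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Byte-swap a 32-bit value: this is what a little-endian host ends up
 * with if the on-disk field is loaded without conversion. */
static uint32_t swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical check: does this buffer start with a valid XFS magic? */
static int sb_magic_ok(const unsigned char *disk_sb)
{
    return get_be32(disk_sb) == XFS_SB_MAGIC;
}
```

A superblock written or read without the conversion would show the swapped
value 0x42534658 on a little-endian host and fail the magic check, which
would be consistent with the "bad magic number" symptom.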


So, my question is - I know there's code in mkfs to go through and
sprinkle the known-good root inode into some AGs (looks like we use
the last AG and one in the middle - below the comment "write out
multiple copies of superblocks with the rootinode field set" in mkfs).
At this point we know what the root inode (+rt inodes) are, and we
have all the AGs setup, so why do we not write these inode numbers
in _all_ of the AG superblocks rather than just a couple? (would it
be worthwhile changing mkfs to do this?)
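
For reference, a rough sketch of what "all of the AG superblocks" means in
terms of disk locations. Each AG begins with its own superblock copy, so
the copy for AG n sits at byte offset n * sb_agblocks * sb_blocksize. The
struct and function names below are hypothetical and this is not the mkfs
code, just the arithmetic using the field names from the gdb dump above.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry fields, named after the superblock fields in the gdb dump. */
struct sb_geom {
    uint32_t blocksize;  /* sb_blocksize */
    uint32_t agblocks;   /* sb_agblocks  */
    uint32_t agcount;    /* sb_agcount   */
};

/* Byte offset of the superblock copy at the start of AG 'agno'. */
static uint64_t ag_sb_offset(const struct sb_geom *g, uint32_t agno)
{
    assert(agno < g->agcount);
    return (uint64_t)agno * g->agblocks * g->blocksize;
}

/* Hypothetical helper: fill 'out' with the superblock offset of every AG
 * (up to 'max' entries) and return how many were written. A mkfs that
 * updated every copy would visit each of these offsets. */
static uint32_t all_sb_offsets(const struct sb_geom *g, uint64_t *out,
                               uint32_t max)
{
    uint32_t i;

    for (i = 0; i < g->agcount && i < max; i++)
        out[i] = ag_sb_offset(g, i);
    return i;
}
```

With the geometry from the dump (blocksize 4096, agblocks 4769, agcount 8)
the secondary copies start at multiples of 19533824 bytes.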

Looks like repair doesn't find the good one at the moment, or
doesn't keep looking for long enough (I suspect it's picked the SB
in AG 1), so we get those "resetting" messages at the end of phase 1.
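
One possible approach, sketched under assumptions: when several secondary
superblocks verify, prefer one whose inode pointers are actually set. The
value 18446744073709551615 in the dump is NULLFSINO, i.e. (xfs_ino_t)-1.
The struct sb_candidate and pick_candidate below are hypothetical and do
not reflect how repair currently chooses a secondary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 18446744073709551615: the "unset" inode number seen in the gdb dump */
#define NULLFSINO UINT64_MAX

/* Hypothetical summary of an already-verified secondary superblock. */
struct sb_candidate {
    uint32_t agno;
    uint64_t rootino;   /* sb_rootino */
    uint64_t rbmino;    /* sb_rbmino  */
    uint64_t rsumino;   /* sb_rsumino */
};

/* True if the root and realtime inode numbers are all filled in. */
static int has_inode_pointers(const struct sb_candidate *c)
{
    return c->rootino != NULLFSINO &&
           c->rbmino  != NULLFSINO &&
           c->rsumino != NULLFSINO;
}

/* Return the index of the first candidate with inode pointers set.
 * Fall back to index 0 (the first verified candidate) if none has them,
 * or -1 if there are no candidates at all. */
static int pick_candidate(const struct sb_candidate *c, size_t n)
{
    size_t i;

    if (n == 0)
        return -1;
    for (i = 0; i < n; i++)
        if (has_inode_pointers(&c[i]))
            return (int)i;
    return 0;
}
```

With a selection like this, the fallback case is the one that would still
produce the "resetting" messages, since repair would have to recalculate
the inode numbers itself.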

many thanks.
