On Nov 28, 10:25am, Thomas Graichen wrote:
> Subject: Re: alpha again
> "Nathan Scott" <nathans@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > heh - thats completely bogus. so the problem is in the kernel
> > (xfs mount/umount code paths) after all.
> > my next best guess at the probable cause is that this may
> > be a blocksize related problem. we know that the primary
> > superblock is pretty much intact (otherwise xfs_db would have
> > gone haywire) - but since its offset is at start of blk 0,
> > we're always likely to get that right no matter what the page
> > & blksizes are, I think.
> so looks like the umount code trashes things - this would also make
> clear why xfs survives the dbench 64 - the filesystem seems to be
> stable while operating and only gets trashed on umount ...
ok, i've read through the umount code and have a theory.
(debugging by proxy is fun!) ;-)
is there any chance that the device block size is being
set back to 1024 at the end of the umount? i.e. at the
end of linvfs_put_super(), is the set_blocksize() call
being passed 1024? (throw a printk in there)
if so, is there a chance we are still doing IO at the end
of linvfs_put_super() -(Russell?)- in particular, is there
any chance we could still be writing out the superblock
after we've called set_blocksize() on the device?
i think this would produce the behavior you're seeing here
- if the underlying device blocksize was 1024 and we wrote
out the (512 byte) superblock thinking the blocksize was
512, the write would cover a full 1024 byte block, so we'd
end up putting random junk in the AGF, since that's the next
512 bytes right after the superblock.
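
to make the theory concrete, here is a small userspace sketch (not
kernel code, and not the actual XFS write path). it models a tiny
"disk" with the superblock in bytes 0-511 and the AGF in bytes
512-1023, then writes block 0 using whatever blocksize the device
currently believes in. the names write_block() and agf_survives(),
the fill bytes, and the 0xEE "junk" pattern are all made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512    /* size of the superblock and of the AGF */
#define DISK_BYTES  4096   /* tiny pretend disk, assumption for the demo */

/* Hypothetical stand-in for a block write: copies exactly one
 * device block (blocksize bytes) from buf to block number blkno. */
static void write_block(uint8_t *disk, size_t blocksize,
                        size_t blkno, const uint8_t *buf)
{
    memcpy(disk + blkno * blocksize, buf, blocksize);
}

/* Returns 1 if the AGF sector is untouched after the superblock
 * is written back with the given device blocksize, 0 otherwise. */
int agf_survives(size_t dev_blocksize)
{
    uint8_t disk[DISK_BYTES];
    uint8_t buf[2 * SECTOR_SIZE];
    size_t i;

    assert(dev_blocksize <= sizeof buf);

    /* On-disk layout: superblock in sector 0, AGF in sector 1. */
    memset(disk, 0, sizeof disk);
    memset(disk, 'S', SECTOR_SIZE);
    memset(disk + SECTOR_SIZE, 'A', SECTOR_SIZE);

    /* The buffer holds 512 bytes of valid superblock; anything
     * beyond that is junk (0xEE is an arbitrary marker). */
    memset(buf, 0xEE, sizeof buf);
    memset(buf, 's', SECTOR_SIZE);

    /* The write is sized by the device blocksize, not by the
     * amount of valid data in the buffer. */
    write_block(disk, dev_blocksize, 0, buf);

    for (i = SECTOR_SIZE; i < 2 * SECTOR_SIZE; i++)
        if (disk[i] != 'A')
            return 0;
    return 1;
}
```

in this model agf_survives(512) returns 1 and agf_survives(1024)
returns 0: the 1024 byte write drags 512 bytes of junk over the AGF,
which would be consistent with repair finding a bad agf 0 after
umount.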
if the blocksize does prove to be reset to something other
than 512, Thomas, could you try commenting out everything
between "/* Reset device block size */" and the end of the
function (linvfs_put_super) - 3 or 4 lines - and see if you
still see repair needing to fix the AGF after umount?
>> root@cyan:/usr/src/xfs/linux# xfs_repair /dev/sdb1
>> Phase 1 - find and verify superblock...
>> Phase 2 - using internal log
>> - zero log...
>> - scan filesystem freespace and inode maps...
>> bad magic # 0x0 for agf 0
>> bad version # -1 for agf 0
>> bad length 0 for agf 0, should be 4142
>> flfirst -2147483648 in agf 0 too large (max = 128)
>> reset bad agf for ag 0
>> freeblk count 1 != flcount 1084270339 in ag 0
>> bad agbno 2966461184 for btbno root, agno 0
>> bad agbno 16580607 for btbcnt root, agno 0
>> - found root inode chunk
>> Phase 3 - for each AG...
>> - scan and clear agi unlinked lists...
>> - process known inodes and perform inode discovery...
>> - agno = 0
>> - agno = 1
>> - agno = 2
>> - agno = 3