
Re: xfs_repair dumps core on damaged filesystem (was: Re: XFS assertion

To: Peter.Kelemen@xxxxxxx
Subject: Re: xfs_repair dumps core on damaged filesystem (was: Re: XFS assertion failed: vp->v_bh.bh_first != NULL)
From: "Nathan Scott" <nathans@xxxxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 7 Sep 2000 10:36:59 -0400
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: Steve Lord <lord@sgi.com> "Re: xfs_repair dumps core on damaged filesystem (was: Re: XFS assertion failed: vp->v_bh.bh_first != NULL)" (Sep 6, 5:29pm)
References: <200009062229.RAA25306@jen.americas.sgi.com>
Sender: owner-linux-xfs@xxxxxxxxxxx
hi,

On Sep 6,  5:29pm, Steve Lord wrote:
> Subject: Re: xfs_repair dumps core on damaged filesystem
> ...
> I would like to be able to duplicate this crash, or at least establish
> if it is a setup which we know has problems.
> ...
> The repair problem looks related to an already reported problem.
> 

I think this is actually subtly different, although from the
stack trace they do look very similar.  That other bug (800752)
seems to have been fixed in recent checkins - the QA test which
previously tripped it every time no longer does.

> > 
> > Sep  6 19:17:56 pcrd18 kernel: XFS assertion failed:xfs_bmbt_get_startoff(r1) + xfs_bmbt_get_blockcount(r1) <= xfs_bmbt_get_startoff(r2), file: xfs_btree.c, line: 300

This is interesting - xfs_repair falls over in a place which
also manipulates these same data structures, so I suspect
repair may be making some assumptions about the ondisk data
here...
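
For readers following along, the condition in that assertion says that
each extent record must end at or before the start of the next one.
A minimal sketch of the same check, using a hypothetical simplified
struct (not the real packed on-disk bmbt record) and a made-up helper
name, might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record: not the real on-disk layout. */
struct extent {
    uint64_t startoff;    /* first file block covered by the extent */
    uint64_t blockcount;  /* number of blocks in the extent */
};

/*
 * Return 1 if startoff(r1) + blockcount(r1) <= startoff(r2) holds for
 * every adjacent pair of records, 0 otherwise.
 * Arithmetic overflow is not handled in this sketch.
 */
int extents_in_order(const struct extent *recs, size_t nrecs)
{
    size_t i;

    assert(recs != NULL || nrecs == 0);
    for (i = 1; i < nrecs; i++) {
        if (recs[i - 1].startoff + recs[i - 1].blockcount > recs[i].startoff)
            return 0;   /* r1 overlaps, or sorts after, r2 */
    }
    return 1;
}
```

Two records such as {startoff 0, blockcount 8} followed by
{startoff 4, blockcount 2} would fail this check, which is the kind of
ondisk state the kernel assertion is complaining about.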

> > #0  0x808e60e in scanfunc_bmap (ablock=0x819a280, level=1, type=5, whichfork=0, bno=3669497, ino=148, tot=0xbffff7f8,
> >     nex=0xbffff7cc, blkmapp=0xbffff7a8, bm_cursor=0xbffff584, isroot=1, check_dups=0, dirty=0xbffff4e8) at scan.c:457
> > 457                     bm_cursor->level[level].last_key =
> > (gdb) bt
> > #0  0x808e60e in scanfunc_bmap (ablock=0x819a280, level=1, type=5, whichfork=0, bno=3669497, ino=148, tot=0xbffff7f8,
> >     nex=0xbffff7cc, blkmapp=0xbffff7a8, bm_cursor=0xbffff584, isroot=1, check_dups=0, dirty=0xbffff4e8) at scan.c:457

        /*
         * update cursor keys to reflect this block
         */
        if (check_dups == 0)  {
                bm_cursor->level[level].first_key =
                                INT_GET(pkey[0].br_startoff, ARCH_CONVERT);
                bm_cursor->level[level].last_key =
                                INT_GET(pkey[block->bb_numrecs-1].br_startoff, ARCH_CONVERT);
        }

hmm - assuming my source matches Peter's, it's the second assignment
(scan.c, line 457) there which is unhealthy.
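
The cause is not established at this point, but one way an assignment
of that shape can fault is if the record count read from disk is zero
or larger than the block can hold, so that the last-key index lands
outside the key array.  Purely as an illustration of that assumption,
here is a sketch of a guarded version; the struct, the function name
and the maxrecs parameter are all hypothetical, not xfs_repair code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified key: stands in for the on-disk bmbt key. */
struct key {
    uint64_t startoff;
};

/*
 * Copy the first and last key offsets out of a block, but only when the
 * record count is plausible.  maxrecs is the most keys that fit in the
 * block (an assumed, caller-supplied limit).
 * Returns 0 on success, -1 if numrecs is 0 or larger than maxrecs.
 */
int get_key_range(const struct key *keys, unsigned int numrecs,
                  unsigned int maxrecs, uint64_t *first, uint64_t *last)
{
    assert(keys != NULL && first != NULL && last != NULL);
    if (numrecs == 0 || numrecs > maxrecs)
        return -1;      /* keys[numrecs - 1] would be out of bounds */
    *first = keys[0].startoff;
    *last = keys[numrecs - 1].startoff;
    return 0;
}
```

A caller would treat the -1 return as "this block is corrupt" and skip
updating the cursor rather than dereferencing a bad index.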

Peter, any chance you could tar+gzip the core file & repair binary
and mail them to me?  -- taa.

Also, if you still have the bad filesystem handy, could you use
xfs_db and dump out everything you can about inode 148 and send
that to me too?  That would be something like:

# xfs_db -r /dev/hdc1
xfs_db: inode 148
xfs_db: print
xfs_db: addr u.bmbt.ptrs[1]
xfs_db: print
... to start with.

Finally, "xfs_repair -n /dev/hdc1" output might help me too.

(that's a lot of stuff, could you send it just to me & not the
list...)


many thanks.

-- 
Nathan
