Re: The segment fault with NULL point using when recovering failure

To: Mike Gao <ygao.linux@xxxxxxxxx>
Subject: Re: The segment fault with NULL point using when recovering failure
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 29 Sep 2010 16:05:09 +1000
Cc: xfs@xxxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>
In-reply-to: <AANLkTi=mGBXYLgwL96VqtF+8eMCxu83mz0V5719YPrkH@xxxxxxxxxxxxxx>
References: <AANLkTimR-dBLmQQ-Nh0mmjHJMfFidePKxfO6P76y48n8@xxxxxxxxxxxxxx> <20100917014412.GK24409@dastard> <AANLkTi=mGBXYLgwL96VqtF+8eMCxu83mz0V5719YPrkH@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
[Mike, please don't top-post responses - it makes it really hard to
quote properly. ]

On Fri, Sep 24, 2010 at 10:53:43AM -0500, Mike Gao wrote:
> On Thu, Sep 16, 2010 at 8:44 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > On Wed, Sep 15, 2010 at 10:59:07AM -0500, Mike Gao wrote:
> > > xlog_recover_process_iunlinks(
> > >     xlog_t        *log)
> > > {
> > >                 /*
> > >                  * Reacquire the agibuffer and continue around
> > >                  * the loop. This should never fail as we know
> > >                  * the buffer was good earlier on.
> > >                  */
> > >                 error = xfs_read_agi(mp, NULL, agno, &agibp);
> > >                 ASSERT(error == 0);
> > >                 agi = XFS_BUF_TO_AGI(agibp);
> > >
> > > }
> > > If no defined DEBUG, ASSERT will not work and agibp could be ZERO if there
> > > is a error in xfs_read_agi.
> > > And the comment shouldn't say it never fail because  xfs_read_agi will
> > > return error in function and it does when there is forced shutdown.
> >
> > Have you seen a failure here?
>
> I am curious about how this happens.

What, how a failure happens there? The ASSERT is saying that a
failure should never happen there....
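
For reference, the concern in the quoted text can be shown with a
minimal userspace sketch. Everything named demo_* below is a
hypothetical stand-in, not the real XFS code; the only assumption
taken from the thread is that ASSERT compiles to nothing when DEBUG
is not defined. The explicit error check is one possible defensive
pattern, not necessarily what the recovery code should do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not the real xfs_buf. */
struct demo_buf { int data; };

#ifdef DEBUG
#define ASSERT(expr) assert(expr)
#else
#define ASSERT(expr) ((void)0)  /* assumed: compiled out without DEBUG */
#endif

/*
 * Hypothetical reader: on failure it returns an errno value and
 * leaves *bpp set to NULL, mimicking the case described above.
 */
static int demo_read_agi(int fail, struct demo_buf **bpp)
{
    static struct demo_buf good = { 42 };

    *bpp = NULL;
    if (fail)
        return EIO;
    *bpp = &good;
    return 0;
}

/*
 * Re-read the buffer. The ASSERT documents the expectation, and the
 * explicit check keeps a non-DEBUG build from dereferencing NULL.
 */
static int demo_reacquire(int fail, int *out)
{
    struct demo_buf *bp;
    int error;

    error = demo_read_agi(fail, &bp);
    ASSERT(error == 0);
    if (error)
        return error;   /* bail out before touching bp */
    *out = bp->data;
    return 0;
}
```

In a build without DEBUG, calling demo_reacquire() with a forced
failure returns the error instead of dereferencing the NULL pointer;
without the explicit check it would fault.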

> This error is caused by log recovery when mount try to fill super block. As
> super block is protected by log or Journaling, it shouldn't be corrupted or
> can't be recovery by journaling.

I'm not sure I follow you here - the above is reading an AGI, not a
superblock. What is the stack trace that you've seen?

> Anyway, this is reported as XFS INTERNAL ERROR.
> 
> XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4147 of file
> fs/xfs/xfs_bmap.c.  Caller 0x871b9c26

That is from a corrupted bmap btree, which has no connection to
the AGI. I'm confused by what problems you are trying to report.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
