xfs
[Top] [All Lists]

Re: [PATCH 5/9] repair: detect CRC errors in AG headers

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: [PATCH 5/9] repair: detect CRC errors in AG headers
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 16 Apr 2014 07:52:14 +1000
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140415194029.GC3470@xxxxxxxxxxxxxx>
References: <1397550301-31883-1-git-send-email-david@xxxxxxxxxxxxx> <1397550301-31883-6-git-send-email-david@xxxxxxxxxxxxx> <20140415194029.GC3470@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Apr 15, 2014 at 03:40:29PM -0400, Brian Foster wrote:
> On Tue, Apr 15, 2014 at 06:24:57PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > repair doesn't currently detect verifier errors in AG header
> > blocks - apart from the primary superblock they are not detected.
> > They are, fortunately, corrected in the important cases (AGF, AGI
> > and AGFL) because these structures are rebuilt in phase 5, but if
> > you run xfs_repair in checking mode it won't report them as bad.
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
....
> > @@ -1285,7 +1287,7 @@ scan_ag(
> >                     do_warn(_("would reset bad agf for ag %d\n"), agno);
> >             }
> >     }
> > -   if (status & XR_AG_AGI)  {
> > +   if (agi_dirty || status & XR_AG_AGI)  {
> >             if (!no_modify) {
> >                     do_warn(_("reset bad agi for ag %d\n"), agno);
> >                     agi_dirty = 1;
> 
> There are a few asserts a bit further down this function that assume
> *_dirty is set only when in !no_modify mode. E.g.:
> 
>       ASSERT(agi_dirty == 0 || (agi_dirty && !no_modify));
> 
> You'll probably want to remove those. Or...

I thought I caught all of those...

> 
> > @@ -1295,15 +1297,9 @@ scan_ag(
> >     }
> >  
> >     if (status && no_modify)  {
> > -           libxfs_putbuf(agibuf);
> > -           libxfs_putbuf(agfbuf);
> > -           libxfs_putbuf(sbbuf);
> > -           free(sb);
> > -
> >             do_warn(_("bad uncorrected agheader %d, skipping ag...\n"),
> >                     agno);
> > -
> > -           return;
> > +           goto out_free;
> >     }
> 
> Would we want to skip the ag, as such, on a CRC error in no_modify mode?
> If so, perhaps we could set the status variable on crc errors and
> bitwise or the value returned from verify_set_agheader().

I figured that for CRC errors we really should try to validate
everything if we can. If nothing else fails the validation, then it
might be a bit flip in an unused portion of the sector and in that
case we can actually continue onwards. IOWs, a CRC error is not
necessarily fatal....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>