Re: [PATCH 5/6] xfs: add xfs_verifier_error()

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [PATCH 5/6] xfs: add xfs_verifier_error()
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 10 Feb 2014 22:10:42 +1100
Cc: Eric Sandeen <sandeen@xxxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <52F85294.6010309@xxxxxxxxxxx>
References: <52F83630.4020008@xxxxxxxxxx> <52F83A8D.7030506@xxxxxxxxxxx> <20140210034321.GR13647@dastard> <52F85294.6010309@xxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sun, Feb 09, 2014 at 10:16:20PM -0600, Eric Sandeen wrote:
> On 2/9/14, 9:43 PM, Dave Chinner wrote:
> > On Sun, Feb 09, 2014 at 08:33:49PM -0600, Eric Sandeen wrote:
> >> We want to distinguish between corruption, CRC errors,
> >> etc.  In addition, the full stack trace on verifier errors
> >> seems less than helpful; it looks more like an oops than
> >> corruption.  
> >>
> >> Create a new function to specifically alert the user to
> >> verifier errors, which can differentiate between
> >> EFSCORRUPTED and CRC mismatches.  It doesn't dump stack
> >> unless the xfs error level is turned up high.
> >>
> >> Define a new error message (EFSBADCRC) to clearly identify
> >> CRC errors.  (Defined to EILSEQ, bad byte sequence)
> >>
> >> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> >> ---
> >>  fs/xfs/xfs_error.c |   22 ++++++++++++++++++++++
> >>  fs/xfs/xfs_error.h |    3 +++
> >>  fs/xfs/xfs_linux.h |    1 +
> >>  3 files changed, 26 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
> >> index 9995b80..08d76f4 100644
> >> --- a/fs/xfs/xfs_error.c
> >> +++ b/fs/xfs/xfs_error.c
> >> @@ -178,3 +178,25 @@ xfs_corruption_error(
> >>    xfs_error_report(tag, level, mp, filename, linenum, ra);
> >>    xfs_alert(mp, "Corruption detected. Unmount and run xfs_repair");
> >>  }
> >> +
> >> +/*
> >> + * Warnings specifically for verifier errors.  Differentiate CRC vs. invalid
> >> + * values, and omit the stack trace unless the error level is tuned high.
> >> + */
> >> +void
> >> +__xfs_verifier_error(
> >> +  const char              *func,
> >> +  struct xfs_buf          *bp)
> >> +{
> >> +  struct xfs_mount *mp = bp->b_target->bt_mount;
> >> +
> >> +  xfs_alert(mp,
> >> +"%sCorruption detected in %s, block 0x%llx. Unmount and run xfs_repair",
> >> +            bp->b_error == EFSBADCRC ? "CRC " : "", func, bp->b_bn);
> > 
> > Perhaps if we do this:
> > 
> >     xfs_alert(mp,
> > "Metadata %s detected at %pF, block 0x%llx. Unmount and run xfs_repair",
> >               bp->b_error == EFSBADCRC ? "CRC error"
> >                                        : "corruption", _RET_IP_, bp->b_bn);
> > 
> > We'll get a symbol of the form caller_name+0xoffset similar to a
> > stack dump. That way if we have multiple calls to a
> > xfs_verifier_error() inside a single function we get something that
> > tells us which call detected the error...
> 
> Hm, but the point of the switch based on error nrs was to require only
> one call in each ->verifier, and ...

Right, that's the current usage of it because we are simply
returning true/false from the checking code. Determining the exact
error in the report is much more useful - let's not lose sight of
the end goal....

> > Also, the use of _RET_IP_ gets rid of the need for the wrapper
> > macro....
> 
> 0x${SPLAT} is a lot less useful than e.g. "xfs_agi_read_verify"

Note the format string I used: "%pF". That decodes the _RET_IP_
into the function name and offset from the start of the function.
i.e. it returns xfs_agi_read_verify+0x<splat>.
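
For illustration only, here is a rough userspace sketch of the same idea.
It is not the kernel code: report_verifier_error() and
verifier_error_kind() are hypothetical names, and plain "%p" stands in
for "%pF" because symbol+offset decoding is done by the kernel's printk,
not by libc. It assumes GCC or Clang for __builtin_return_address(0),
which is what _RET_IP_ expands to in the kernel. The point is that the
captured address identifies the individual call site, so two calls from
the same function are distinguishable:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* The patch description defines EFSBADCRC as EILSEQ. */
#define EFSBADCRC EILSEQ

/* Pick the wording from the error number, as in the suggested xfs_alert(). */
static const char *verifier_error_kind(int error)
{
	return error == EFSBADCRC ? "CRC error" : "corruption";
}

/*
 * Hypothetical userspace stand-in for the report function.  noinline
 * keeps __builtin_return_address(0) pointing into the caller, so each
 * call site yields its own address.
 */
__attribute__((noinline))
static void *report_verifier_error(char *buf, size_t len, int error,
				   unsigned long long blkno)
{
	void *caller = __builtin_return_address(0);

	assert(buf != NULL && len > 0);
	/* Plain %p: a raw address here, where the kernel would print a symbol. */
	snprintf(buf, len,
		 "Metadata %s detected at %p, block 0x%llx. Unmount and run xfs_repair",
		 verifier_error_kind(error), caller, blkno);
	return caller;
}
```

Calling this twice from one function returns two different addresses,
one per call site, which is the property that makes a single report
function usable from several checks inside the same verifier.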

> Printing the _RET_IP_ requires disassembly of that particular build
> to figure out where we went wrong, whereas printing __func__ makes it
> clear on the initial read of the dmesg.

The function is already there in plain text. You only need to go to
tools if you want to know the exact line it came from....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
