Re: xfs_repair breaks with assertion

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: xfs_repair breaks with assertion
From: Victor K <kvic45@xxxxxxxxx>
Date: Thu, 11 Apr 2013 14:34:32 +0800
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130411062515.GH10481@dastard>
References: <CAPaMSRCGSyhmnjrXpFFkEpmKrjsHqLn0kJ1xLGyf-WZosV7mmQ@xxxxxxxxxxxxxx> <20130411062515.GH10481@dastard>
> Running xfs_repair /dev/md1 the first time resulted in suggestion to
> mount/unmount to replay log, but mounting would not work. After running
> xfs_repair -v -L -P /dev/md1 this happens:
> (lots of output on stderr, moving to Phase 3, then more output - not sure
> if it is relevant, the log file is ~170Mb in size), then stops and prints
> the only line on stdout:

Oh dear. A log file that big indicates that something *bad* has
happened to the array, i.e. that it has most likely been put back
together wrong.

Before going any further with xfs_repair, please verify that the
array has been put back together correctly....

The RAID array did not suffer, at least not according to mdadm. It is now happily recovering the one disk that officially failed, and the whole thing assembled without a problem.
There was a similar crash several weeks ago on this same array, but it had an ext4 file system back then.
I was able to save some of the latest stuff, and decided to move to XFS as something more reliable.
I suspect now that I should also have replaced the disk controller then.
> xfs_repair: dinode.c:768: process_bmbt_reclist_int: Assertion `i <
> *numrecs' failed.
> Aborted
> After inserting a printf before the assert, I get the following:
> i = 0, *numrecs = -570425343  for printf("%d, %d")
> or
> i = 0, *numrecs = 3724541953  for printf("%ld, %ld") - makes me wonder if
> it's signed/unsigned int related

numrecs is way out of the normal range, so that's probably what is
triggering it.

i.e. this in process_exinode():

    numrecs = XFS_DFORK_NEXTENTS(dip, whichfork);

is where the bad number is coming from, and that implies a corrupted
inode. It's a __be32 on disk, but the kernel considers it an xfs_extnum_t
in memory, which is an int32_t because:

#define NULLEXTNUM      ((xfs_extnum_t)-1)

So, negative numbers on disk are invalid.

The patch below should fix the assert failure.

I'll try it - don't really have other options at the moment
> If I try now (after running xfs_repair -L) to mount the fs read-only, it
> mounts but says some directories have structures that need cleaning, so the
> dirs are inaccessible.
> Any suggestion on how to possibly fix this?

I suspect you've damaged it beyond repair now.

If the array was put back together incorrectly in the first place
(which is likely given the damage being reported), then
you've made the problem a whole lot worse by writing to it in an
attempt to repair it.

I'd suggest that you make sure the array is correctly
repaired/ordered/recovered before doing anything else, then
run xfs_repair on what is left and hope for the best. Even after
repair is finished, you'll need to go through all the data with a
fine-toothed comb to work out what has been lost, corrupted or
overwritten with zeros or other stuff.

I suspect you'll be reaching for the backup tapes long before you
get that far, though...

Well, we'll see how it goes.

Thanks for the patch and the quick reply!



Dave Chinner

xfs_repair: validate on-disk extent count better

From: Dave Chinner <dchinner@xxxxxxxxxx>

When scanning a btree format inode, we trust the extent count to be
in range. However, values in the range 2^31 <= cnt < 2^32 are
invalid and can cause problems with signed range checks. This
results in assert failures when validating the extent count, such as:

xfs_repair: dinode.c:768: process_bmbt_reclist_int: Assertion `i < *numrecs' failed.

Validate the extent count is at least within the positive range of a
signed 32 bit integer before using it.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
 repair/dinode.c |   25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/repair/dinode.c b/repair/dinode.c
index 5a2da39..239bb7b 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -1293,7 +1293,7 @@ process_exinode(
    xfs_bmbt_rec_t      *rp;
    xfs_dfiloff_t      first_key;
    xfs_dfiloff_t      last_key;
-    int           numrecs;
+    int32_t         numrecs;
    int           ret;

    lino = XFS_AGINO_TO_INO(mp, agno, ino);
@@ -1302,6 +1302,15 @@ process_exinode(
    numrecs = XFS_DFORK_NEXTENTS(dip, whichfork);

    /*
+     * We've already decided on the maximum number of extents on the inode,
+     * and numrecs may be corrupt. Hence make sure we only allow numrecs to
+     * be in the range of valid on-disk numbers, which is:
+     *     0 < numrecs < 2^31 - 1
+     */
+    if (numrecs < 0)
+        numrecs = *nex;
+
+    /*
     * XXX - if we were going to fix up the btree record,
     * we'd do it right here.  For now, if there's a problem,
     * we'll bail out and presumably clear the inode.
@@ -2038,11 +2047,23 @@ process_inode_data_fork(
    xfs_ino_t    lino = XFS_AGINO_TO_INO(mp, agno, ino);
    int       err = 0;
+    int       nex;
+    /*
+     * extent count on disk is only valid for positive values. The kernel
+     * uses negative values in memory. hence if we see negative numbers
+     * here, trash it!
+     */
+    nex = be32_to_cpu(dino->di_nextents);
+    if (nex < 0)
+        *nextents = 1;
+    else
+        *nextents = nex;

-    *nextents = be32_to_cpu(dino->di_nextents);
    if (*nextents > be64_to_cpu(dino->di_nblocks))
        *nextents = 1;

    if (dino->di_format != XFS_DINODE_FMT_LOCAL && type != XR_INO_RTDATA)
        *dblkmap = blkmap_alloc(*nextents, XFS_DATA_FORK);
    *nextents = 0;
