xfs
[Top] [All Lists]

Re: [PATCH] xfs: fix agno increment in xfs_inumbers() loop

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [PATCH] xfs: fix agno increment in xfs_inumbers() loop
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 13 Oct 2014 09:35:16 +1100
Cc: xfs-oss <xfs@xxxxxxxxxxx>, Jie Liu <jeff.liu@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <543AC7C9.5040503@xxxxxxxxxxx>
References: <543AC7C9.5040503@xxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sun, Oct 12, 2014 at 01:26:17PM -0500, Eric Sandeen wrote:
> commit c7cb51d xfs: fix error handling at xfs_inumbers
> caused a regression in xfs_inumbers, which in turn broke
> xfsdump, causing incomplete dumps.
> 
> The loop in xfs_inumbers() needs to fill the user-supplied
> buffers, and iterates via xfs_btree_increment, reading new
> ags as needed.
> 
> But the first time through the loop, if xfs_btree_increment()
> succeeds, we continue, which triggers the ++agno at the bottom
> of the loop, and we skip to soon to the next ag - without
> the proper setup under next_ag to read the next ag.
> 
> Fix this by removing the agno increment from the loop conditional,
> and only increment agno if we have actually hit the code under
> the next_ag: target.
> 
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> ---
> 
> p.s. it's alarming that apparently no test in the dump group
> detects this problem!  I'll look into it.

The dump tests must not create more than an inode chunks worth
of files in each AG. Yup, see common/dump::_mk_fillconfig1() - maybe
35 files?

> So I can say that I've tested this with xfstests, but I guess
> that doesn't matter much, yet.
> 
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index f71be9c..f1deb96 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -639,7 +639,8 @@ next_ag:
>               xfs_buf_relse(agbp);
>               agbp = NULL;
>               agino = 0;
> -     } while (++agno < mp->m_sb.sb_agcount);
> +             agno++;
> +     } while (agno < mp->m_sb.sb_agcount);

Yeah, it fixes the bug, but I don't like the structure of the code
that leads to it. What we actually have it two loops in one:

        do {
                while (inobt records exist) {
                        /* format record */
                }

        } while (++agno < mp->m_sb.sb_agcount)

But the inner loop is not implemented as a loop, it's implemented
split across two iterations on the outer loop (i.e.  increment in
one iteration, formatting in the next). Hence the bug.

The code should really be further factored to:

        do {

                /* read agi */
                error = xfs_ialloc_read_agi(agno, &agbp)
                if (error)
                        break;

                error = xfs_inumbers_ag(...);
                if (error)
                        break;

                xfs_buf_relse(agbp);
                agino = 0;
                agno++;
        } while (agno < mp->m_sb.sb_agcount)

So that the AG is traversed and records processed within it's own
cursor-based traversal loop inside xfs_inumbers_ag(), not split
across the outer ag scope loop.

I'll take the first patch right away, but can you follow up with
the above refactoring, Eric?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>