xfs
[Top] [All Lists]

Re: [PATCH] xfs:negative_icount.patch

To: Stuart Brodsky <sbrodsky@xxxxxxx>
Subject: Re: [PATCH] xfs:negative_icount.patch
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 19 Jul 2010 10:14:03 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <1279306421.17689.38.camel@xxxxxxxxxxxxxxxxxxxxxxxxx>
References: <1279306421.17689.38.camel@xxxxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Jul 16, 2010 at 01:53:41PM -0500, Stuart Brodsky wrote:
> This patch fixes the stat -f icount field going negative.  The use of
> lazy counts means that the super block field sb_icount is not updated
> correctly and will allow over allocation of inodes above maxicount.
> 
> Signed-off-by: Stu Brodsky <sbrodsky@xxxxxxx>
> 
> Index: a/fs/xfs/xfs_ialloc.c
> ===================================================================
> --- a/fs/xfs/xfs_ialloc.c.orig        2010-07-16 10:12:02.000000000 -0500
> +++ b/fs/xfs/xfs_ialloc.c     2010-07-16 10:19:56.312633030 -0500
> @@ -260,7 +260,7 @@
>        */
>       newlen = XFS_IALLOC_INODES(args.mp);
>       if (args.mp->m_maxicount &&
> -         args.mp->m_sb.sb_icount + newlen > args.mp->m_maxicount)
> +         atomic64_read(&args.mp->m_inodesinuse) + newlen >
> args.mp->m_maxicount)
>               return XFS_ERROR(ENOSPC);
>       args.minlen = args.maxlen = XFS_IALLOC_BLOCKS(args.mp);
>       /*
> @@ -708,7 +708,7 @@
>        */
>  
>       if (mp->m_maxicount &&
> -         mp->m_sb.sb_icount + XFS_IALLOC_INODES(mp) > mp->m_maxicount) {
> +         atomic64_read(&mp->m_inodesinuse) + XFS_IALLOC_INODES(mp) >
> mp->m_maxicount)
>               noroom = 1;
>               okalloc = 0;
>       }

.....

> Index: a/fs/xfs/xfs_trans.c
> ===================================================================
> --- a/fs/xfs/xfs_trans.c.orig 2010-07-16 10:12:02.000000000 -0500
> +++ b/fs/xfs/xfs_trans.c      2010-07-16 10:22:47.690249548 -0500
> @@ -805,6 +805,7 @@
>       switch (field) {
>       case XFS_TRANS_SB_ICOUNT:
>               tp->t_icount_delta += delta;
> +             atomic64_add(delta,&tp->t_mountp->m_inodesinuse);
>               if (xfs_sb_version_haslazysbcount(&mp->m_sb))
>                       flags &= ~XFS_TRANS_SB_DIRTY;
>               break;

This is still racy. It may fix your specific reproducer but I can't see how
changing the in use inode count from late in xfs_trans_commit() to early in
xfs_trans_commit() solves the problem: if we look at it
like this:


        thread 1                thread 2
        check count
        <start alloc>
        .....
                                check count
                                <start alloc>
                                ....
        <update count>


All you've done if change where <update count> occurs rather than
solving the race that prevents negative inode counts.

Realistically, there is nothing wrong with XFS exceeding mp->m_maxicount by a
small amount - it's just an arbitrary limit that we use to prevent all the
filesytem space from being taken up by inodes.  I think the problem is just a
reporting problem here in xfs_fs_statfs():

1245         fakeinos = statp->f_bfree << sbp->sb_inopblog;
1246         statp->f_files =
1247             MIN(sbp->sb_icount + fakeinos, (__uint64_t)XFS_MAXINUMBER);
1248         if (mp->m_maxicount)
1249                 statp->f_files = min_t(typeof(statp->f_files),
1250                                         statp->f_files,
1251                                         mp->m_maxicount);
1252         statp->f_ffree = statp->f_files - (sbp->sb_icount - sbp->sb_ifree);

Assume no free space and no free inodes (statp->f_bfree = 0, => fakeinos = 0),
and we have just hit the above race condition to allocate more inodes than
mp->m_maxicount. Hence we have sbp->sb_icount > mp->m_maxicount and that gives:

        statp->f_files = min(sbp->sb_icount, mp->m_maxicount);
        statp->f_files = mp->m_maxicount;

And now we've allocated all the free inodes (sbp->sb_ifree = 0). Therefore:

        statp->f_ffree = mp->m_maxicount - (sbp->sb_icount - 0)

Which gives statp->f_ffree < 0 and hence negative free inodes. I think
all that is required to "fix" this is to clamp statp->f_ffree to zero.
i.e.:

        if (statp->f_ffree < 0)
                statp->f_ffree = 0;

Make sense?

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>