On Fri, Jul 16, 2010 at 01:53:41PM -0500, Stuart Brodsky wrote:
> This patch fixes the stat -f icount field going negative. The use of
> lazy counts means that the super block field sb_icount is not updated
> correctly and will allow over allocation of inodes above maxicount.
>
> Signed-off-by: Stu Brodsky <sbrodsky@xxxxxxx>
>
> Index: a/fs/xfs/xfs_ialloc.c
> ===================================================================
> --- a/fs/xfs/xfs_ialloc.c.orig 2010-07-16 10:12:02.000000000 -0500
> +++ b/fs/xfs/xfs_ialloc.c 2010-07-16 10:19:56.312633030 -0500
> @@ -260,7 +260,7 @@
> */
> newlen = XFS_IALLOC_INODES(args.mp);
> if (args.mp->m_maxicount &&
> - args.mp->m_sb.sb_icount + newlen > args.mp->m_maxicount)
> + atomic64_read(&args.mp->m_inodesinuse) + newlen >
> args.mp->m_maxicount)
> return XFS_ERROR(ENOSPC);
> args.minlen = args.maxlen = XFS_IALLOC_BLOCKS(args.mp);
> /*
> @@ -708,7 +708,7 @@
> */
>
> if (mp->m_maxicount &&
> - mp->m_sb.sb_icount + XFS_IALLOC_INODES(mp) > mp->m_maxicount) {
> + atomic64_read(&mp->m_inodesinuse) + XFS_IALLOC_INODES(mp) >
> mp->m_maxicount)
> noroom = 1;
> okalloc = 0;
> }
.....
> Index: a/fs/xfs/xfs_trans.c
> ===================================================================
> --- a/fs/xfs/xfs_trans.c.orig 2010-07-16 10:12:02.000000000 -0500
> +++ b/fs/xfs/xfs_trans.c 2010-07-16 10:22:47.690249548 -0500
> @@ -805,6 +805,7 @@
> switch (field) {
> case XFS_TRANS_SB_ICOUNT:
> tp->t_icount_delta += delta;
> + atomic64_add(delta,&tp->t_mountp->m_inodesinuse);
> if (xfs_sb_version_haslazysbcount(&mp->m_sb))
> flags &= ~XFS_TRANS_SB_DIRTY;
> break;
This is still racy. It may fix your specific reproducer but I can't see how
changing the in use inode count from late in xfs_trans_commit() to early in
xfs_trans_commit() solves the problem: if we look at it
like this:
thread 1 thread 2
check count
<start alloc>
.....
check count
<start alloc>
....
<update count>
All you've done if change where <update count> occurs rather than
solving the race that prevents negative inode counts.
Realistically, there is nothing wrong with XFS exceeding mp->m_maxicount by a
small amount - it's just an arbitrary limit that we use to prevent all the
filesytem space from being taken up by inodes. I think the problem is just a
reporting problem here in xfs_fs_statfs():
1245 fakeinos = statp->f_bfree << sbp->sb_inopblog;
1246 statp->f_files =
1247 MIN(sbp->sb_icount + fakeinos, (__uint64_t)XFS_MAXINUMBER);
1248 if (mp->m_maxicount)
1249 statp->f_files = min_t(typeof(statp->f_files),
1250 statp->f_files,
1251 mp->m_maxicount);
1252 statp->f_ffree = statp->f_files - (sbp->sb_icount - sbp->sb_ifree);
Assume no free space and no free inodes (statp->f_bfree = 0, => fakeinos = 0),
and we have just hit the above race condition to allocate more inodes than
mp->m_maxicount. Hence we have sbp->sb_icount > mp->m_maxicount and that gives:
statp->f_files = min(sbp->sb_icount, mp->m_maxicount);
statp->f_files = mp->m_maxicount;
And now we've allocated all the free inodes (sbp->sb_ifree = 0). Therefore:
statp->f_ffree = mp->m_maxicount - (sbp->sb_icount - 0)
Which gives statp->f_ffree < 0 and hence negative free inodes. I think
all that is required to "fix" this is to clamp statp->f_ffree to zero.
i.e.:
if (statp->f_ffree < 0)
statp->f_ffree = 0;
Make sense?
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
|