

To: Lachlan McIlroy <lachlan@xxxxxxx>
Subject: Re: [Fwd: [PATCH] Fix race in xfs_write() between direct and buffered I/O with DMAPI]
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Mon, 8 Dec 2008 17:51:25 -0500
Cc: xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <493779B1.3010703@xxxxxxx>
References: <493779B1.3010703@xxxxxxx>
User-agent: Mutt/1.5.18 (2008-05-17)
On Thu, Dec 04, 2008 at 05:33:21PM +1100, Lachlan McIlroy wrote:
> --- a/fs/xfs/linux-2.6/xfs_lrw.c      2008-09-22 15:47:38.000000000 +1000
> +++ b/fs/xfs/linux-2.6/xfs_lrw.c      2008-09-22 15:50:56.000000000 +1000
> @@ -707,7 +707,6 @@ start:
>               }
>       }
> 
> -retry:
>       /* We can write back this queue in page reclaim */
>       current->backing_dev_info = mapping->backing_dev_info;
> 
> @@ -763,6 +762,17 @@ retry:
>       if (ret == -EIOCBQUEUED && !(ioflags & IO_ISAIO))
>               ret = wait_on_sync_kiocb(iocb);
> 
> +     isize = i_size_read(inode);
> +     if (unlikely(ret < 0 && ret != -EFAULT && *offset > isize))
> +             *offset = isize;
> +
> +     if (*offset > xip->i_size) {
> +             xfs_ilock(xip, XFS_ILOCK_EXCL);
> +             if (*offset > xip->i_size)
> +                     xip->i_size = *offset;
> +             xfs_iunlock(xip, XFS_ILOCK_EXCL);
> +     }
> +
>       if (ret == -ENOSPC &&
>           DM_EVENT_ENABLED(xip, DM_EVENT_NOSPACE) && !(ioflags & IO_INVIS)) {
>               xfs_iunlock(xip, iolock);

Moving these updates to before the dmapi nospace callout probably doesn't
change anything in the non-dmapi codepath, so good from that
perspective.  And as you say above, it makes sense to have this update
before the dmapi callout.
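
For anyone following along, here is a minimal user-space sketch of what
the moved hunk does: clamp the file position after a failed write, then
push the in-core size forward with a cheap unlocked test that is repeated
under the lock.  Everything named toy_* is hypothetical and only stands in
for the real inode, xfs_ilock() and xfs_iunlock(); it is not the kernel
code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the XFS inode; not the real structure. */
struct toy_inode {
	long long	vfs_size;	/* what i_size_read(inode) would return */
	long long	fs_size;	/* plays the role of xip->i_size */
	int		lock_count;	/* times the toy "ilock" was taken */
	int		locked;		/* 1 while the toy "ilock" is held */
};

/* Toy replacement for xfs_ilock(xip, XFS_ILOCK_EXCL). */
static void toy_ilock(struct toy_inode *ip)
{
	assert(!ip->locked);
	ip->locked = 1;
	ip->lock_count++;
}

/* Toy replacement for xfs_iunlock(xip, XFS_ILOCK_EXCL). */
static void toy_iunlock(struct toy_inode *ip)
{
	assert(ip->locked);
	ip->locked = 0;
}

/* Same shape as the hunk that now runs before the nospace callout. */
static void toy_update_size(struct toy_inode *ip, long long *offset, int ret)
{
	/* A failed write (other than -EFAULT) must not leave *offset past EOF. */
	if (ret < 0 && ret != -EFAULT && *offset > ip->vfs_size)
		*offset = ip->vfs_size;

	/* Unlocked test first; re-test under the lock before storing. */
	if (*offset > ip->fs_size) {
		toy_ilock(ip);
		if (*offset > ip->fs_size)
			ip->fs_size = *offset;
		toy_iunlock(ip);
	}
}
```

With the update done first, the size is already settled by the time the
lock is dropped for the callout, and the common case where the size does
not grow never takes the lock at all.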

> @@ -776,20 +786,7 @@ retry:
>               xfs_ilock(xip, iolock);
>               if (error)
>                       goto out_unlock_internal;
> -             pos = xip->i_size;
> -             ret = 0;
> -             goto retry;
> -     }
> -
> -     isize = i_size_read(inode);
> -     if (unlikely(ret < 0 && ret != -EFAULT && *offset > isize))
> -             *offset = isize;
> -
> -     if (*offset > xip->i_size) {
> -             xfs_ilock(xip, XFS_ILOCK_EXCL);
> -             if (*offset > xip->i_size)
> -                     xip->i_size = *offset;
> -             xfs_iunlock(xip, XFS_ILOCK_EXCL);
> +             goto start;

Again, none of this affects non-dmapi operations, so OK with my mainline
hat on.  Now if we check what start does beyond the old retry label:

 - calls generic_write_checks.  This could and should redo checks based
   on the new inode size, ok.
 - dmapi write event - shouldn't happen because eventsent is non-zero,
   ok.
 - O_DIRECT alignment validation.  Superfluous, but harmless, ok.
 - check for exclusive lock.  This is what you said you wanted, and
   indeed due to the lock dropping we need it.  But why don't
   you duplicate this check in the dmapi case below so that we
   only have to go to start once instead of possibly twice?
 - i_new_size update - needed due to the possible i_size changes, ok
 - ichgtime - if time has passed since the last update we might want
   to update it again, ok
 - zero_eof, ok
 - setuid clearing, superfluous, but harmless.
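
To illustrate the question about the exclusive lock check, here is a toy
model of the control flow.  It is a sketch under my own assumptions and
not the xfs_write() code: struct toy_write, toy_write_path() and the
precheck flag are all made up, and the model simply assumes that the file
changes while the lock is dropped so that the exclusive lock is needed
afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the write path; none of these names are XFS APIs. */
struct toy_write {
	bool	excl;		/* holding the I/O lock exclusively? */
	bool	need_excl;	/* shared lock is no longer sufficient */
	bool	space_freed;	/* set once the nospace callout has run */
	int	start_passes;	/* how many times the code at "start" ran */
};

/* Returns 0 on success; precheck selects the suggested variant. */
static int toy_write_path(struct toy_write *w, bool precheck)
{
	int ret;

start:
	w->start_passes++;	/* write checks, size update, timestamps, ... */

	/* The exclusive lock check that already lives at start. */
	if (w->need_excl && !w->excl) {
		w->excl = true;	/* drop the lock, retake it exclusively */
		goto start;
	}

	/* The I/O is only ever issued with an adequate lock mode. */
	assert(!w->need_excl || w->excl);
	ret = w->space_freed ? 0 : -ENOSPC;

	if (ret == -ENOSPC) {
		/* Lock is dropped around the callout; the file may change. */
		w->space_freed = true;
		w->need_excl = true;
		if (precheck)
			w->excl = true;	/* choose the mode when relocking */
		goto start;
	}
	return ret;
}
```

In this model the variant without the extra check runs the code at start
three times in total (first attempt, after the callout, after the lock
upgrade), while duplicating the check in the dmapi branch gets that down
to two.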

So the patch looks good to me, but as mentioned above it might be possible
to optimize it a little more.
  
