Re: [PATCH RFC] xfs: use invalidate_inode_pages2_range for DIO writes

To: Chris Mason <clm@xxxxxx>
Subject: Re: [PATCH RFC] xfs: use invalidate_inode_pages2_range for DIO writes
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Sat, 9 Aug 2014 10:48:57 +1000
Cc: xfs@xxxxxxxxxxx, Eric Sandeen <sandeen@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <53E4F518.9030107@xxxxxx>
References: <53E4E03A.7050101@xxxxxx> <53E4F518.9030107@xxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Aug 08, 2014 at 12:04:40PM -0400, Chris Mason wrote:
> xfs is using truncate_pagecache_range to invalidate the page cache
> during DIO writes.  The other filesystems are calling
> invalidate_inode_pages2_range.
>
> truncate_pagecache_range is meant to be used when we are freeing the
> underlying data structs from disk, so it will zero any partial ranges
> in the page.  This means a DIO write can zero out part of the page cache
> page, and it is possible the page will stay in cache.
>
> This one is an RFC because it is untested and because I don't understand
> how XFS is dealing with pages the truncate was unable to clear away.
> I'm not able to actually trigger zeros by mixing DIO writes with
> buffered reads.
>
> Signed-off-by: Chris Mason <clm@xxxxxx>
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 8d25d98..c30c112 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -638,7 +638,10 @@ xfs_file_dio_aio_write(
>                                                   pos, -1);
>               if (ret)
>                       goto out;
> -             truncate_pagecache_range(VFS_I(ip), pos, -1);
> +
> +             /* what do we do if we can't invalidate the pages? */
> +             invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
> +                                           pos >> PAGE_CACHE_SHIFT, -1);

I don't think the invalidation can fail on XFS.

We're holding the XFS_IOLOCK_EXCL, so no other syscall-based IO can
dirty pages. All the pages are clean, try_to_free_buffers() will
never fail, no-one can run a truncate operation concurrently, and
so on.

The only thing that could cause an issue is a racing mmap page fault
dirtying the page. But if you are mixing mmap+direct IO on the same
file, you lost a long time ago so that's nothing new at all.

So, I'd just do:

                ret = invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
                                                pos >> PAGE_CACHE_SHIFT, -1);
                WARN_ON_ONCE(ret);
                ret = 0;

That way we'll find out if it does ever fail, and if it does we can
take it from there.
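
To make the "report it, then carry on" shape concrete, here is a rough
user-space model of that call site. It is only a sketch, not kernel code:
every model_* name is made up for illustration, the 4 KiB page size is an
assumption standing in for PAGE_CACHE_SHIFT, and the "dirty page means
-EBUSY" rule is a simplification of what invalidate_inode_pages2_range
does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: 4 KiB pages, standing in for PAGE_CACHE_SHIFT. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_NR_PAGES   8

/* Hypothetical stand-in for a page cache: two flags per page. */
struct model_mapping {
	bool cached[MODEL_NR_PAGES];
	bool dirty[MODEL_NR_PAGES];
};

/* Counts reported failures so a caller can observe them. */
static int model_warn_count;

/* Byte offset -> index of the page containing it (rounds down). */
static unsigned long model_page_index(long long pos)
{
	assert(pos >= 0);
	return (unsigned long)(pos >> MODEL_PAGE_SHIFT);
}

/*
 * Drop every cached page from index 'start' to the end of the file.
 * Clean pages are removed whole, including any bytes before the write
 * offset in the first page. A dirty page is left in place and makes
 * the call return -EBUSY.
 */
static int model_invalidate_range(struct model_mapping *m, unsigned long start)
{
	int ret = 0;

	for (unsigned long i = start; i < MODEL_NR_PAGES; i++) {
		if (!m->cached[i])
			continue;
		if (m->dirty[i]) {
			ret = -EBUSY;
			continue;
		}
		m->cached[i] = false;
	}
	return ret;
}

/* Call site: make a failure visible, but do not fail the write. */
static int model_dio_write_prepare(struct model_mapping *m, long long pos)
{
	int ret = model_invalidate_range(m, model_page_index(pos));

	if (ret) {
		model_warn_count++;
		fprintf(stderr, "invalidation failed: %d\n", ret);
	}
	return 0;
}
```

Note that the whole page containing pos is dropped rather than being
partially zeroed and left in the cache, which is the behavioural
difference from truncate_pagecache_range that the patch is after. The
error is deliberately swallowed after being reported, so a failure shows
up in the logs without turning into a failed write.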


Dave Chinner