Re: [PATCH v2] xfs: don't zero partial page cache pages during O_DIRECT

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH v2] xfs: don't zero partial page cache pages during O_DIRECT
From: Chris Mason <clm@xxxxxx>
Date: Tue, 19 Aug 2014 15:24:48 -0400
Cc: <xfs@xxxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxx>
In-reply-to: <20140812011743.GU20518@dastard>
References: <53E4E03A.7050101@xxxxxx> <53E61A9C.4020807@xxxxxx> <20140812011743.GU20518@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 08/11/2014 09:17 PM, Dave Chinner wrote:
> On Sat, Aug 09, 2014 at 08:57:00AM -0400, Chris Mason wrote:
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index 1f66779..023d575 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -295,7 +295,8 @@ xfs_file_read_iter(
>>                              xfs_rw_iunlock(ip, XFS_IOLOCK_EXCL);
>>                              return ret;
>>                      }
>> -                    truncate_pagecache_range(VFS_I(ip), pos, -1);
>> +                    invalidate_inode_pages2_range(VFS_I(ip)->i_mapping,
>> +                                          pos >> PAGE_CACHE_SHIFT, -1);
>>              }
>>              xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
>>      }
> 
> I added the WARN_ON_ONCE(ret) check to this and I am seeing it fire
> occasionally. It always fires immediately before some other ASSERT()
> that fires with a block map/page cache inconsistency. It usually
> fires in a test that runs fsx or fsstress. The fsx failures are new
> regressions caused by this patch. e.g. generic/263 hasn't failed for
> months on any of my systems and this patch causes it to fail
> reliably on my 1k block size test config.
> 
> I'm going to assume at this point that this is uncovering some other
> existing bug, but it means I'm not going to push this fix until I
> understand what is actually happening here. It is possible that what
> I'm seeing is related to Brian's collapse range bug fixes, but until
> I applied this direct IO patch I'd never seen fsx throw ASSERTs in
> xfs_bmap_shift_extents()....
> 
> Either way, more testing and understanding is needed.

Do you have the output from fsx and the command line args it used?  For
my device, it picks:

-r 4096 -t 512 -w 512 -Z

And for a blocksize 1024 test I did mkfs.xfs -b size=1024

But I can't trigger failures with or without the invalidate_inode_pages2
change.  I was hoping to trigger on 3.16, and then jump back to 3.10 +
my patch to see if the patch alone was at fault.

-chris