xfs
[Top] [All Lists]

Re: [PATCH v2 0/5] clean up collapse range and handle post-eof delalloc

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: [PATCH v2 0/5] clean up collapse range and handle post-eof delalloc
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 16 Sep 2014 08:55:35 +1000
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140915131822.GA4143@xxxxxxxxxxxxxx>
References: <1410355231-50495-1-git-send-email-bfoster@xxxxxxxxxx> <20140911044243.GA10111@dastard> <20140911152012.GB54638@xxxxxxxxxxxxxxx> <20140911211915.GA4322@dastard> <20140912195128.GA42029@xxxxxxxxxxxxxxx> <20140912200536.GB42029@xxxxxxxxxxxxxxx> <20140915014654.GD4267@dastard> <20140915131822.GA4143@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Sep 15, 2014 at 09:18:22AM -0400, Brian Foster wrote:
> On Mon, Sep 15, 2014 at 11:46:54AM +1000, Dave Chinner wrote:
> > xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly
> > 
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > XFS has been having trouble with stray delayed allocation extents
> > beyond EOF for a long time. Recent changes to the collapse range
> > code has triggered erroneous EBUSY errors on page invalidtion for
> > block size smaller than page size filesystems. These
> > have been caused by dirty buffers beyond EOF on a partial page which
> > do not get written to disk during a sync.
> > 
> > The issue is that write-ahead in xfs_cluster_write() finds such a
> > partial page and handles it by leaving the page dirty but pushing it
> > into a writeback state. This used to work just fine, as the
> > write_cache_pages() code would then find the dirty partial page in
> > the next mapping tree lookup as the dirty tag is still set.
> > 
> > Unfortunately, when we moved to a mark and sweep approach to
> > writeback to fix other writeback sync issues, we broken this. THe
> > act of marking the page as under writeback now clears the TOWRITE
> > tag in the radix tree, even though the page is still dirty. This
> > causes the TOWRITE tag to be cleared, and hence the next lookup on
> > the mapping tree does not find the dirty partial page and so doesn't
> > try to write it again.
> > 
> > This same writeback bug was found recently in ext4 and fixed in
> > commit 1c8349a ("ext4: fix data integrity sync in ordered mode")
> > without communication to the wider filesystem community. We can use
> > exactly the same fix here so the TOWRITE flag is not cleared on
> > partial page writes.
> > 
> > cc: stable@xxxxxxxxxxxxxxx # dependent on 
> > 1c8349a17137b93f0a83f276c764a6df1b9a116e
> > Root-cause-found-by: Brian Foster <bfoster@xxxxxxxxxx>
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> 
> Looks good and fixes the collapse failure in my test.
> 
> Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>
> 
> I suppose we should prepend the collapse rework series with this patch
> to avoid the regression as it pertains to collapse (obviously the
> failure to retain towrite goes further back).

Agreed, I will do that.

> I'll continue testing with this. Are you still seeing an increase in
> such failures with the xfs_free_file_space() patch or has this quieted
> those down?

To early to sayi for sure, but signs are good - I've had xfstests
actually complete without any stray delalloc asserts occurring on
1k block size filesystems for the first time in a couple of weeks.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>