[PATCH V2] xfs: timestamp updates cause excessive fdatasync log traffic
Dave Chinner
david at fromorbit.com
Mon Aug 31 16:51:27 CDT 2015
On Mon, Aug 31, 2015 at 05:40:04AM -0700, Sage Weil wrote:
> On Mon, 31 Aug 2015, Dave Chinner wrote:
> > After taking a tangent to find a tracepoint regression that was
> > getting in my way, I found that there was a significant pause
> > between the inode locking calls within xfs_file_fsync and the inode
> > locking calls on the buffered write. Roughly 8ms, in fact, on almost
> > every call. After adding a couple more test trace points into the
> > XFS fsync code, it turns out that a hardware cache flush is causing
> > the delay. That is, because we aren't doing log writes that trigger
> > cache flushes and FUA writes, we have to issue a
> > blkdev_issue_flush() call from xfs_file_fsync and that is taking 8ms
> > to complete.
>
> This is where my understanding of block layer flushing really breaks down,
> but in both cases we're issuing flush requests to the hardware, right? Is
> the difference that the log write is a FUA flush request with data, and
> blkdev_issue_flush() issues a flush request without associated data?
Pretty much, though the log write also does a cache flush before the
FUA write. i.e. the log writes consist of a bio with data issued via:

	submit_bio(REQ_FUA | REQ_FLUSH | WRITE_SYNC, bio);

blkdev_issue_flush() consists of an empty bio issued via:

	submit_bio(REQ_FLUSH | WRITE_SYNC, bio);
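For illustration, blkdev_issue_flush() in kernels of this vintage is
essentially just that empty bio wrapped in a synchronous submit - a
simplified sketch from memory, with the error handling and
error_sector plumbing omitted:

	int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
			       sector_t *error_sector)
	{
		struct bio *bio;
		int ret;

		bio = bio_alloc(gfp_mask, 0);	/* zero vecs: no data payload */
		bio->bi_bdev = bdev;

		/* empty flush bio: cache flush only, no data, no FUA */
		ret = submit_bio_wait(WRITE_FLUSH, bio);
		bio_put(bio);
		return ret;
	}

WRITE_FLUSH expands to WRITE_SYNC | REQ_FLUSH, so it's the same
submission as above, just with no payload attached.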
So from a block layer and filesystem point of view there is little
difference, and the only difference at the SCSI layer is the WRITE
w/ FUA that is issued after the cache flush in the log write case
(see https://lwn.net/Articles/400541/ for a bit more background).
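To see why fdatasync hits this path at all: at the end of
xfs_file_fsync() we only issue the explicit flush when the log force
didn't already write (and hence flush) the log. From memory, the
logic looks something like this (paraphrased, not exact source):

	/*
	 * If the log force was a no-op - i.e. fdatasync of a pure
	 * overwrite with no metadata to commit - then no log write
	 * and hence no cache flush was issued, so we have to flush
	 * the data device cache explicitly.
	 */
	if ((mp->m_flags & XFS_MOUNT_BARRIER) &&
	    mp->m_logdev_targp == mp->m_ddev_targp &&
	    !log_flushed)
		xfs_blkdev_issue_flush(mp->m_ddev_targp);

That blkdev_issue_flush() call is where the 8ms is going.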
I haven't looked any deeper than this so far - I don't have time
right now to do so...
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com