[PATCH V2] xfs: timestamp updates cause excessive fdatasync log traffic

Dave Chinner david at fromorbit.com
Sun Aug 30 21:21:55 CDT 2015


On Sat, Aug 29, 2015 at 08:04:54AM +1000, Dave Chinner wrote:
> On Fri, Aug 28, 2015 at 08:11:20AM -0700, Sage Weil wrote:
> > Hi Dave,
> > 
> > On Fri, 28 Aug 2015, Dave Chinner wrote:
> > > 
> > > From: Dave Chinner <dchinner at redhat.com>
> > > 
> > > Sage Weil reported that a ceph test workload was writing to the
> > > log on every fdatasync during an overwrite workload. Event tracing
> > > showed that the only metadata modification being made was the
> > > timestamp updates during the write(2) syscall, but fdatasync(2)
> > > is supposed to ignore them. The key observation was that the
> > > transactions in the log all looked like this:
> [....]
> 
> > > ---
> > > Version 2:
> > > - include the hunk from fs/xfs/xfs_trans_inode.c that I missed
> > >   when committing the patch locally the first time.
> > 
> > I gave this a go on my machine but I'm still seeing the same symptom.  
> 
> OK, that implies the inode buffer has not been submitted for IO and
> so the inode is being held in "flushing" state for an extended
> period of time.
> 
> > I've gathered the trace, strace, and other useful bits at
> > 
> >    http://newdream.net/~sage/drop/rocksdb.2/
> > 
> > This is pretty easy to reproduce with the ceph_test_keyvaluedb binary 
> > (built on fedora 22), also in that dir:
> > 
> >    rm -rf kv_test_temp_dir/
> >    ./ceph_test_keyvaluedb --gtest_filter=KeyValueDB/KVTest.BenchCommit/1
> 
> I'll have a deeper look.

OK, I had assumed this was a longer-running test than it is - it
only takes about 2300ms to run on my test box. Hence the problem is
that the inode has never been flushed out, and so it is being
relogged in full on every fdatasync() operation. Another, similar
change is needed to track what has changed since the last time the
inode was flushed to the log.
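
For anyone who wants to look at the overwrite + fdatasync pattern
without the ceph binary, it can be approximated with a few lines of
Python. This is only a sketch of the general pattern, not what
ceph_test_keyvaluedb actually does, and I have not verified that it
produces the same log traffic. The function name, block size, and
iteration count are made up for illustration.

```python
import os


def overwrite_and_fdatasync(path, block_size=4096, iterations=100):
    """Overwrite one block repeatedly, calling fdatasync() after each write.

    Hypothetical helper for illustration only; returns the number of
    fdatasync() calls issued.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Write the block once and fsync so that the later writes are
        # pure overwrites: no size change and no new allocation.
        os.pwrite(fd, b"\0" * block_size, 0)
        os.fsync(fd)

        payload = b"x" * block_size
        syncs = 0
        for _ in range(iterations):
            # Each write(2) updates the inode timestamps, which is the
            # only metadata change expected in this loop.
            os.pwrite(fd, payload, 0)
            os.fdatasync(fd)
            syncs += 1
        return syncs
    finally:
        os.close(fd)
```

Call it with a path on the XFS filesystem under test, with event
tracing enabled, and check whether each fdatasync() results in a log
write.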

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com


