On Wed, 2011-07-06 at 04:15 -0400, Christoph Hellwig wrote:
> On Tue, Jul 05, 2011 at 05:36:18PM -0500, Alex Elder wrote:
> > > A large part of the issue is that XFS writes data out itself two times
> > > in the ->sync_fs method, overriding the lifelock protection in the core
> > > writeback code, and another issue is the lock-less xfs_ioend_wait call,
> > > which doesn't prevent new ioend from beeing queue up while waiting for
> > > the count to reach zero.
> > The change affects only the first thing you mention here, not
> > the second.
> It does. We're also removing the xfs_ioend_wait done from
> xfs_sync_data/xfs_sync_inode_data. We still have another one in
> ->write_inode, though.
OK, now I see what you're talking about. I guess the way it was
stated I expected that the code would now *prevent* new ioends
from being queued while waiting.
> > The i_iocount wait is not affected by your patch.
> We're only removing one of the two we're doing per inode now.
> > I'm OK with the change, but really prefer to have
> > the description not include stuff that just isn't
> > there. If you want me to commit this as-is, just
> > say so and I will. Otherwise, post an update and
> > I'll use that. In any case, you can consider this
> > reviewed by me.
> If you have an idea how to reword the description send it my way.
Here's an attempt. (It also gives you a chance to correct
A large part of the issue is that XFS writes data out itself
two times in the ->sync_fs method, overriding the livelock
protection in the core writeback code. This patch removes
these XFS-internal sync calls and relies on the VFS to do it's
work just like all other filesystems do.
Another issue is the lock-less xfs_ioend_wait() call,
which doesn't prevent new ioends from being queued up
while waiting for the count to reach zero. Removing
the second SYNC_WAIT call to xfs_sync_data() eliminates
one place this is used unnecessarily by avoiding the
wait request at the end of xfs_sync_inode_data(). In
most cases there is no need to wait for ongoing writes
to make it to disk, as long as those queued at the time
of a sync request get flushed out.
We still wait like this in ->write_inode, and we should
remove that as well, but that's material for a separate