On Wed 07-09-11 20:51:05, Wu Fengguang wrote:
> On Wed, Sep 07, 2011 at 07:52:37PM +0800, Christoph Hellwig wrote:
> > On Mon, Sep 05, 2011 at 09:22:16PM +0800, Wu Fengguang wrote:
> > > > > That's a reasonably robust option, however at the cost of keeping the
> > > > > writeback code in some ambiguous state ;)
> > > > What exactly do you mean by ambiguous state?
> > >
> > > I mean in Christoph's case, it will be calling requeue_io() and at the
> > > same time relying on your suggested unconditional sleep at the end of
> > > the wb_writeback() loop to avoid a busy loop. Or in other words,
> > > b_more_io will be holding both the inodes that should be busy retried
> > > and the inodes to be opportunistically retried. However I admit it's
> > > not a big problem if we take b_more_io as general "to be retried ASAP".
> > >
> > > > I don't see anything ambiguous in waiting for a jiffie or so. Not
> > > > that I'd be completely happy about "just wait for a while and see if
> > > > things are better" but your solution does not seem ideal either...
> > >
> > > There are no big differences (that matter) in terms of "how much exact
> > > time to wait" in this XFS case. What makes me prefer b_more_io_wait is
> > > that it looks like a more general solution that replaces the majority
> > > of redirty_tail() calls to avoid modifying dirtied_when.
> > FYI, we had a few more users hit this issue recently. I'm not sure why,
> > but we are seeing this fairly often now. I'd really like to get some
> > sort of fix for this in ASAP as it causes data loss for users.
> Jan, do you agree to push the b_more_io_wait patch into linux-next?
> If not, let's do a patch that adds an unconditional sleep at the end of
> the wb_writeback() loop?
Well, what I don't like about b_more_io_wait is that the logic for shifting
inodes between lists becomes subtle, and I'm afraid we could easily break it
in the future. Also, the times when inodes are retried are not so well
defined, although I agree that most likely that's not going to be a problem
in practice. That's why I'd prefer the more robust approach of just
waiting in the loop when we couldn't make any progress. I've just sent a
patch which does that, and a patch which converts redirty_tail()s to
requeue_io() where it makes sense. Note that the writeback_single_inode()
change is a bit more complex in order to keep livelock avoidance working.
Please have a look at whether the patches would be fine with you. Thanks.
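
For illustration only, here is a minimal user-space sketch of the "wait in
the loop when we couldn't make any progress" idea. It is not the patch
itself: toy_inode, toy_writeback_pass(), toy_wait_one_tick() and
toy_wb_loop() are hypothetical names, and the sleep is simulated by
decrementing a counter rather than by actually sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode queued for writeback. */
struct toy_inode {
	int locked_ticks;	/* ticks during which the inode cannot be written */
	bool dirty;
};

/* Count the inodes that still need writing. */
static size_t toy_count_dirty(const struct toy_inode *inodes, size_t n)
{
	size_t dirty = 0;

	for (size_t i = 0; i < n; i++)
		if (inodes[i].dirty)
			dirty++;
	return dirty;
}

/* One pass over the queue; returns how many inodes were written. */
static int toy_writeback_pass(struct toy_inode *inodes, size_t n)
{
	int written = 0;

	for (size_t i = 0; i < n; i++) {
		if (!inodes[i].dirty)
			continue;
		if (inodes[i].locked_ticks > 0)
			continue;	/* requeue-like: leave it for a later retry */
		inodes[i].dirty = false;
		written++;
	}
	return written;
}

/* Simulated short sleep: one tick passes and locks move towards release. */
static void toy_wait_one_tick(struct toy_inode *inodes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (inodes[i].locked_ticks > 0)
			inodes[i].locked_ticks--;
}

/*
 * Retry immediately while passes make progress; wait only when a pass
 * wrote nothing. Gives up after max_waits waits. Returns the number of
 * waits that were needed.
 */
static int toy_wb_loop(struct toy_inode *inodes, size_t n, int max_waits)
{
	int waits = 0;

	assert(inodes != NULL || n == 0);

	while (toy_count_dirty(inodes, n) > 0) {
		if (toy_writeback_pass(inodes, n) > 0)
			continue;	/* progress was made, no need to wait */
		if (waits >= max_waits)
			break;		/* bounded, so the sketch cannot spin forever */
		toy_wait_one_tick(inodes, n);
		waits++;
	}
	return waits;
}
```

The point of the sketch is only that the wait happens in one place, at the
end of a pass that wrote nothing, so no extra list is needed to tell
"retry now" inodes apart from "retry later" ones.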
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR