xfs
[Top] [All Lists]

Re: [PATCH] xfs: flush workers before stopping log

To: Ben Myers <bpm@xxxxxxx>
Subject: Re: [PATCH] xfs: flush workers before stopping log
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 31 Aug 2012 08:35:04 +1000
Cc: tinguely@xxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <20120830172549.GG3274@xxxxxxx>
References: <20120829134624.316257238@xxxxxxx> <20120829134628.835024558@xxxxxxx> <20120830002335.GB15292@dastard> <20120830172549.GG3274@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Aug 30, 2012 at 12:25:49PM -0500, Ben Myers wrote:
> On Thu, Aug 30, 2012 at 10:23:35AM +1000, Dave Chinner wrote:
> > On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@xxxxxxx wrote:
> > > Index: b/fs/xfs/xfs_mount.c
> > > ===================================================================
> > > --- a/fs/xfs/xfs_mount.c
> > > +++ b/fs/xfs/xfs_mount.c
> > > @@ -1490,6 +1490,11 @@ xfs_unmountfs(
> > >  
> > >   xfs_qm_unmount(mp);
> > >  
> > > + /* flush the worker queues while the log still exists and
> > > +  * before the final sync and unmount record.
> > > +  */
> > > + xfs_syncd_stop(mp);
> > 
> > xfs_syncd_stop() needs to die rather than being moved from place to
> > place every time some problem is seemd.  I outlined what we need to
> > do to solve the problems once and for all a couple of months ago:
> > 
> > http://oss.sgi.com/archives/xfs/2012-06/msg00064.html
> 
> Yikes, that patchset you've posted is very nice.  Probably it is appropriate
> for the 3.7 merge window.  We also need a fix for 3.5-stable and 3.6.  Do you
> have a suggestion for that?

3.7 - it's too late for 3.6.

> It looks like you've also moved the final log
> force to after cancellation of the work queue in patch 6, similar to what Mark
> has done... so it would seem that Mark has the right idea.  I think Mark's
> patch is appropriate for 3.6 and 3.5-stable inclusion to fix the regression
> introduced by me in 8866fc6.
> 
> What is your opinion?

Don't do unnecessary work. :)

Realistically, the question that I always ask myself in this
situation is: do we really need to fix it right now?

Backports to stable kernels are typically done in response to user
reported issues. If users are not hitting the problem, then there's
no compelling reason for backporting it. i.e. the answer is "no".
If someone is hitting the problem, then the answer is "yes" and we
need a temporary fix until the proper fix is done.

To that extent, I've seen one user report of an unmount problem in 3.5
(reported on #xfs), but it's not clear if the problem they hit was
this problem or the fact that the superblock IO is not being waited
for during unmount in 3.5. This was fixed in 3.6 by commit 9a57fa8e
("xfs: wait for the write the superblock on unmount"):

http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs.git;a=commit;h=9a57fa8ee7c29e11c2a29ce058573ba99157eda7

So, if we take this reported as a "yes, we need to fix it" trigger,
then we've got a couple of fixes that will need to be backported to
3.5.x..

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>