
Re: [PATCH, -stable RFC] xfs: fix periodic log flushing

To: Andre Noll <maan@xxxxxxxxxxxxxxx>
Subject: Re: [PATCH, -stable RFC] xfs: fix periodic log flushing
From: Ben Myers <bpm@xxxxxxx>
Date: Tue, 18 Dec 2012 21:30:13 -0600
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20121219002145.GK31273@xxxxxxxxxxxxxxx>
References: <20121218221616.GH15182@dastard> <20121218224022.GJ31273@xxxxxxxxxxxxxxx> <20121218232533.GJ15182@dastard> <20121219002145.GK31273@xxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
Hey Andre,

On Wed, Dec 19, 2012 at 01:21:45AM +0100, Andre Noll wrote:
> On Wed, Dec 19, 10:25, Dave Chinner wrote:
...
> > > > -       if (!(mp->m_super->s_flags & MS_ACTIVE) &&
> > > 
> > > appears to be in the longterm kernels 3.2.35 and 3.4.24 as well (it
> > > was changed in commit 1307bbd which got reverted in 11159a05). Are
> > > these kernels also affected?
> > 
> > I have no idea - I don't track them, don't test them and haven't
> > tried to reproduce the problem on them.
> > 
> > If you want to support all the stable trees, you're welcome to do
> > all this, but it's not something I care to do. We have reports of
> > this problem on 3.5 to 3.7 and the patch applies to all three
> > kernels, so that's as far as I care right now....
> 
> Understood. Personally, I only care about 3.4 as this is the kernel we
> are running on most of our production systems. Would you be willing
> to submit the patch also for 3.4-stable if Matthias or myself
> reproduced the issue on 3.4 and confirmed that the patch fixes the
> problem there as well?

We had some trouble getting this particular area of code settled down over the
course of a few releases.  Unfortunately we had some crashes on unmount during
that time which were not immediately reproducible, and that adds another
wrinkle to this.

Looks to me like 3.4 doesn't have the problem that Dave is trying to address
here because it doesn't check for MS_ACTIVE in xfs_sync_worker.  You're already
good to go.

Dave, what you've done makes sense b/c MS_ACTIVE is set after mount time and
cleared at unmount.  This is the time during which we want the sync worker to
be running.  I do think that the check is racy, though:  the sync worker can
check the flag and then continue at a snail's pace, and there is nothing to
prevent unmount from clearing the flag and wiping out the structures used by
the sync worker in the meantime.
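
To make the window concrete, here is a minimal userspace sketch of the
check-then-use pattern and one way to close it.  This is not XFS code:
struct mount_sketch, its fields and all of the function names are made up for
illustration, a pthread mutex stands in for whatever serialisation the kernel
side would really use, and the 'active' flag only plays the role of MS_ACTIVE.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-mount state; NOT the real struct xfs_mount. */
struct mount_sketch {
	atomic_bool	active;		/* plays the role of MS_ACTIVE */
	pthread_mutex_t	lock;		/* serialises worker body vs. teardown */
	int		*log_state;	/* structure the worker uses; freed at unmount */
	int		flushes;	/* completed worker passes */
};

void mount_init(struct mount_sketch *mp)
{
	mp->log_state = calloc(1, sizeof(*mp->log_state));
	assert(mp->log_state != NULL);
	mp->flushes = 0;
	pthread_mutex_init(&mp->lock, NULL);
	atomic_init(&mp->active, true);
}

/*
 * Racy variant: the flag is checked, but nothing stops a concurrent
 * unmount from clearing it and freeing log_state between the check
 * and the dereference below.
 */
bool sync_worker_racy(struct mount_sketch *mp)
{
	if (!atomic_load(&mp->active))
		return false;
	/* <-- window: teardown may run to completion right here */
	(*mp->log_state)++;
	mp->flushes++;
	return true;
}

/* Check and use happen under the same lock that teardown takes. */
bool sync_worker_locked(struct mount_sketch *mp)
{
	bool ran = false;

	pthread_mutex_lock(&mp->lock);
	if (atomic_load(&mp->active)) {
		(*mp->log_state)++;
		mp->flushes++;
		ran = true;
	}
	pthread_mutex_unlock(&mp->lock);
	return ran;
}

/* Teardown clears the flag and frees the state while holding the lock. */
void unmount_locked(struct mount_sketch *mp)
{
	pthread_mutex_lock(&mp->lock);
	atomic_store(&mp->active, false);
	free(mp->log_state);
	mp->log_state = NULL;
	pthread_mutex_unlock(&mp->lock);
}
```

With the locked pair, a worker pass either completes before teardown starts or
sees the flag already cleared and backs off.  In kernel code the more usual
approach is to cancel the work item synchronously before tearing anything down
(cancel_delayed_work_sync() exists for that); whether that is the right fit
here is not something this sketch settles.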

Regards,
        Ben
