Re: XFS write cache flush policy

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: XFS write cache flush policy
From: Matthias Schniedermeyer <ms@xxxxxxx>
Date: Wed, 19 Dec 2012 02:04:45 +0100
Cc: Lin Li <sdeber@xxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20121218202914.GC15182@dastard>
References: <50C64C17.9080206@xxxxxxxxxxx> <20121214111924.GA4762@xxxxxxx> <20121215221622.GF9806@dastard> <20121216103025.GA14880@xxxxxxx> <20121216111046.GA16756@xxxxxxx> <20121216204847.GN9806@dastard> <20121216232251.GA20370@xxxxxxx> <20121217232441.GA5031@dastard> <20121218003438.GB30736@xxxxxxx> <20121218202914.GC15182@dastard>
User-agent: Mutt/1.5.21 (2010-09-15)
On 19.12.2012 07:29, Dave Chinner wrote:
> On Tue, Dec 18, 2012 at 01:34:38AM +0100, Matthias Schniedermeyer wrote:
> > On 18.12.2012 10:24, Dave Chinner wrote:
> > > 
> > > diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
> > > index 9500caf..7bf85e8 100644
> > > --- a/fs/xfs/xfs_sync.c
> > > +++ b/fs/xfs/xfs_sync.c
> > > @@ -400,7 +400,7 @@ xfs_sync_worker(
> > >    * cancel_delayed_work_sync on this work queue before tearing down
> > >    * the ail and the log in xfs_log_unmount.
> > >    */
> > > - if (!(mp->m_super->s_flags & MS_ACTIVE) &&
> > > + if ((mp->m_super->s_flags & MS_ACTIVE) &&
> > >       !(mp->m_flags & XFS_MOUNT_RDONLY)) {
> > >           /* dgc: errors ignored here */
> > >           if (mp->m_super->s_writers.frozen == SB_UNFROZEN &&
> > > 
> > > 
> > 
> > This also appears to fix the other case.
> > When the activity ceases sharply and the log is still not written after 
> > minutes.
> > 
> > After writing 10 files, waiting a minute, yanking ... all 10 files were 
> > there.
> > So the OP-case MIGHT have been this same error.
> > But that's the amateur talking again.
> 
> I kinda deserved that, didn't I? ;)
> 
> But now I understand the problem, I agree with you that the OP was
> probably seeing the same bug. I understand the cause, and can
> explain exactly how it would cause both sets of symptoms reported...

Great.

That means less lost time in the future when a USB disk "decides" to go 
MIA.
The record so far was about 45 minutes of writes lost, or something over 
200GB just going up in smoke (without the smoke).


At least until such a bug is reintroduced in the future.
This bug was introduced in 3.5(*), existed up to 3.7 and, if I 
understand you correctly, was fixed more or less by accident for 3.8.

I'd say there is definitely something amiss in the test-suite; this is 
basic functionality that appears untested to me. (I don't know what the 
test-suite contains, only that it exists.)

At least I'd count surviving a dropped connection or a power failure 
among the basic functionality that should be assured by a journaling 
filesystem. (The only difference is that in the latter case the cache 
MAY get dropped; otherwise I'd say both cases are basically the same.)



*:
This is the snippet that introduced the part of the if that you changed 
above.
It is from `git diff v3.4..v3.5 -- fs/xfs/xfs_sync.c`:

+        * We shouldn't write/force the log if we are in the mount/unmount
+        * process or on a read only filesystem. The workqueue still needs to be
+        * active in both cases, however, because it is used for inode reclaim
+        * during these times.  Use the MS_ACTIVE flag to avoid doing anything
+        * during mount.  Doing work during unmount is avoided by calling
+        * cancel_delayed_work_sync on this work queue before tearing down
+        * the ail and the log in xfs_log_unmount.
+        */
+       if (!(mp->m_super->s_flags & MS_ACTIVE) &&
+           !(mp->m_flags & XFS_MOUNT_RDONLY)) {

git blame dates these lines to 2012-05-21.
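
To make the effect of that one '!' easier to see, here is a minimal 
user-space sketch of the two conditions. It is not kernel code: the 
DEMO_* flag values and the function names are made up for illustration, 
and only the shape of the test is taken from the diff above. As far as I 
can tell, the body of the if is where the log gets written/forced, so 
whenever the condition is false that work is skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins, NOT the kernel's MS_ACTIVE / XFS_MOUNT_RDONLY. */
#define DEMO_MS_ACTIVE (1u << 0)
#define DEMO_RDONLY    (1u << 1)

/* Shape of the condition as it appears in the v3.4..v3.5 diff. */
static bool should_force_log_buggy(unsigned s_flags, unsigned m_flags)
{
    return !(s_flags & DEMO_MS_ACTIVE) && !(m_flags & DEMO_RDONLY);
}

/* Shape of the condition after the one-character fix quoted above. */
static bool should_force_log_fixed(unsigned s_flags, unsigned m_flags)
{
    return (s_flags & DEMO_MS_ACTIVE) != 0 && !(m_flags & DEMO_RDONLY);
}

/* The everyday case: filesystem fully mounted (active) and read-write. */
static void demo_check_active_rw_mount(void)
{
    /* Inverted test: the periodic worker never enters the if body. */
    assert(!should_force_log_buggy(DEMO_MS_ACTIVE, 0));
    /* Fixed test: the worker enters the if body as the comment intends. */
    assert(should_force_log_fixed(DEMO_MS_ACTIVE, 0));
}
```

With the inverted test the condition is only true while the superblock 
is not yet active, which is exactly the case the comment says to avoid. 
On a normally mounted read-write filesystem it is always false, which 
would match both sets of symptoms.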




-- 

Matthias
