[PATCH] xfs: flush workers before stopping log

Ben Myers bpm at sgi.com
Thu Aug 30 12:25:49 CDT 2012


Hi Dave,

On Thu, Aug 30, 2012 at 10:23:35AM +1000, Dave Chinner wrote:
> On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely at sgi.com wrote:
> > The unmount race continues with our test boxes.
> > 
> > The below trace gave the clue that there is a write of the superblock
> > after the log UNMOUNT record and xfs_logprint confirmed this write.
> > 
> > A couple of different experiments point to the sync worker. The simplest
> > solution is to move the final flush of the workers before the final
> > superblock write so there is no other filesystem activity after the
> > UNMOUNT record is written to the log.
> 
> ....
> >  #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
> >  #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
> > #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
> > #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
> > #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
> > #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
> > #14 [c5377f58] process_one_work at c024ee4c
> > #15 [c5377f98] worker_thread at c024f43d
> > #16 [c5377fbc] kthread at c025326b
> > #17 [c5377fe8] kernel_thread_helper at c070e834
> > 
> > PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
> >  #0 [cde0fda0] __schedule at c0706595
> >  #1 [cde0fe28] schedule at c0706b89
> >  #2 [cde0fe30] schedule_timeout at c0705600
> >  #3 [cde0fe94] __down_common at c0706098
> >  #4 [cde0fec8] __down at c0706122
> >  #5 [cde0fed0] down at c025936f
> >  #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
> >  #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
> 
> OK, so you've got IO on the superblock buffer still active when the
> superblock is being freed.
> 
> > There should be no more I/O after the UNMOUNT record is written to the log.
> 
> That depends - a freeze leaves the filesystem in exactly this state.
> :)
> 
> > Flush the workers before the final sync of the superblock, the write of
> > the UNMOUNT log record, and the teardown of the log.
> > 
> > This earlier flush prevents a late write of the superblock that races with
> > the filesystem shutdown.
> 
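To make the proposed ordering concrete, here is a minimal user-space
sketch of it. Everything in it is an assumption made for illustration:
the model_* names are hypothetical and are not XFS functions, and the
"workers" are reduced to a counter of pending writes. It only shows why
draining the workers before the UNMOUNT record leaves nothing to be
written after it; it says nothing about which worker is responsible.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the unmount ordering.
 * None of these names are real XFS structures or functions.
 */
struct model_mount {
	int  pending_worker_io;       /* writes background workers still want to issue */
	bool unmount_record_written;  /* true once the UNMOUNT record is in the log */
	int  late_writes;             /* writes issued after the UNMOUNT record */
};

/* Issue one write; count it as late if the UNMOUNT record is already written. */
static void model_issue_write(struct model_mount *mp)
{
	if (mp->unmount_record_written)
		mp->late_writes++;
}

/* Drain every write the workers still have queued. */
static void model_flush_workers(struct model_mount *mp)
{
	while (mp->pending_worker_io > 0) {
		mp->pending_worker_io--;
		model_issue_write(mp);
	}
}

static void model_write_unmount_record(struct model_mount *mp)
{
	mp->unmount_record_written = true;
}

/* Ordering from the patch description: flush, final sb sync, UNMOUNT record. */
static int model_unmount_flush_first(struct model_mount *mp)
{
	model_flush_workers(mp);
	model_issue_write(mp);        /* final superblock sync */
	model_write_unmount_record(mp);
	return mp->late_writes;
}

/* Racy ordering: the workers are only drained after the UNMOUNT record. */
static int model_unmount_flush_last(struct model_mount *mp)
{
	model_issue_write(mp);        /* final superblock sync */
	model_write_unmount_record(mp);
	model_flush_workers(mp);
	return mp->late_writes;
}
```

With two writes pending, the flush-first ordering reports no late
writes, while the flush-last ordering reports both of them as landing
after the UNMOUNT record.
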
> I'm not sure the xfs_sync_work can be responsible for this - the
> xfs_sync_worker() has a MS_ACTIVE guard on it, so it will not log a
> dummy record (superblock) during the unmount procedure, nor does it
> dispatch superblock buffer IO, so it cannot be responsible for the
> item in the log after the unmount record or the IO that is being
> run.
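
For reference, the kind of guard described above can be sketched in
user space as follows. This is only a model under stated assumptions:
model_sb, MODEL_MS_ACTIVE and model_sync_worker() are hypothetical
names, the flag value is arbitrary, and the real xfs_sync_worker() does
more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the MS_ACTIVE superblock flag; value is arbitrary. */
#define MODEL_MS_ACTIVE (1u << 30)

/* Hypothetical stand-in for a superblock; not the kernel's struct super_block. */
struct model_sb {
	unsigned int s_flags;
	int          dummy_records_logged;
};

/*
 * Model of a periodic sync worker with an "active" guard: once the flag
 * has been cleared (as during unmount), the worker logs nothing.
 * Returns true if a dummy record was logged on this pass.
 */
static bool model_sync_worker(struct model_sb *sb)
{
	if (!(sb->s_flags & MODEL_MS_ACTIVE))
		return false;             /* unmount in progress: do nothing */

	sb->dummy_records_logged++;   /* stands in for logging the superblock */
	return true;
}
```

In this model the worker logs a record while the flag is set and becomes
a no-op as soon as it is cleared, which is the behaviour the guard is
meant to give during unmount.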


