Re: [PATCH] Freeze bdevs when freezing processes.

To: David Chinner <dgc@xxxxxxx>
Subject: Re: [PATCH] Freeze bdevs when freezing processes.
From: Pavel Machek <pavel@xxxxxx>
Date: Wed, 25 Oct 2006 10:10:01 +0200
Cc: "Rafael J. Wysocki" <rjw@xxxxxxx>, Nigel Cunningham <ncunningham@xxxxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20061025001331.GP8394166@xxxxxxxxxxxxxxxxx>
References: <1161576735.3466.7.camel@xxxxxxxxxxxxxxxxxx> <200610231236.54317.rjw@xxxxxxx> <20061024144446.GD11034@xxxxxxxxxxxxxxxxx> <200610241730.00488.rjw@xxxxxxx> <20061024163345.GG11034@xxxxxxxxxxxxxxxxx> <20061024213737.GD5662@xxxxxxxxxx> <20061025001331.GP8394166@xxxxxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.11+cvs20060126

> > > > Do you mean calling sys_sync() after the userspace has been frozen
> > > > may not be sufficient?
> > > 
> > > In most cases it probably is, but sys_sync() doesn't provide any
> > > guarantees that the filesystem is not being used or written to after
> > > it completes. Given that every so often I hear about an XFS filesystem
> > > that was corrupted by suspend, I don't think this is sufficient...
> > 
> > Userspace is frozen. There's no one that can write to the XFS
> > filesystem.
> Sure, no new userspace processes can write data, but what about the
> internal state of the filesystem?
> All a sync guarantees is that the filesystem is consistent when the
> sync returns and XFS provides this guarantee by writing all data and
> ensuring all metadata changes are logged so if a crash occurs it can
> be recovered (which provides the sync guarantee). Hence after a
> sys_sync(), XFS will still have lots of dirty metadata that needs to
> be written to disk at some time in the future so the transactions
> can be removed from the log.
> This dirty metadata can be flushed at any time, and the dirty state
> is kept in XFS structures and not always in page structures (think
> multipage metadata buffers). Hence I cannot see how suspend can
> guarantee that it has saved all the dirty data in XFS, nor
> restore it correctly on resume. Once you toss dirty metadata that
> is currently in the log, further operations will result in that log
> transaction being overwritten without it ever being written to disk.
> That then means any subsequent operations after resume will corrupt
> the filesystem....
> Hence the only way to correctly rebuild the XFS state on resume is
> to quiesce the filesystem on suspend and thaw it on resume so as to
> trigger log recovery.

No, during suspend/resume, the memory image is saved, and no state is
lost. We would not even have to do sys_sync(), and suspend/resume
would still work properly.
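
To make the two positions concrete, here is a toy model, not XFS or
swsusp code. ToyFS, suspend() and resume() are hypothetical names, and
the model assumes the saved image contains all in-memory filesystem
state, dirty metadata included. Under that assumption the change
reaches disk after resume; if the dirty metadata were tossed instead,
it would not.

```python
import copy


class ToyFS:
    """Hypothetical stand-in for a journalling filesystem; not XFS code."""

    def __init__(self):
        self.disk = {}    # metadata written back to its final location
        self.log = []     # on-disk log of committed transactions
        self.dirty = {}   # in-memory metadata not yet written back

    def modify(self, key, value):
        # The transaction is logged first; the metadata stays dirty in memory.
        self.log.append((key, value))
        self.dirty[key] = value

    def writeback(self):
        # Flush dirty metadata, after which the log entries can be dropped.
        self.disk.update(self.dirty)
        self.dirty.clear()
        self.log.clear()


def suspend(fs):
    # Assumption: the image captures all memory, dirty metadata included.
    return copy.deepcopy(fs.dirty)


def resume(fs, image):
    fs.dirty = copy.deepcopy(image)


# Case 1: the whole memory image is saved and restored.
fs = ToyFS()
fs.modify("inode 7", "size=4096")
image = suspend(fs)
fs.dirty = {}            # power off: RAM contents are gone
resume(fs, image)        # dirty metadata is back exactly as it was
fs.writeback()
restored = dict(fs.disk)

# Case 2: the dirty metadata is tossed and never restored, so writeback
# has nothing to flush and the logged change never reaches disk.
lossy = ToyFS()
lossy.modify("inode 7", "size=4096")
lossy.dirty = {}
lossy.writeback()
lost = dict(lossy.disk)
```

Whether the real image always covers every dirty structure is exactly
the question under discussion; the sketch only shows what follows from
each answer.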

sys_sync() is there only to limit damage in case of suspend/resume

