On Thu, 2002-05-02 at 15:14, Greg Freemyer wrote:
> >> On Thu, 2 May 2002, Greg Freemyer wrote:
> >> > XFS team,
> >> >
> >> > I've just become aware of a problem on another Journaled file-system
> >> is not a Linux filesystem).
> >> A NAS device perhaps?
> Thanks for the info, and no it is a normal everyday FS. (And a mature one at
> that. At least 5 years old, but I guess hardware snapshots are relatively
> new.)
> The problem apparently occurs in both direct connect and SAN environments.
> As I understand it the trouble is that some meta-data can be in transition
> even for just a few millisecs. If the hardware snapshot is made at this
> time, you get corruption.
There is always data in transit. If you have a journaled filesystem and
it is working correctly, you should be able to take a hardware-based
snapshot at any point in time and get a consistent filesystem back.
In that scenario the snapshot gives basically the same end result
as a sudden power failure. Your snapshot must be atomic across the
whole volume though. If you are running on top of some sort of RAID
and internally its snapshot is actually a multi-step process, you can
end up with one part of the RAID out of sync with the rest. Multiple
RAID cabinets is the case where that was actually possible.
The combination of that scenario and the fact that you really want to
know what it is you are snapshotting led to the xfs_freeze command
and associated kernel code. It basically stops the filesystem and
simulates an unmount.
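
As a rough illustration of how a freeze would bracket a snapshot, here
is a minimal Python sketch. It assumes the xfs_freeze binary is on the
PATH and that the caller has the privileges to run it. The names
frozen, snapshot_while_frozen, take_snapshot and run are hypothetical
and only exist for this example; take_snapshot stands in for whatever
triggers the hardware snapshot on your array.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(mountpoint, run=subprocess.run):
    """Keep the filesystem at `mountpoint` frozen inside the with-block."""
    # xfs_freeze -f flushes outstanding data and suspends new writes.
    run(["xfs_freeze", "-f", mountpoint], check=True)
    try:
        yield
    finally:
        # Always thaw with xfs_freeze -u, even if the snapshot step
        # failed, so the filesystem is not left blocked.
        run(["xfs_freeze", "-u", mountpoint], check=True)


def snapshot_while_frozen(mountpoint, take_snapshot, run=subprocess.run):
    """Freeze, call the (hypothetical) snapshot hook, then thaw."""
    with frozen(mountpoint, run=run):
        take_snapshot()
```

The `run` parameter is only there so the command runner can be swapped
out for a dry run. Real use would look something like
snapshot_while_frozen("/mnt/data", trigger_array_snapshot), where
trigger_array_snapshot is again a placeholder.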
Some RAID hardware vendors actually have hooks into filesystems so
that they can get a flush to happen at the right instant in time.
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: lord@xxxxxxx