xfs

Power loss and zero-length files

To: xfs@xxxxxxxxxxx
Subject: Power loss and zero-length files
From: Robert Widmer <robertwidmer@xxxxxxxxx>
Date: Fri, 23 Aug 2013 10:59:50 -0400
Delivered-to: xfs@xxxxxxxxxxx
I had a script that updated several files on an XFS filesystem using "sed -i". Someone power cycled the box without running sync after the script finished, and all the files that had been updated were zero-length afterwards.

Curious, I ran the following script to try to isolate the behavior:


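#!/usr/bin/perl
use strict;
use warnings;

my $dir = "/home/$ENV{USER}/XFSTest";
mkdir $dir;    # failure is ignored so an existing directory is reused
chdir $dir or die "chdir $dir: $!";

my $filecount = 100;
my $tmpfile   = 'file.tmp';

# Loop forever: write a temp file, then rename it over file.N.
# No fsync or sync is issued anywhere in the loop.
while (1) {
    for (my $i = 0; $i < $filecount; $i++) {
        my $filename = "file.$i";
        open(my $out, '>', $tmpfile) or die "open $tmpfile: $!";
        print $out "Time:" . localtime() . "\n";
        close $out;
        rename($tmpfile, $filename) or die "rename $tmpfile: $!";
    }
}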


I ran it on the following releases/kernels in a VM:

Fedora 16 w/kernel 3.1.0-7.fc16.x86_64
Fedora 16 w/kernel 3.6.11-4.fc16.x86_64
Fedora 19 w/kernel 3.10-7.200.fc19.x86_64
Ubuntu 13.04 w/kernel 3.8.0-19-generic


After a power cycle, all of the files are zero-length with no extents.

(On CentOS 6.4 w/kernel 2.6.32-358.14.1.el6.centos.plus.x86_64 the files contain binary NULLs instead.)

Barriers are not disabled, and the kernel reports the drive cache as follows:
[    2.145011] sd 2:0:0:0: [sda] Cache data unavailable
[    2.145013] sd 2:0:0:0: [sda] Assuming drive cache: write through


The closest thing I can find in the documentation is the XFS FAQ, which mentions "you are looking at an inode which was flushed out, but whose data was not". That seems to indicate that inode writes and data writes are not done in order, but nothing explicitly documents this.

Is this expected behavior?

I've added a sync to the end of my script to try to ensure this does not happen again. Losing some amount of data after a power loss is expected, but it seems counter-intuitive that the inode and data writes are not done in order, and that rapid file changes can leave such a large number of files zero-length.
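
For reference, a per-file alternative to a global sync would be to fsync each temp file before renaming it. The following is only a minimal sketch of that idea, not something I have tested across a power cycle. The helper name write_and_rename is hypothetical, and it relies on the sync method from IO::Handle, which calls fsync on the handle. It does not fsync the containing directory, which is often recommended as well so that the rename itself is durable.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the flush() and sync() methods on file handles

# Hypothetical helper (not part of the original script): write $content to
# $tmpfile, push it to stable storage, then rename it over $filename.
sub write_and_rename {
    my ($tmpfile, $filename, $content) = @_;

    open(my $out, '>', $tmpfile) or die "open $tmpfile: $!";
    print {$out} $content        or die "print $tmpfile: $!";
    $out->flush                  or die "flush $tmpfile: $!";  # Perl buffer -> kernel
    $out->sync                   or die "fsync $tmpfile: $!";  # kernel -> disk via fsync
    close($out)                  or die "close $tmpfile: $!";

    # Rename only after the data has been fsynced.
    rename($tmpfile, $filename)  or die "rename $tmpfile -> $filename: $!";
    return;
}
```

In the test loop above, this would replace the open/print/close/rename sequence with a call such as write_and_rename($tmpfile, $filename, "Time:" . localtime() . "\n"). It is much slower than the unsynced loop, since every iteration waits for the disk.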
