To: linux xfs mailing list
Cc: madduck@madduck.net
Subject: Re: the thing with the binary zeroes
From: Andi Kleen
Date: Fri, 11 Feb 2005 14:00:30 +0100
In-Reply-To: <20050211121829.GA30049@localhost.localdomain> (martin f. krafft's message of "Fri, 11 Feb 2005 13:18:29 +0100")

martin f krafft writes:

[Sorry, but that was really explained multiple times. Read the archives again for more details. Here is just a quick explanation.]

> From what I understand, binary zeroes appear to replace file
> contents after a system crash. The FAQ says that this is because the
> file has been allocated, but the system had no time to write, so it
> was empty. What surprises me here is that XFS nulls the file.
> Normally, a newly allocated file is garbage to be overwritten, not
> one zero after the other. Or are the zeroes overlaid on read() when
> the inode is inconsistent?

The new file size (= metadata) has already been flushed to disk, but the actual data hasn't yet. Missing data is a "hole", which reads as all zeros. If the system crashes in this window, you see the hole.

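To make the "hole reads as zeros" point concrete, here is a minimal sketch in Python. It does not reproduce a crash; it only grows a file's size without writing any data, which gives the same read-side behavior described above. The helper name read_hole is made up for illustration.

```python
import os
import tempfile

def read_hole(size=4096):
    """Grow an empty file to `size` bytes without writing data, then read it back.

    Hypothetical helper for illustration only: ftruncate() changes the file
    size (metadata) but writes no file data, so the whole range is a hole.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)          # size changes, no data is written
        os.lseek(fd, 0, os.SEEK_SET)    # rewind to the start of the file
        return os.read(fd, size)        # unwritten ranges read back as zero bytes
    finally:
        os.close(fd)
        os.unlink(path)

data = read_hole()
print(len(data), data == b"\x00" * len(data))
```

After a crash the situation is analogous: the size made it to disk, the data did not, and read() fills the gap with zeros.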
The main reason for this is that the flush delay for file data is much longer (minutes) than for metadata like the file size. Due to the way the XFS log works, metadata tends to be flushed very often. So you often have the updated metadata on disk, but no matching file data yet.

There are various tunables in /proc/sys/vm to make file data be flushed faster (in 2.6 currently /proc/sys/vm/dirty_expire_centisecs; the names of these unfortunately change quite often). You can change that if you want, but in general flushing file data as often as the metadata would ruin performance.

Of course you can sync at any time manually with sync(1), or with fsync(2)/fdatasync(2) in a program.

-Andi
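
For the in-program case, a rough sketch of what an explicit flush could look like in Python, where os.fsync() wraps fsync(2). The helper name write_durably and the file name are made up for illustration, and this is only one way to do it, not a recommendation for every write.

```python
import os
import tempfile

def write_durably(path, data):
    """Write `data` to `path` and request a flush to disk before returning.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # move Python's userspace buffer into the kernel
        os.fsync(f.fileno())    # fsync(2): wait until the kernel has flushed the file

# Usage: write a small file into a scratch directory.
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "example.dat")
write_durably(target, b"important bytes\n")
print(os.path.getsize(target))
```

On Unix, os.fdatasync() can be used in place of os.fsync() when only the file data, and the metadata needed to read it back, must reach the disk. Either call costs a wait on the device, which is the performance trade-off mentioned above.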