
Re: GNU 'tar', Schilling's 'tar', write-cache/barrier

To: Peter Grandi <pg_xf2@xxxxxxxxxxxxxxxxxx>
Subject: Re: GNU 'tar', Schilling's 'tar', write-cache/barrier
From: Brian Candler <B.Candler@xxxxxxxxx>
Date: Sat, 24 Mar 2012 17:11:50 +0000
Cc: Linux fs XFS <xfs@xxxxxxxxxxx>
In-reply-to: <20333.62951.784434.92213@xxxxxxxxxxxxxxxxxx>
References: <CAA8mOyDKrWg0QUEHxcD4ocXXD42nJu0TG+sXjC4j2RsigHTcmw@xxxxxxxxxxxxxx> <4F6624A3.5010206@xxxxxxxxxxxxxxxxx> <20331.39194.377610.888636@xxxxxxxxxxxxxxxxxx> <201203232348.09158.Martin@xxxxxxxxxxxx> <20333.8944.573177.821944@xxxxxxxxxxxxxxxxxx> <20333.62951.784434.92213@xxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sat, Mar 24, 2012 at 04:27:19PM +0000, Peter Grandi wrote:
> --------------------------------------------------------------
> #  (cd /tmp/ext4; rm -rf linux-2.6.32; sync; time star -no-fsync -x -f /tmp/linux-2.6.32.tar; egrep 'Dirty|Writeback' /proc/meminfo; time sync)
> star: 37343 blocks + 0 bytes (total of 382392320 bytes = 373430.00k).
> real    0m1.204s
> user    0m0.139s
> sys     0m1.270s
> Dirty:          419456 kB
> Writeback:           0 kB
> real    0m5.012s
> user    0m0.000s
> sys     0m0.458s
> --------------------------------------------------------------
> #  (cd /tmp/ext4; rm -rf linux-2.6.32; sync; time star -x -f /tmp/linux-2.6.32.tar; egrep 'Dirty|Writeback' /proc/meminfo; time sync)
> star: 37343 blocks + 0 bytes (total of 382392320 bytes = 373430.00k).
> real    23m29.346s
> user    0m0.327s
> sys     0m2.280s
> Dirty:             108 kB
> Writeback:           0 kB
> real    0m0.236s
> user    0m0.000s
> sys     0m0.199s

But as a user, what guarantees do I *want* from tar?

I think the only meaningful guarantee I might want is: "if the tar returns
successfully, I want to know that all the files are persisted to disk".  And
of course that's what your final "sync" does, although with the unfortunate
side-effect of syncing all other dirty blocks in the system too.

Calling fsync() after every single file is unpacked does also achieve the
desired guarantee, but at a very high cost.  This is partly because you have
to wait for each fsync() to return [although I guess you could spawn threads
to do them] but also because the disk can't aggregate lots of small writes
into one larger write, even when the filesystem has carefully allocated them
in adjacent blocks.

I think what's needed is a group fsync which says "please ensure this set of
files is all persisted to disk", which is done at the end, or after every N
files.  If such an API exists I don't know of it.
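Lacking a real group fsync, the nearest thing I can sketch in user space is
to write a whole batch of files first and only then fsync() each of them.
This is not the API wished for above: every fsync() still blocks in turn.
The hope is just that, by the time the first one is issued, the kernel
already holds all the dirty data and may be able to write adjacent blocks
together.  The function names and the `batch_size` parameter are my own
invention, and fsync() on a directory opened read-only is assumed to work
as it does on Linux.

```python
import os


def flush_batch(paths):
    """fsync() every file in paths, then each directory that contains one."""
    dirs = set()
    for path in paths:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.fsync(fd)  # blocks until this file's data is on stable storage
        finally:
            os.close(fd)
        dirs.add(os.path.dirname(path))
    # Sync the containing directories so the new names are persisted too.
    # Simplification: parents of newly created directories are not synced.
    for d in dirs:
        fd = os.open(d, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
    return len(paths)


def write_then_fsync_batch(root, files, batch_size=1000):
    """Write (relative_path, bytes) pairs under root, deferring fsync().

    Nothing is synced until batch_size files have been written (or the
    input is exhausted).  Returns the number of files that were fsynced.
    """
    pending = []
    synced = 0
    for relpath, data in files:
        path = os.path.join(root, relpath)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        pending.append(path)
        if len(pending) >= batch_size:
            synced += flush_batch(pending)
            pending.clear()
    synced += flush_batch(pending)  # final, possibly partial, batch
    return synced
```

With `batch_size` larger than the number of files this degenerates into
"unpack everything, then sync just these files", which is the guarantee
described above without flushing unrelated dirty blocks.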

On the flip side, does fsync()ing each individual file buy you anything over
and above the desired guarantee?  Possibly - in theory you could safely
restart an aborted untar even after a system crash.  You would have to be
aware that the last file which was unpacked may only have been partially
written to disk, so you'd have to restart by overwriting the last item in
the archive which already exists on disk.  Maybe star has this feature, I
don't know.  And unlike zip, I don't think tarfiles are indexed, so you'd
still have to read the archive from the beginning to find that point.
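To make the restart idea concrete, here is a rough sketch using Python's
standard `tarfile` module.  It assumes the interrupted run fsync()ed each
file before starting the next, so that only the last file found on disk can
be incomplete.  `resume_extract` is a hypothetical helper, not a feature of
star or GNU tar; it handles regular files only, ignores ownership, modes
and timestamps, and does no sanitising of member names.

```python
import os
import shutil
import tarfile


def resume_extract(archive, dest):
    """Resume an interrupted extraction; return the names (re)written."""
    with tarfile.open(archive) as tar:
        # No index in a tar file: this scans the archive from the start.
        members = [m for m in tar.getmembers() if m.isfile()]

        # Position of the last member that already exists on disk.
        last_existing = None
        for i, m in enumerate(members):
            if os.path.exists(os.path.join(dest, m.name)):
                last_existing = i

        written = []
        for i, m in enumerate(members):
            # Everything before the last existing file is assumed complete.
            if last_existing is not None and i < last_existing:
                continue
            target = os.path.join(dest, m.name)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(m) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
                out.flush()
                os.fsync(out.fileno())  # per-file fsync, as in the slow run
            written.append(m.name)
    return written
```

Note that the scan over the archive happens regardless of how much was
already unpacked, which is the "read it from the beginning" cost.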

If the above benchmark is typical, it suggests that fsyncing after every
file is over 200 times slower than untar followed by sync (about 23m30s
against 1.2s + 5.0s, i.e. roughly 6 seconds).  So I reckon you would be
better off using the fast/unsafe version, and simply restarting it from
the beginning if the system crashed while you were running it.  That's
unless you expect the system to crash a couple of hundred times while you
untar this single file.
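For reference, the ratio can be worked out directly from the quoted `time`
output.  A small sketch: the helper name `to_seconds` is mine, it only
handles the `NmS.SSSs` format shown in the benchmark, and counting the
trailing `sync` as part of each run is my own choice.

```python
import re


def to_seconds(t):
    """Convert a `time`-style string such as '23m29.346s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    if m is None:
        raise ValueError("unrecognised time format: %r" % t)
    return int(m.group(1)) * 60 + float(m.group(2))


# 'real' times from the quoted benchmark: extraction plus the final sync.
no_fsync = to_seconds("0m1.204s") + to_seconds("0m5.012s")
with_fsync = to_seconds("23m29.346s") + to_seconds("0m0.236s")

ratio = with_fsync / no_fsync
print("no fsync: %.1fs, per-file fsync: %.1fs, ratio: %.0fx"
      % (no_fsync, with_fsync, ratio))
```

On these figures the per-file fsync run takes a little under 227 times as
long as the fast run plus its sync.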

Just my 2¢, as a user and definitely not a filesystem expert.


