kenj wrote:
> [...]
> With -u the chance becomes 0% of the archive appearing truncated in
> the absence of a system crash.
Well, almost -- with the fwrite(3) data going out in dribs & drabs
already, all those pre-fflush(3) moments can make the file appear
truncated to another reader. (One might see that with an abort()
inserted between the fwrite()s and fflush()es.)
The -u option is at least useful for a lesser level of correctness,
namely satisfying the pcp-archive.5 invariant that metadata must be
present for metric values in the .0 file, by fflush()ing the .meta
files before writing into the archive.
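
A rough sketch of that ordering, for illustration only. The name
append_record() and its parameters are made up here and are not
pmlogger's actual code; it only assumes two stdio streams, one for the
.meta file and one for the .0 file:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a PCP function): append one metadata record
 * and one data record so that the data never reaches the kernel before
 * the metadata it depends on.  Returns 0 on success, -1 on error.
 */
int append_record(FILE *meta, FILE *data,
                  const void *desc, size_t desc_len,
                  const void *values, size_t values_len)
{
    assert(meta != NULL && data != NULL);

    /* 1. Metadata first, pushed out of the stdio buffer to the kernel. */
    if (fwrite(desc, 1, desc_len, meta) != desc_len)
        return -1;
    if (fflush(meta) != 0)
        return -1;

    /* 2. Only then the metric values that refer to that metadata. */
    if (fwrite(values, 1, values_len, data) != values_len)
        return -1;
    if (fflush(data) != 0)
        return -1;

    return 0;
}
```

Note this only orders what other processes can see via the kernel; it
says nothing about what survives a crash.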
This -u is not sufficient to protect the data from system crashes;
one would need fsync(2) syscalls in there too. Those could be colocated
with the -u fflush()es, or left to the fche/fsync-prototype fclose().
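
For the colocated variant, a minimal sketch might look like the
following. flush_and_sync() is a made-up name, not something in PCP or
in the prototype branch; the point is only that fflush(3) hands the
data to the kernel and fsync(2) then asks the kernel to commit it to
storage:

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PCP function): drain the stdio buffer,
 * then ask the kernel to write the file's data to stable storage.
 * Returns 0 on success, -1 on error (errno set by the failing call).
 */
int flush_and_sync(FILE *fp)
{
    assert(fp != NULL);

    /* User-space buffer -> kernel (visible to other readers). */
    if (fflush(fp) != 0)
        return -1;

    /* Kernel page cache -> storage device (survives a crash). */
    if (fsync(fileno(fp)) != 0)
        return -1;

    return 0;
}
```

Calling this on the .meta stream before touching the .0 stream would
extend the -u ordering to the crash case, at the cost of an fsync per
flush.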
> Not consistent if pmlogger dies or someone tries to read the archive
> while it is being written. It is not an on-disk issue. [...]
(It is, to the extent that some kernel-level write(2)s could occur in
sequences that are inconsistent.)
>> Not clear who (which tools?/code?) benefit from that, if anyone...?
>
> All the loggers run by pmlogger_{check,daily} are candidate beneficiaries.
... as is anyone who loves their data. :-) With the present scheme,
it's not hard to find pmlogger-generated archives that PMAPI refuses
to open. I've got a bunch here, whether resulting from an untimely
pmlogger exit or a system crash/reboot. (Note that our own tools
sometimes SIGKILL an intransigent pmlogger.)
(By the way, the same thing happens to systemd journals on my
machines/VMs with some regularity, and those become write-offs after a
"recovery" consisting of just moving the corrupt files out of place
and letting them rot until GC.)
- FChE