
Re: pmlogger -u questions

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: pmlogger -u questions
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Tue, 15 Apr 2014 07:15:32 +1000
Cc: Nathan Scott <nathans@xxxxxxxxxx>, pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <y0meh104nvl.fsf@xxxxxxxx>
References: <01e901cf56df$4ce97de0$e6bc79a0$@internode.on.net> <1665962954.4723287.1397437104781.JavaMail.zimbra@xxxxxxxxxx> <534B4330.1060008@xxxxxxxxxxxxxxxx> <y0meh104nvl.fsf@xxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
On 14/04/14 12:59, Frank Ch. Eigler wrote:

kenj wrote:

[...]
With -u the chance becomes 0% of the archive appearing truncated in
the absence of a system crash.

Well, almost -- with the fwrite(3) data going out in dribs & drabs
already, all those pre-fflush(3) moments can make the file appear
truncated to another reader.  (One might see that with an abort()
inserted between the fwrite's and fflush's.)

Which is why I was suggesting abandoning stdio buffering for the -u case and making this the default.

This -u is not sufficient to protect the data from system crashes;
one'd need fsync(2) syscalls in there too.  It could be colocated with
the -u fflush()es, or left to the fche/fsync-prototype fclose().
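For illustration only, colocating the fsync(2) with the -u fflush() might look like the sketch below. The helper name flush_and_sync() is hypothetical and is not taken from the pmlogger source; it only shows the two-step order (drain the stdio buffer to the kernel, then ask the kernel to push the data to stable storage).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush the stdio buffer, then fsync the
 * underlying descriptor.  Returns 0 on success, -1 on failure.
 */
int flush_and_sync(FILE *fp)
{
    assert(fp != NULL);

    /* Step 1: move buffered bytes from user space into the kernel. */
    if (fflush(fp) != 0)
        return -1;

    /* Step 2: ask the kernel to commit those bytes to the device. */
    if (fsync(fileno(fp)) != 0)
        return -1;

    return 0;
}
```

Without step 2 the data survives a pmlogger exit but not necessarily a system crash, which is the distinction being drawn above.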

A simpler solution might be to offer O_SYNC I/O as an option.

Not consistent if pmlogger dies or someone tries to read the archive
while it is being written.  It is not an on-disk issue.  [...]

(It is, to the extent that some kernel-level write(2)s could occur in
sequences that are inconsistent.)

If I control the write(2) calls directly (no stdio, as I'm suggesting), then the writes are guaranteed to be in a consistent order. The code already does this above stdio; it is just the stdio buffering that masks some of the writes and emits data that is not aligned to the logical boundaries of the archive records.
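As a rough sketch of the no-stdio idea, each logical record could be handed to write(2) as one buffer, so that whatever reaches the kernel starts and ends on a record boundary. The helper name write_record() is hypothetical. The loop only exists to cope with EINTR and short writes; in the common case a single write(2) call carries the whole record.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one complete record to fd without any
 * stdio buffering.  Returns 0 on success, -1 on error (errno set).
 */
int write_record(int fd, const void *rec, size_t len)
{
    const char *p = rec;

    assert(rec != NULL);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;
        }
        p += n;                 /* short write: continue with the remainder */
        len -= (size_t)n;
    }
    return 0;
}
```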

... With the present scheme,
it's not hard to find pmlogger-generated archives that PMAPI refuses
to open.  I've got a bunch here, whether resulting from an untimely
pmlogger exit or a system crash/reboot.  (Note that our own tools
sometimes SIGKILL an intransigent pmlogger.)

I'll bet SIGKILL has the highest probability. I would expect the "truncated is not corrupted" change (which is independent of the -u issues) to address many of these.

Can we agree that -u semantics and no stdio should be the default behaviour going forward?

And I'll do the "truncated is not corrupted" change as a separate work item.
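To make the "truncated is not corrupted" distinction concrete, here is a sketch under a stated assumption: each record is framed by a 4-byte big-endian total length at the start and the same length repeated at the end. That framing, and the names scan_records() and scan_status, are assumptions for this example and not a description of the real archive format or of PMAPI. A reader that runs out of bytes part-way through a record reports truncation and keeps everything up to the last complete record; only a framing mismatch counts as corruption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum scan_status { SCAN_CLEAN, SCAN_TRUNCATED, SCAN_CORRUPT };

/* Decode a 4-byte big-endian unsigned integer. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk the records in buf[0..n).  On return *good is the offset just
 * past the last complete, well-framed record.
 */
enum scan_status scan_records(const unsigned char *buf, size_t n, size_t *good)
{
    size_t off = 0;

    assert(buf != NULL && good != NULL);
    *good = 0;
    while (off < n) {
        size_t left = n - off;
        uint32_t len;

        if (left < 4)
            return SCAN_TRUNCATED;      /* partial length header */
        len = be32(buf + off);
        if (len < 8)
            return SCAN_CORRUPT;        /* cannot hold header + trailer */
        if (len > left)
            return SCAN_TRUNCATED;      /* record body cut short */
        if (be32(buf + off + len - 4) != len)
            return SCAN_CORRUPT;        /* trailer disagrees with header */
        off += len;
        *good = off;
    }
    return SCAN_CLEAN;
}
```

With something like this, a SCAN_TRUNCATED result lets the reader use the first *good bytes rather than refusing to open the archive.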
