Hi -
> > [...] AFAIK, none of our tools append to existing archives, so
> > I'm not sure what kind of multi-write-pass processing you are
> > referring to.
> Yes, none of the tools do this - twas a general API comment, that we
> should not assume these things have never been done (from some other
> API user) or will never be done (by either others or ourselves).
> [...]
> There's no guarantee of that though, and an API should not force it
> to be that way. Client tools are free to open & close multiple
> times [...]
The API does not support opening an existing archive for
writing/appending. None of our tools do it. We haven't heard of
tools that do it, never mind hearing that they would be unacceptably
affected by fsync-on-close. Let's not worry about it.
The day the API is extended and/or such fsync-troubled tools appear, we
can fit in flushing-control parameters. Or even then, the
PCP_NO_FSYNC environment variable (coming soon to fche/dev) may
satisfy them.
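To make the shape of that opt-out concrete, here is a minimal sketch of
an fsync-on-close guarded by the PCP_NO_FSYNC variable. The helper name
and calling convention are purely illustrative, not the actual PCP
archive API:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Illustrative sketch only: flush a log file's data to stable storage
 * on close, unless the user has opted out via PCP_NO_FSYNC.  The
 * function name and signature are hypothetical. */
static int
archive_close(FILE *f)
{
    const char *skip = getenv("PCP_NO_FSYNC");

    if (fflush(f) != 0)                  /* push stdio buffers to the kernel */
        return -1;
    if (skip == NULL || *skip == '\0') {
        if (fsync(fileno(f)) != 0)       /* force kernel buffers to disk */
            return -1;
    }
    return fclose(f);
}
```

Note the fsync happens once, at close time, not per log record; setting
PCP_NO_FSYNC to any non-empty value would skip only the fsync step,
never the ordinary flush-and-close.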
> > [...] I suggest defaulting to greater data-safety rather than
> > less. This is the same default used for modern text editors, git,
> > and of course databases of all kinds: they all fsync on close by
> > default.
>
> These are not performance analysis tools, which one might be using
> to study the effects of fsync (or disk access patterns in general,
> that could be disrupted by increased sync activity).
If a system is hypersensitive to I/O, it should not be running a
pmlogger at all, and should be remotely logged instead.
If a sysadmin is trying to analyze pmlogger's own impact, then
including any data-safety measures in that impact is entirely
appropriate.
> > As to "for daily log rotation", are you suggesting that pmlogger's own
> > fresh/original output (which has the lowest write volume/rate, thus
> > the lowest cost for fsync's)
>
> ("thus the lowest cost" - this is a bit of an oversimplification
> FWIW, and is not necessarily an accurate assertion - see below).
I didn't see any contradiction. If you're referring to a pmlogger
many-samples-per-second scenario, it would have to be something that:
a) runs pmlogger at a high data rate (contrast with the normal default
of some 8000 bytes once a minute, something even a floppy disk can
hack)
b) stores pmlogger data locally (or else disk i/o issues are
irrelevant), but any background disk flushing traffic is not a
problem during this critical performance period
c) considers the CPU/network load imposed by the high-data-rate
polling of local pmcd as not a problem either during this critical
performance period
d) causes a large quantity of pmlogger output to be buffered by the
kernel (or else fsyncing it wouldn't be a problem), but this memory
   consumption is not a problem either during this critical
   performance period
e) must quit pmlogger during the critical performance period, so as to
trigger the controversial fsync-on-close. (Note again that no one
here has advocated having pmlogger issue an fsync for every log
record write.)
I don't see how all of a..e can hold in a sensible situation.
- FChE