Hi -
> Have just come across commit 5a3a2494 in fche/dev, and it seems
> I was too late sending this mail as the following has already
> been attempted.
Yup.
> If only the missing QA for pmmgr was tackled so vigorously! ;) -- I
> would be so happy.
The anticipation will be worth it. ;-) Sorry it's been taking so long.
> ["logging durability: add fsync(2) when closing log archives"]
>
> Can you describe the testing done for this change or planned?
The pcpqa suite, plus checking with strace that the fsync(2) calls are
going out at the right time. There are still some known problems.
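
For anyone following along, the general pattern is roughly the
following. This is only an illustrative Python sketch, not the code in
the commit; close_durably and write_archive_volume are hypothetical
names made up for the example.

```python
import os
import tempfile

def close_durably(f):
    """Flush, fsync, then close a binary file object (hypothetical helper)."""
    f.flush()              # push Python's user-space buffer to the kernel
    os.fsync(f.fileno())   # ask the kernel to push the file's data to storage
    f.close()

def write_archive_volume(path, payload):
    """Write payload to path and close it durably; returns bytes written."""
    f = open(path, "wb")
    try:
        n = f.write(payload)
    finally:
        close_durably(f)   # runs whether or not the write succeeded
    return n
```

The ordering matters: fsync only covers what has already reached the
kernel, so the user-space flush has to come first.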
> Also, thoughts on the impact the newly introduced latency will have
> on these code paths and the tools/scripts using them? [...] Note
> also the mentions she makes of performance implications, important
> for us folks attempting to minimise our impact on the observed
> system.
This one's tricky to measure well. fsync does not increase the amount
of I/O; it only compresses it along the time axis. The latency an
archive-producing tool suffers at exit varies with the speed/volume of
its output generation: batch pmlog* manipulation tools will be affected
far more than pmlogger. I'll try to measure this by hand.
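
A by-hand measurement could look something like this. It is a rough
Python sketch under my own assumptions (a single 1 MiB write, one
sample each way), and time_close and compare are hypothetical helpers,
not anything in PCP.

```python
import os
import tempfile
import time

def time_close(path, payload, sync):
    """Return seconds spent in flush/(fsync)/close after writing payload."""
    with open(path, "wb") as f:
        f.write(payload)
        start = time.perf_counter()
        f.flush()
        if sync:
            os.fsync(f.fileno())
    # leaving the with-block closes the file, so close is included in the timing
    return time.perf_counter() - start

def compare(size=1 << 20, directory=None):
    """Time one close without and one with fsync; returns (plain, synced)."""
    with tempfile.TemporaryDirectory(dir=directory) as d:
        payload = os.urandom(size)
        plain = time_close(os.path.join(d, "plain"), payload, sync=False)
        synced = time_close(os.path.join(d, "synced"), payload, sync=True)
    return plain, synced
```

Pass directory= to point it at the filesystem that actually holds the
archives; the default temporary directory may sit on tmpfs, where fsync
costs next to nothing and the numbers would be misleading.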
> There's also other subtleties discussed on the fsync(2) man page,
> it's well worth a careful read too - the directory paragraph is
> relevant to our needs.
Yup. We're a long way from the sort of robustness guarantees we might
like to have. This was just "low-hanging fruit" as they say, and
should mostly solve one particular problem you had encountered.
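
On the directory point: fsync on the file alone does not necessarily
make the new directory entry durable, so a fuller fix would also sync
the containing directory. A minimal sketch in Python, assuming a POSIX
system where a directory can be opened read-only and fsync'd;
fsync_directory and create_durably are hypothetical names, and none of
this is in the commit.

```python
import os
import tempfile

def fsync_directory(dirpath):
    """fsync the directory itself so new entries in it reach storage (POSIX)."""
    fd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def create_durably(path, payload):
    """Write and fsync a new file, then fsync its parent directory."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    # the file's data is synced; now sync the entry that names it
    fsync_directory(os.path.dirname(os.path.abspath(path)))
```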
- FChE