Hi Ken,
----- Original Message -----
> On 14/04/14 10:49, Nathan Scott wrote:
> > ...
> > There is a metadata vs data ordering assumption, I believe. We need to
> > fsync the newly created data files before the rename
> [...]
>
> Nathan,
>
> So before exit(), you're suggesting fsync() on each of the data files,
> and I think I need fsync() on the container directory as well.
Just before the renames, we need to ensure the new data is on the platter.
So, fsync the new files before renaming, and we're done. We do not need
to fsync again, later, on exit - this is simply unnecessary latency and
doesn't fix the problem.
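Roughly, the ordering I mean looks like the sketch below. This is not
existing PCP code: replace_file() and its arguments are hypothetical names
made up for illustration, and the error handling is kept minimal.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PCP): write len bytes from buf into
 * tmppath, force them to stable storage, then rename over finalpath.
 * Returns 0 on success, -1 on failure (temporary file is removed).
 */
static int
replace_file(const char *tmppath, const char *finalpath,
             const void *buf, size_t len)
{
    const char *p = buf;
    int fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is written */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            goto fail;
        p += n;
        len -= (size_t)n;
    }

    /* the important step: flush the new data *before* the rename */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        unlink(tmppath);
        return -1;
    }

    /* only now replace the old file with the new one */
    if (rename(tmppath, finalpath) < 0) {
        unlink(tmppath);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmppath);
    return -1;
}
```

The only point of the sketch is the ordering: fsync on the new file first,
rename second, and no further fsync at exit.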
> But we should be consistent. If this is the "right" way to do it then
> surely all applications that can write PCP archives should do the same
> thing.
It is only this rename-over-the-top situation that requires a preceding
fsync to close that complete-data-loss gap.
> I am not against doing this, although if one was concerned at this level
> then I suspect an option to enforce O_SYNC might be better to guarantee
> on disk for all writes, not just flushing everything at exit, but we
> should choose one policy for writing PCP archives and implement it
> consistently throughout the PCP ecosystem.
O_SYNC? Yikes, I prefer not - there will be widespread introduction of
latency on write; and preferably not just-fsync-everything-always-to-be-sure
either. These approaches introduce potentially very large delays, in new
and sometimes highly undesirable places. And it'd be all so unnecessary.
Worse, they *still* leave the window for total data loss (although of course
they drastically shrink the time window in which it can happen).
Everything is fine as-is, except for this rename situation which has a
small potential-total-loss time window without fsync. A surgical incision
is what we need here, not a shotgun.
cheers.
--
Nathan