pcp

Re: pcp updates - qa and pmlogmv

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: pcp updates - qa and pmlogmv
From: fche@xxxxxxxxxx (Frank Ch. Eigler)
Date: Tue, 25 Mar 2014 23:22:28 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <5331FF3C.20304@xxxxxxxxxxxxxxxx> (Ken McDonell's message of "Wed, 26 Mar 2014 09:12:12 +1100")
References: <5331FF3C.20304@xxxxxxxxxxxxxxxx>
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)
kenj wrote:

> pmlogmv is new -- deals with an itch, solves a production problem I
> had yesterday, and is needed for one of the agreed pmlogger_daily
> changes that is on my plate. [...]

Sounds interesting, particularly after the multithreading matters
the other day.

> commit d1779bce78e82455c9c7ae29267168ba6ba1e875
> Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
> Date:   Wed Mar 26 09:01:57 2014 +1100
>
>     pmlogmv (new) - atomic move/rename PCP archive files
>     
>     New shell script to atomically move (rename) all the physical
>     files in a PCP archive.  Aside from direct use, the justification
>     is to enable an optimization for the common "one archive case" in
>     pmlogger_daily.

Could you elaborate what kind of atomicity this is intended to guarantee?
The man page says:

       Because PCP archives are important records of system activity,  special
       care  is  taken  to ensure the integrity of an archive's files.  Should
       any problem be encountered during the execution  of  pmlogmv,  all  the
       files  associated with oldname will be preserved, and no new files with
       the newname prefix will be created.  In the event of a system crash, at
       least one of oldname or newname will be preserved.

One might reword the last and second-last sentences to indicate that a
system crash is not an example of "any problem", so weaker guarantees
apply.

Looking at the script itself, it's not clear it can deliver even the
first guarantee as is.  Maybe the most clear-cut problem is in the
unlink stage at the end, where the stat(1) "Links:" count is used to
validate the state of each input file.  That appears subject to the
common TOCTTOU vulnerability: a race between the time the link counts
are checked and the time the results are used.  If multiple pmlogmv
scripts run concurrently with the same inputs, the results are
indeterminate.
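
One way to close that window is to serialize runs on the same archive,
so the check and the unlink happen under a lock.  A minimal sketch,
assuming every caller cooperates; with_archive_lock and the ".lock"
suffix are hypothetical names, not anything in pmlogmv, and a crash
would leave a stale lock directory to clean up by hand:

```shell
#!/bin/sh
# Hypothetical helper, not taken from pmlogmv.
# mkdir either creates the directory or fails, in one step, so it can
# serve as a test-and-set between cooperating scripts.
with_archive_lock()
{
    lockdir="$1.lock"       # assumed lock naming convention
    shift
    if mkdir "$lockdir" 2>/dev/null
    then
        "$@"                # run the check-and-unlink step under the lock
        sts=$?
        rmdir "$lockdir"
        return $sts
    else
        echo "with_archive_lock: $lockdir is held by another process" >&2
        return 1
    fi
}
```

Usage would be along the lines of "with_archive_lock $old some_command",
with the second concurrent caller failing cleanly instead of racing.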

Another scenario: if a detected error occurs during the unlink loop,
it appears possible for some files to disappear.  Say the first two
files from $tmp.old were successfully handled (including the unlink of
the $old name), but then a failure occurred for the third.  At this
point, _cleanup nukes all the $tmp.new files, leaving no trace of the
first two at all.  You might need to track more accurate
transaction-log or undo-log type data, not just the $tmp.old / $tmp.new
lists, to correct this part.
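
To illustrate the undo-log idea, here is a rough sketch; it is not the
pmlogmv code.  The function names, the log format and the
"prefix.suffix" argument convention are all made up for the example,
file names are assumed to contain no whitespace, and it makes no
attempt to survive a system crash:

```shell
#!/bin/sh
# Hypothetical sketch of an undo log for a link-then-unlink rename.

# undo logfile: for each logged "old new" pair, bring the old name
# back if it has gone, then drop the new name.
undo()
{
    while read -r o n
    do
        if [ ! -e "$o" ]
        then
            ln "$n" "$o" || return 1
        fi
        rm -f "$n"
    done <"$1"
}

# move_with_undo logfile oldprefix newprefix suffix...
move_with_undo()
{
    log="$1" old="$2" new="$3"
    shift 3
    : >"$log" || return 1
    # Phase 1: give every file a second name; old names are untouched.
    # Only links created by this run are logged, so undo never removes
    # a file that existed beforehand.
    for sfx in "$@"
    do
        ln "$old.$sfx" "$new.$sfx" || { undo "$log"; return 1; }
        echo "$old.$sfx $new.$sfx" >>"$log"
    done
    # Phase 2: drop the old names; on error the log restores them.
    for sfx in "$@"
    do
        rm "$old.$sfx" || { undo "$log"; return 1; }
    done
    rm -f "$log"
}
```

The point is that cleanup is driven by what was actually done, one
entry per file, so a failure on the third file restores the first two
rather than deleting their only remaining names.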

(There might be other problems, these are just two that jumped out.)


- FChE
