Hi all,
I'm in the midst of implementing the infrastructure which will allow
consistency checking among the archives in a multi-archive context.
As part of this, some things, like the PMNS tree and the pmid and
indom hash tables (in __pmLogCtl), are better off being global to all
the archives in the context. So far, so good with those changes.
I'm now looking at the time indices (l_numti and l_ti within
__pmLogCtl). From an internal point of view, these could easily
remain local to each archive within the context, since the index
records for each archive refer only to records within that same
archive. In fact, I already have an implementation of
__pmLogSetTime() which switches to the correct archive before
attempting to use any time indices.
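To illustrate the idea of switching to the right archive first, here
is a minimal sketch. It is not the actual __pmLogSetTime() code: the
archive_span type and select_archive() are hypothetical names, and the
sketch assumes the archives are sorted by start time and do not
overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one archive in a multi-archive context. */
typedef struct {
    double start;   /* time of the first record, in seconds */
    double end;     /* time of the last record, in seconds */
} archive_span;

/*
 * Return the position of the archive whose local time index should be
 * consulted for time t.  A time that falls in a gap between archives
 * maps to the next archive; a time past the end maps to the last one.
 */
int
select_archive(const archive_span *a, int n, double t)
{
    int i;

    assert(a != NULL && n > 0);
    for (i = 0; i < n; i++) {
        if (t <= a[i].end)
            return i;
    }
    return n - 1;
}
```

Once the archive is chosen, its own l_ti array can be searched exactly
as in the single-archive case.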
On the other hand, using time indices which are global to the entire
context might simplify things and improve performance. Some trickery
would still be needed in order to associate each time index with the
correct archive.
Things get more complicated with the realization that a few tools
directly access the time index data structures. These tools are
pmlogrewrite(1), pmdumplog(1) and pmlogcheck(1). I was surprised by
this, but regardless of which way we go with the time indices, the
tools can be updated so as to continue working correctly. In fact,
this would be a good opportunity to change these tools to at least
use internal APIs rather than accessing the data structures
directly.
The problem lies with older versions of these tools attempting to
work with a multi-archive enabled libpcp. All of these tools access
the time index data structures in a way that makes both time index
solutions problematic. The fact that each of these tools iterates
over an array of __pmLogTI makes adding fields to that structure
impossible without breaking them, since the element size compiled
into the older binaries would no longer match. That can be worked
around. However, each tool also has its own problems:
- pmdumplog(1):
  This tool has an option to dump all of the time indices in the
  context. It expects to see all of the time indices in the
  context and to be able to relate each to the proper .meta and
  metric data file.
  - If we choose global time indices, older versions of this
    tool will incorrectly associate all time indices with the name
    of the current archive.
  - If we choose local time indices, older versions of this tool
    will completely miss any time indices which are not part of
    the current archive.
- pmlogcheck(1):
  Part of the checking is to iterate over all of the time indices
  in the current context, looking for inconsistencies.
  - If we choose local time indices, older versions of this tool
    will completely miss any time indices which are not part of
    the current archive.
  - If we choose global time indices:
    - Older versions of this tool will flag the volume numbers
      as not monotonically increasing, since the volume reverts
      to zero for each new archive.
    - The tool also uses stat(2) to examine the .meta file and
      the metric data files, and would incorrectly see only the
      files associated with the current context.
- pmlogrewrite(1):
  This tool iterates through all of the time indices in the
  current __pmLogCtl structure looking for a match with the
  current metric record. If a match is found, it writes a time
  index record to the output archive. This is the only one of the
  three tools for which older versions would still work properly
  regardless of which time index strategy we choose.
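As an aside on the array iteration problem mentioned before the list,
the sketch below shows why appending a field breaks already compiled
tools. The two structs are simplified, hypothetical stand-ins and not
the real __pmLogTI layout. The point is only that an old binary steps
through the array using the old element size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout an older tool was compiled against. */
typedef struct {
    long stamp;
    int  vol;
    long meta_off;
    long log_off;
} old_ti;

/* Hypothetical newer layout with one extra field appended. */
typedef struct {
    long stamp;
    int  vol;
    long meta_off;
    long log_off;
    int  archive;   /* the added field */
} new_ti;

/* Byte offset at which code built for elem_size looks for element i. */
size_t
element_offset(size_t elem_size, size_t i)
{
    return elem_size * i;
}

/* Nonzero if an old binary still lands on element i of a new array. */
int
old_tool_sees_element(size_t i)
{
    return element_offset(sizeof(old_ti), i) ==
           element_offset(sizeof(new_ti), i);
}
```

Only element 0 lines up. From element 1 onwards the old tool reads
from the wrong offset, which is why any extra per-index information
would have to live outside the array itself.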
So using local time indices leaves older versions of two of the
tools broken by omission of results, while using global time indices
leaves them broken by reporting inaccurate and misleading results.
Under those conditions, I'm inclined to stick with the currently
implemented local time indices.
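To make the trade-off concrete, here is a small sketch of the kind of
volume check an older tool might apply. The ti_entry type and
count_volume_regressions() are hypothetical and much simpler than the
real pmlogcheck code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified time index entry: only the volume number. */
typedef struct {
    int vol;
} ti_entry;

/*
 * Count the entries whose volume number is lower than that of the
 * preceding entry, as an older single-archive check might do.
 */
int
count_volume_regressions(const ti_entry *ti, int n)
{
    int i, bad = 0;

    assert(n >= 0);
    for (i = 1; i < n; i++) {
        if (ti[i].vol < ti[i - 1].vol)
            bad++;
    }
    return bad;
}
```

Take two archives whose indices each have volumes 0, 0, 1. With local
indices the old tool sees only the first three entries and reports no
regressions, silently skipping the second archive. With a global index
of 0, 0, 1, 0, 0, 1 it sees all six entries but reports one regression
that is not a real error.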
Thoughts? Ideas?
Dave