Comment # 4
on bug 1046
from Ken McDonell
Frank, thanks for the investigation.
Your plan matches what I expected would be involved.
> - change libpcp/src/logmeta.c addindom and/or searchindom to merge
> rather than replace new instlist/namelist entries
Are you suggesting reconstructing the temporal series of indom sets as per the
current format? I have always been concerned that this is a huge VM soak for
any application that opens an archive with a big, dynamic indom. If we're
going to attack this area, perhaps we should consider a more efficient data
structure and processing algorithms ... certainly lazy instantiation is an
option, because this data is often not used at all (the application is
interested in some _other_ metric from the archive), or the common use involves
mapping internal instance ids to external instance names at some point in time.
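
To make the lazy instantiation idea concrete, here is a rough sketch, not a proposal for the actual libpcp structures. Every name in it (lazy_indom_t, indom_loader_t, lazy_lookup, demo_load) is hypothetical. It assumes each indom record is indexed only by timestamp and file offset when the archive is opened, and that the instance and name lists are read in the first time somebody asks for an id-to-name mapping at a time covered by that record.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loader: reads one indom record found at 'offset' and
 * returns the instance count, or -1 on error. */
typedef int (*indom_loader_t)(long offset, int **inst, char ***name);

/* Hypothetical per-record index entry; only stamp and offset are
 * filled in when the archive is opened. */
typedef struct {
    double stamp;    /* time from which this indom set is current */
    long   offset;   /* where the record lives in the metadata file */
    int    numinst;  /* -1 until the record has been loaded */
    int   *inst;     /* internal instance ids, valid once loaded */
    char **name;     /* external instance names, valid once loaded */
} lazy_indom_t;

/* Map an internal instance id to its external name as of time 'when'.
 * recs[] must be sorted by ascending stamp. Returns NULL if there is
 * no record at or before 'when', or the id is not in that record. */
const char *lazy_lookup(lazy_indom_t *recs, int nrecs, double when,
                        int inst, indom_loader_t load)
{
    int lo = 0, hi = nrecs - 1, pick = -1;

    assert(load != NULL);
    /* binary search for the latest record with stamp <= when */
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (recs[mid].stamp <= when) { pick = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    if (pick < 0)
        return NULL;

    lazy_indom_t *r = &recs[pick];
    if (r->numinst < 0) {
        /* first touch: only now pay the memory cost for this record */
        r->numinst = load(r->offset, &r->inst, &r->name);
        if (r->numinst < 0)
            return NULL;
    }
    for (int i = 0; i < r->numinst; i++)
        if (r->inst[i] == inst)
            return r->name[i];
    return NULL;
}

/* Stand-in loader for experimenting; a real one would read the file. */
int demo_loads = 0;
int demo_load(long offset, int **inst, char ***name)
{
    static int   ids_a[]   = { 1, 2 };
    static char *names_a[] = { "sda", "sdb" };
    static int   ids_b[]   = { 2, 3 };
    static char *names_b[] = { "sdb", "sdc" };

    demo_loads++;
    if (offset == 0) { *inst = ids_a; *name = names_a; }
    else             { *inst = ids_b; *name = names_b; }
    return 2;
}
```

An application that never asks about this indom never triggers a load, and one that asks about a single point in time loads a single record.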
> - change pmlogger/src/callback.c log_callback to handle the needindom=1
> case's numval^2 search with more finesse, that is to identify missing
> inst#'s individually, and emit a smaller incremental __pmLogPutInDom.
>
> (We are assuming that instance strings never change. That's not quite
> correct: proc.* pid<->name strings should vary as processes exec(),
> but the linux_proc pmda happens not to track that.)
Since you need to identify new instances and dropped instances, I am not sure
this can be done better than a 2*O(numval) algorithm. I don't follow the
reference to strings ... it is the internal instance ids that matter so I
think this should be a search and mark operation over a set of integer keys.
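
For what it's worth, a search and mark over integer keys could look something like the sketch below. This is illustrative only and is not the pmlogger code: indom_delta_t, indom_diff and the small open-addressing hash table are all hypothetical, and it assumes instance ids are unique within each set. It makes a few linear passes: load the previous ids into the table, mark the ones that reappear in the current set (anything not found is new), then collect the previous ids that were never marked (dropped).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result: which instance ids appeared and disappeared. */
typedef struct {
    int *added;     /* ids in cur but not in prev, in cur order */
    int  n_added;
    int *dropped;   /* ids in prev but not in cur, in prev order */
    int  n_dropped;
} indom_delta_t;

typedef struct {
    int  key;
    char used;      /* slot holds an id from prev */
    char seen;      /* that id was also found in cur */
} slot_t;

/* Linear probing; cap is a power of two and the table is at most half
 * full, so the probe always ends at the key or at an empty slot. */
static size_t slot_for(const slot_t *tab, size_t cap, int key)
{
    size_t i = ((unsigned int)key * 2654435761u) & (cap - 1);
    while (tab[i].used && tab[i].key != key)
        i = (i + 1) & (cap - 1);
    return i;
}

/* Returns 0 on success, -1 if memory could not be allocated.
 * On success the caller frees out->added and out->dropped. */
int indom_diff(const int *prev, int n_prev, const int *cur, int n_cur,
               indom_delta_t *out)
{
    size_t cap = 8;

    assert(n_prev >= 0 && n_cur >= 0 && out != NULL);
    while (cap < 2 * (size_t)n_prev)
        cap <<= 1;

    slot_t *tab = calloc(cap, sizeof(*tab));
    out->added = malloc((size_t)(n_cur > 0 ? n_cur : 1) * sizeof(int));
    out->dropped = malloc((size_t)(n_prev > 0 ? n_prev : 1) * sizeof(int));
    out->n_added = out->n_dropped = 0;
    if (tab == NULL || out->added == NULL || out->dropped == NULL) {
        free(tab); free(out->added); free(out->dropped);
        out->added = out->dropped = NULL;
        return -1;
    }

    /* pass 1: remember every previous id */
    for (int i = 0; i < n_prev; i++) {
        size_t s = slot_for(tab, cap, prev[i]);
        tab[s].key = prev[i];
        tab[s].used = 1;
    }
    /* pass 2: mark ids that are still present, collect the new ones */
    for (int i = 0; i < n_cur; i++) {
        size_t s = slot_for(tab, cap, cur[i]);
        if (tab[s].used)
            tab[s].seen = 1;
        else
            out->added[out->n_added++] = cur[i];
    }
    /* pass 3: previous ids that were never marked have been dropped */
    for (int i = 0; i < n_prev; i++) {
        size_t s = slot_for(tab, cap, prev[i]);
        if (!tab[s].seen)
            out->dropped[out->n_dropped++] = prev[i];
    }
    free(tab);
    return 0;
}
```

The instance name strings never enter into it; only the added ids would need their names fetched for an incremental record.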
And I'm with Nathan, bump the archive version number to V.3 to accommodate the
migration.