On Tue, 20 Jun 2000 Cameron_C_Caffee@xxxxxxxxxxxxxxxxxx wrote:
> ...
> Regarding the nature of the archives making the processing of spanned archives
> difficult ...
>
> That's too bad. Other performance products are designed to provide for a
> "log-once - read-many" approach which does not require any re-processing
> of logs in order to select a particular range of dates/times for a
> particular host computer. After reviewing the man page for pmlogextract,
> I can agree that several logs for a given host can be re-processed to
> create a single archive file for analysis. The utility also offers an
> opportunity for data reduction through selection of a sub-set of metrics
> for inclusion in the output archive. Obviously, I'd prefer to avoid this
> type of re-processing to obtain the data desired when the archive file
> names and the content of the archives already communicate the information
> necessary to support the desired date/time selection criteria.
I think this is an operational issue to a large extent. If your normal
mode of processing involves archives spanning long durations, then a
simple combination of pmlogger_check, pmlogger_daily, cron and
pmlogextract will allow you to stitch together logs of any desired
duration.
But there are lots of sites where a more useful operational model is
a collection of archives each spanning one day.
pmlogger_daily is biased towards the latter situation, based solely on
arguments of simplicity, i.e. it is easier to construct a weekly
archive from a set of daily archives than to chop a weekly archive
into a set of daily archives.
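For example, a rough sketch (the daily archive base names and the output
name here are hypothetical, and it assumes the daily archives live in the
current directory):

    # merge a week of daily archives into a single weekly archive;
    # the last argument to pmlogextract is the output archive, all
    # preceding arguments are input archives
    pmlogextract 20000612 20000613 20000614 20000615 \
                 20000616 20000617 20000618 weekly.20000612

pmlogextract also accepts -S and -T to narrow the time window, and -c to
name a configuration file listing a subset of the metrics, if data
reduction is wanted at the same time.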
> Regarding the question of multiple nodes ...
>
> I agree that it is a significant design consideration. However, it may
> not be too early for the project to start thinking about a design that
> will facilitate multi-node reporting. When one considers the evolving
> use of clustered machines, the reporting requirement for those
> environments is to reflect the over-all performance and capacity
> measurements for the cluster as a whole. If PCP is to be useful in those
> environments, it will have to accommodate this requirement.
As a historical aside, PCP is not "early in the project" ... many of
the architectural and key design decisions were made 7 years ago.
I think there may be a misunderstanding here. In my previous posting,
I was trying to say that the issues for multiple archives from
different hosts are different to the issues for multiple archives from
a single host.
There is nothing in the PCP approach that restricts monitoring to a
single host ... quite the contrary, the whole client-server
architecture is biased towards arbitrary combinations of monitors and
systems being monitored. The same client can monitor
stats from multiple hosts (or multiple archives) concurrently. We
routinely see one system acting as the pmlogger farm for multiple
hosts, and monitoring tools watching multiple hosts concurrently.
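As a rough sketch (the host name, config file and archive name are all
hypothetical), a central logger host pulls metrics from a remote pmcd
with something like:

    # log the metrics named in the config file from the remote host's
    # pmcd into a local archive, sampling every 30 seconds
    pmlogger -h webhost1 -c config.webhost1 -t 30sec archive.webhost1

pmlogger_check and the pmlogger control file are the usual way to start
and re-start a collection of these loggers, one per monitored host.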
pmie includes logical predicates that extend rules to accommodate
multiple hosts (see the some_host, all_host and N%_host aggregate
operators).
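As a rough sketch of what such a rule can look like (the host names
alpha, beta and gamma, the rule file name and the threshold are all
hypothetical):

    # fire if any one of the three hosts shows a 1-minute load
    # average above 10, re-evaluating once a minute
    cat > loadrule.pmie <<'EOF'
    some_host (
        kernel.all.load :alpha :beta :gamma #'1 minute' > 10 )
            -> print "high 1-minute load average on at least one host";
    EOF
    pmie -t 1min loadrule.pmie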
We have PCP PMDAs that span multiple nodes in a cluster (although I
admit none of these have escaped into the open source release as yet).
So, I think PCP is _really_ well placed to operate in environments with
lots of hosts.
> BTW: Does pmlogextract support a wild-carded input file specification ?
No, at first blush I'd say that is a function for the shell.
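Something along these lines works (a rough sketch; the archive names are
hypothetical and it assumes each daily archive has a single data volume
named <date>.0):

    # let the shell expand the pattern, strip the ".0" data-volume
    # suffix to recover the archive base names, and merge the lot
    # into one output archive (the last argument)
    pmlogextract `ls 200006??.0 | sed -e 's/\.0$//'` june.2000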