Hi Ken,
On Wed, Apr 30, 2014 at 07:28:00AM +1000, Ken McDonell wrote:
> On 30/04/14 05:28, Frank Ch. Eigler wrote:
> >>[...]
> >>I am in the process of adding support for PCP in sosreport[1]. [...]
> >>/var/log/pcp, /var/lib/pcp/config, /etc/pcp,
> >>/etc/pcp.conf, /etc/pcp.env, /etc/pcp.sh
> >
> >Note that /etc/pcp.conf is an input to .env / .sh; the latter are just
> >non-adjustable shell scripts. The variables set inside /etc/pcp.conf
> >can redirect the location of many other bits. For example, instead of
> >hardcoding the /var/log/pcp directory name, sosreport -might- consider
> >getting the PCP_LOG_DIR value out of /etc/pcp.conf.
>
> Further to Frank's suggestions ...
>
> 1. /etc/pcp.env is not that useful to collect
> 2. use /etc/pcp.conf (or source /etc/pcp.env) in your collector script to
> drive the inventory of collection artifacts, e.g. use $PCP_SYSCONF_DIR
> instead of /etc/pcp and use $PCP_LOG_DIR instead of /var/log/pcp and use
> $PCP_VAR_DIR/config instead of /var/lib/pcp/config
> 3. also of value would be the contents of the $PCP_VAR_DIR/pmns directory
> 4. /etc/pcp.sh is probably not useful in this context
> 5. if you're interested in current state, capturing the output from the
> pcp(1) command would be useful
> 6. in $PCP_LOG_DIR, always collect the contents of the pmcd subdirectory and
> the NOTICES* files
Thanks. I've added 5 and 6.
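For point 2, here is a rough sketch of how the collector could derive its paths from /etc/pcp.conf instead of hardcoding them. It assumes the file consists of simple NAME=value lines with '#' comments; the function names parse_pcp_conf and collection_paths are made up for illustration and are not part of sosreport or PCP. The fallback values are just the hardcoded paths mentioned earlier in this thread.

```python
import os


def parse_pcp_conf(text):
    """Return a dict of NAME=value pairs from pcp.conf-style text.

    Assumption: plain NAME=value lines, with '#' starting a comment line.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        env[name.strip()] = value.strip()
    return env


def collection_paths(env):
    """Build the list of paths (and glob patterns) discussed in this thread."""
    sysconf_dir = env.get("PCP_SYSCONF_DIR", "/etc/pcp")
    log_dir = env.get("PCP_LOG_DIR", "/var/log/pcp")
    var_dir = env.get("PCP_VAR_DIR", "/var/lib/pcp")
    return [
        sysconf_dir,
        os.path.join(var_dir, "config"),
        os.path.join(var_dir, "pmns"),
        os.path.join(log_dir, "pmcd"),
        os.path.join(log_dir, "NOTICES*"),  # glob pattern, expanded by the caller
    ]
```

In a plugin this would be fed the contents of /etc/pcp.conf, e.g. collection_paths(parse_pcp_conf(open("/etc/pcp.conf").read())).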
> >>[...] I'd imagine that extra carefulness needs to be taken for
> >>/var/log/pcp in order to avoid collecting stuff (logger data?) that
> >>is bigger than X unless explicitly asked for. I assume they can
> >>grow moderately big, although I don't have any real-world data on
> >>that.
> >
> >The bulk archives (*.[0-9]*, .meta, .index files) certainly grow big:
> >10-20 MB per day per host, kept by default for 14 days. It can blow
> >up multiplicatively for longer-than-default or multiple-host
> >logging. OTOH, the files are highly (90%+) compressible, and provide
> >a good detailed performance overview of the host(s).
>
> If you're able to use heuristics to refine the selection here (there
> are PCP archives below the $PCP_LOG_DIR/pmlogger and $PCP_LOG_DIR/pmmgr
> directories), ...
I assume that for the sosreport use case $PCP_LOG_DIR/pmmgr is less
interesting, as it is mainly used to collect data from a bunch of hosts? Or
is it commonly used to collect data from the local host as well?
> a. pick all the *.log files
> b. restrict the subdirectories to names that match hostname(1) (the others
> are remote machines, and less useful for sosreport I presume)
Good point, it makes little sense to capture anything that does not
belong to the host itself for the sosreport case. I will add this.
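A minimal sketch of how I picture (b), assuming the per-host archives live in subdirectories of $PCP_LOG_DIR/pmlogger named after the host. The function name local_archive_dirs is hypothetical, and trying the short host name as a second candidate is my own guess at what might be needed, not something confirmed in this thread.

```python
import os
import socket


def local_archive_dirs(log_dir, hostname=None, subdir="pmlogger"):
    """Return the archive directories under log_dir/subdir that belong to this host.

    Tries the full host name first, then the short name (text before the
    first dot). Directories for other (remote) hosts are ignored.
    """
    hostname = hostname or socket.gethostname()
    candidates = [hostname]
    short = hostname.split(".")[0]
    if short != hostname:
        candidates.append(short)
    base = os.path.join(log_dir, subdir)
    return [
        os.path.join(base, name)
        for name in candidates
        if os.path.isdir(os.path.join(base, name))
    ]
```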
> c. there is basically one archive currently being written and then archives
> for the last N days ... the current one and yesterday's are probably most
> useful if space becomes an issue (name matching on a pattern like *YYYYMMDD*
> will work, pmdate -1d '%Y%m%d' is helpful here to get yesterday's date
> pattern, else use find ... -mtime -1)
Good point, I might add something like the following default behaviour:
if the total size of the directory is below a threshold, collect everything;
otherwise collect just the last two days.
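Roughly, and only as a sketch of that default: the code below assumes archive file names contain a YYYYMMDD date stamp (the same pattern pmdate '%Y%m%d' would produce) and always keeps the *.log files, as suggested in (a). The names dir_size and select_archives and the threshold parameter are placeholders I made up; the actual threshold value is still to be decided.

```python
import datetime
import os


def dir_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def select_archives(path, threshold_bytes, today=None):
    """Pick the entries in path to collect.

    Below the threshold everything is returned. Otherwise only *.log files
    and entries whose name contains today's or yesterday's YYYYMMDD stamp.
    """
    names = sorted(os.listdir(path))
    if dir_size(path) < threshold_bytes:
        return names
    today = today or datetime.date.today()
    stamps = [
        (today - datetime.timedelta(days=n)).strftime("%Y%m%d") for n in (0, 1)
    ]
    return [
        name
        for name in names
        if name.endswith(".log") or any(stamp in name for stamp in stamps)
    ]
```

Matching on the date in the name rather than on mtime keeps the selection independent of when the files were last touched, but the find ... -mtime -1 approach would work as a fallback.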
> If you need someone to review / check a collection or exercise your changes
> just let us know.
Thanks, I'll likely get back to you on this offer ;)
Kind regards,
Michele
--
Michele Baldessari <michele@xxxxxxxxxx>
C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D