Re: [pcp] spreadsheet -> pcp archive tool

To: Greg Banks <gnb@xxxxxxxxxxx>
Subject: Re: [pcp] spreadsheet -> pcp archive tool
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri, 30 Jul 2010 21:24:55 +1000
Cc: "pcp@xxxxxxxxxxx" <pcp@xxxxxxxxxxx>
In-reply-to: <4C50FD14.5090502@xxxxxxxxxxx>
References: <1280108858.2956.3.camel@xxxxxxxxxxxxxxxx> <4C50FD14.5090502@xxxxxxxxxxx>
Reply-to: kenj@xxxxxxxxxxxxxxxx
On Thu, 2010-07-29 at 14:01 +1000, Greg Banks wrote: 
> Ken McDonell wrote:
> > Check out the attached man page.  This is done and working.
> >
> > I'll check this into my tree later today, but this is a heads up and
> > request for comments.
> >   
> Wow, I never thought I'd say this but...the first thing I can see wrong 
> with this manpage is that it *over*documents things.

Some people are hard to please ... 8^)>

> >        from the spreadsheet columns onto the PCP data model.  The file is
> >        written in XML (Version 1.0) and conforms to the syntax defined in the
> You don't need to list the XML version here.

OK

> >        them and will exit with an error message of the form
> >
> >        __pmLogNewFile: "blah.0" already exists, not over-written
> >
> 
> Really, this error message is self-evident.

OK

> > MAPPING CONFIGURATION
> >        The mapfile contains specifications in standard XML 1.0 format.
> For new XML-based formats you really ought to be providing a DTD 
> (they're easy, see http://csharpcomputing.com/XMLTutorial/Lesson8.htm 
> for what seems to be an adequate tutorial) or perhaps a schema (I've 
> never done one of these, but see http://www.w3.org/XML/Schema#dev ).

I feel like I'm recovering from a radical lobotomy after battling
timezones in the Perl/libc/PCP no man's land ... I need some time to
recover before launching into anything new, easy or not.  But I will
follow up on this suggestion.

> >        timezone  Set the source timezone in the PCP archive (the default is to
> >                  use UTC).  Example: timezone="+1100".
> I assume that all timezone name formats that work with tzset() work 
> here?  If so you should say so.

Nope.  I finally nutted this out in iostat2pcp, and +HHMM or -HHMM is
the only thing that is going to work here.  I've updated the code and
the documentation to reflect that.
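
To make the accepted format concrete, here is a minimal sketch of a validator for the +HHMM / -HHMM form described above.  It is written in Python purely for illustration (it is not the tool's own code), and the function name parse_utc_offset is hypothetical.

```python
import re

# Exactly a sign followed by four digits, e.g. "+1100" or "-0330".
_OFFSET_RE = re.compile(r"([+-])(\d{2})(\d{2})")


def parse_utc_offset(text):
    """Return the offset from UTC in seconds for a +HHMM / -HHMM string.

    Hypothetical helper: anything else (timezone names, "+11", "+11:00")
    is rejected with ValueError.
    """
    m = _OFFSET_RE.fullmatch(text)
    if m is None:
        raise ValueError("timezone must be +HHMM or -HHMM, got %r" % text)
    sign, hh, mm = m.groups()
    hours, minutes = int(hh), int(mm)
    if minutes > 59:
        raise ValueError("minutes out of range in %r" % text)
    seconds = hours * 3600 + minutes * 60
    return seconds if sign == "+" else -seconds
```

With this sketch, timezone="+1100" maps to 11 hours east of UTC, while a tzset()-style name such as "EST" is refused rather than silently misinterpreted.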

> >        Thereafter follow one or more metric specifications of the form
> >        <metric>metricname</metric>.  The metric tag supports the following
> If I understand your design correctly, "Thereafter" is the wrong word: a 
> <sheet> element *contains* one or more <metric> elements.

OK, reworded (your understanding is correct).

> >        pmid      [...] If omitted, the PMID will be automatically assigned
> >                  by pmiAddMetric(3) and this would be the most common case.
> You don't need the words "and this would be the most common case".

OK [shrug].

> >        indom     Each metric may have one or more values associated with it.
> >                  If there is only ever one value, the metric is singular and
> >                  indom should be set to PM_INDOM_NULL which is the default
> >                  case when the indom attribute is omitted.
> This sentence could be a little easier to read.

Not sure how to improve this, other than removing the "only ever" words.

> >                  Otherwise indom should be specified as 2 numbers separated
> >                  by periods (.) to set the domain and ordinal fields of the
> >                  Instance Domain.
> The other paragraphs around this one have a reference to another manpage 
> where e.g. the fields of a pmid are explained.  Something like that 
> would be useful at this point.

Unfortunately this is not documented in any man page ... best option is "see
the __pmInDom_int typedef in <pcp/impl.h>".
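
As a rough illustration of what the two numbers in an indom attribute mean, the sketch below splits a "domain.serial" string and packs the parts into a single integer.  The field widths (9 bits for the domain, 22 bits for the second field) and the name "serial" are my assumptions about the __pmInDom_int layout and should be checked against <pcp/impl.h>; parse_indom is a hypothetical helper written in Python for illustration only.

```python
import re

# ASSUMPTION: bit widths taken from memory of __pmInDom_int; verify
# against <pcp/impl.h> before relying on them.
DOMAIN_BITS = 9
SERIAL_BITS = 22

_INDOM_RE = re.compile(r"(\d+)\.(\d+)", re.ASCII)


def parse_indom(text):
    """Pack a 'domain.serial' string into one integer (hypothetical helper)."""
    m = _INDOM_RE.fullmatch(text)
    if m is None:
        raise ValueError("indom must be two numbers separated by a period")
    domain, serial = int(m.group(1)), int(m.group(2))
    if domain >= (1 << DOMAIN_BITS) or serial >= (1 << SERIAL_BITS):
        raise ValueError("indom field out of range in %r" % text)
    # Domain occupies the high bits, serial the low bits.
    return (domain << SERIAL_BITS) | serial
```

Under these assumptions indom="60.1" names serial 1 within domain 60, and a string such as "60" or "60.1.2" is rejected.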

> >        The <datetime> element defines the column in which a date and time will
> >        be found to form the timestamp in the PCP archive for all the data in
> >        each row of the PCP archive.
> Ok, but how?  Column number?  0-based or 1-based?  Or are columns 
> matched 1-to-1 with <datetime>,<data> and <skip> elements in the order 
> seen in the file?  What happens if there are more columns than those 
> elements?

The order of elements in the XML file matches the column order in the
spreadsheet.  If the number of elements is not equal to the number of
columns, a warning is issued and the additional elements/columns produce
no data values in the output archive (this is now documented).
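
The positional matching described above can be sketched as follows.  This is an illustrative Python model, not the tool's implementation: map_row, the (kind, name) tuples, and the metric names in the usage note are all hypothetical.

```python
import warnings


def map_row(elements, row):
    """Pair mapfile elements with spreadsheet cells by position.

    elements: list of (kind, name) tuples in mapfile order, where kind is
              "datetime", "data" or "skip" (a hypothetical representation).
    row:      list of cell values in spreadsheet column order.
    Returns (timestamp, values) where values maps metric name -> cell.
    """
    if len(elements) != len(row):
        warnings.warn(
            "%d mapfile elements but %d columns; extras are ignored"
            % (len(elements), len(row))
        )
    timestamp = None
    values = {}
    # zip() stops at the shorter sequence, so surplus elements or
    # surplus columns contribute nothing to the output.
    for (kind, name), cell in zip(elements, row):
        if kind == "datetime":
            timestamp = cell
        elif kind == "data":
            values[name] = cell
        # kind == "skip": the column is deliberately ignored
    return timestamp, values
```

For example, with elements [("datetime", None), ("data", "disk.read"), ("skip", None)] and a four-column row, the fourth column triggers the warning and produces no value.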


