Hello,
yes, that is all very promising, especially the idea of script plugins.
I have seen the following data workflow in the real world:
1. the values are acquired, sampled and timestamped by some data
acquisition system
2. then they are written in some form by the utility control system
and used online by the same system for control
3. data is exported, ex post or on the fly, to some kind of archive for
further analysis
4. data filtering (preparation for analysis) is a MUST; here begins
the human work, with some extent of automation
I consider the output of step 3. to be "raw" data; even if it is in csv
it is still raw: it contains gaps where data is missing, or values that
are shifted in some way.
So the analyst MUST do some perl or awk or even bash or excel or hand
tweaking to turn the time series into acceptably filtered data, ready
for the next step -
5. data import, whatever tool it is - Statistica, R, Octave, Gnuplot,
Matlab and possibly PCP in our case.
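
To illustrate the kind of scripted filter that step 4. calls for, here is
a minimal Python sketch. It is only an assumption of how such a filter
might look: the column names, the timestamp format and the sample values
are made up, and the forward-fill policy is just one possible way to
treat gaps. It drops rows whose timestamp cannot be parsed and fills
missing cells with the last known value.

```python
import csv
import io
from datetime import datetime


def clean_rows(text, time_format="%Y-%m-%d %H:%M:%S"):
    """Yield (timestamp, {column: float or None}) for usable rows.

    Rows with an unparsable timestamp are dropped; empty or non-numeric
    cells become None so a later step can decide how to fill them.
    """
    reader = csv.DictReader(io.StringIO(text))
    time_col = reader.fieldnames[0]
    for row in reader:
        try:
            stamp = datetime.strptime(row[time_col].strip(), time_format)
        except (ValueError, AttributeError):
            continue  # garbled timestamp: skip the whole row
        values = {}
        for name in reader.fieldnames[1:]:
            cell = (row[name] or "").strip()
            try:
                values[name] = float(cell)
            except ValueError:
                values[name] = None  # gap in the raw data
        yield stamp, values


def forward_fill(rows):
    """Replace None with the last value seen in the same column."""
    last = {}
    for stamp, values in rows:
        filled = {}
        for name, value in values.items():
            if value is None:
                value = last.get(name)  # stays None if nothing seen yet
            else:
                last[name] = value
            filled[name] = value
        yield stamp, filled


# Hypothetical raw export with two gaps and one garbled row.
RAW = """time,temp_C,pressure_kPa
2010-06-22 10:00:00,71.5,101.2
2010-06-22 10:00:10,,101.4
garbage,72.0,101.3
2010-06-22 10:00:20,72.1,
"""

result = list(forward_fill(clean_rows(RAW)))
for stamp, values in result:
    print(stamp.isoformat(), values)
```

Whether a gap should be filled, interpolated or left empty depends on the
analysis, which is exactly why this step is easier to script than to
hard-code.
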
Typically the users of the R/Octave/Gnuplot bundle do not know that PCP
even exists, but that does not mean it has no value for analysis.
The value is big: any analyst of industrial data knows how important it
is to be able to replay some time windows of data a little bit faster
and compare them with other data.
Also pmie could trigger proactive analysis, etc. I see a lot of
possibilities in using PCP with external timeseries.
In short, I want to say
- a scripted plugin (or perl binding) is much better because of point
4.: scripted filters are a MUST when treating raw data
- there would be many real world examples (if PCP were better known)
- the best way to exchange timeseries is an ascii file in some sort of
csv table, so if there is a ready-to-go csv plugin which requires just
preparing the headers, a lot of users will start with just this
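
To make the "just prepare the headers" idea concrete, here is a small
Python sketch of a reader for such a csv table. The header convention
used here (first column is the time, every other column is
"name [unit]") is purely my assumption for illustration; it is not an
agreed pmimport or PCP format, and the metric names and values are
invented.

```python
import csv
import io
import re

# Hypothetical header convention: "metric.name [unit]", unit optional.
HEADER_CELL = re.compile(
    r"^\s*(?P<name>[A-Za-z][\w.]*)\s*(?:\[(?P<unit>[^\]]*)\])?\s*$"
)


def parse_header(cells):
    """Return [(metric_name, unit)] for every column after the first."""
    metrics = []
    for cell in cells[1:]:
        match = HEADER_CELL.match(cell)
        if match is None:
            raise ValueError(f"unrecognised header cell: {cell!r}")
        metrics.append((match.group("name"), match.group("unit") or ""))
    return metrics


def read_samples(text):
    """Yield (timestamp_text, metric_name, unit, value) per filled cell."""
    reader = csv.reader(io.StringIO(text))
    metrics = parse_header(next(reader))
    for row in reader:
        if not row:
            continue  # skip blank lines
        stamp, cells = row[0], row[1:]
        for (name, unit), cell in zip(metrics, cells):
            if cell.strip():  # empty cell means no sample for this metric
                yield stamp, name, unit, float(cell)


# Made-up table using the assumed header convention.
TABLE = """time,boiler.temp [degC],boiler.pressure [kPa]
2010-06-22 10:00:00,71.5,101.2
2010-06-22 10:00:10,,101.4
"""

samples = list(read_samples(TABLE))
for sample in samples:
    print(sample)
```

With something like this the user only edits the first line of the file,
and the importer knows the metric names and units without any other
configuration.
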
Best regards,
Petr
PS: You all are doing nice work, thank you.
PPS: I even have some ideas on how to make pmchart more informative,
so if you want I will write them down.
PPPS: If you need me to send more samples, let me know.
On Wed, 2010-06-23 at 08:52 +1000, nathans@xxxxxxxxxx wrote:
> ----- "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx> wrote:
>
> > Petr,
> >
> > I'm promoting this discussion to the PCP list, as there may be others
> > with an interest in the topic.
>
> *nod*, yep, very interested.
>
> > I've attached the two man pages for the parts of pmimport (which have
> > been released by SGI, but are not in the PCP tree on oss.sgi.com yet,
> > pending a decision on what to do with pmimport).
>
> We should put them in the tree, just not installed (like the code).
>
> > If there was consensus on exactly how this meta data was to be
> > encoded in the file, then it would be possible to write a generic
> > "csv" plugin for pmimport.
>
> The plan I had in mind for the pmimport tool and API was to make
> a Perl wrapper around a plugin API so that script plugins could
> be written. With that kind of approach we'd be giving punters
> access to CPAN for their data import situations (databases, a CSV
> parser that handles the corner cases, any date formats, and even
> direct spreadsheet parsing) and it's a lot simpler and quicker to
> write scripts than C code.
>
> Then an example Perl script or two for importing specific formats
> (using http://search.cpan.org/~hmbrand/DBD-CSV-0.29/lib/DBD/CSV.pm
> for CSV, perhaps, to give a DBD example importer - and/or using
> http://search.cpan.org/~hmbrand/Spreadsheet-Read-0.40/Read.pm for
> a direct spreadsheet->pcp archive export example).
>
> The PCP::PMDA module would be a good starting point - it wraps up
> the libpcp_pmda API, in much the same way as we'd want to capture
> the pmimport plugin API. Some pmimport code refactoring would be
> needed to make it more amenable to use from scripts though (perhaps
> the pmimport binary should become a libpcp_import with a Perl
> front-end driver script, which can use either shared lib or script
> plugins, for example) ... it's a fair bit of work, but I think the
> end result would be a lot better than just a CSV parser.
>
> Also, if the libpcp_import refactoring is done specifically for
> scripts, it should be relatively straightforward to do Python
> bindings, etc.
>
> > > ...
> > > I have attached the file which is actually OpenOffice calc
> > > export to csv
> > > of such a real-world data which comes from industrial device.
> > > Most of the variables there are temperatures and pressures.
>
> This attachment got lost in the forwarding (guess its not needed
> really, you gave the gist of it) - be nice to not have to export
> to CSV and then export again to PCP (possibly with hand-tweaking
> in-between), but rather go directly with the Spreadsheet::Read &
> a new PCP::LogImport module!
>
> cheers.
>
--
PM