pcp
[Top] [All Lists]

Re: [pcp] pmrep: add xlsx support

To: pcp developers <pcp@xxxxxxxxxxx>
Subject: Re: [pcp] pmrep: add xlsx support
From: Marko Myllynen <myllynen@xxxxxxxxxx>
Date: Fri, 18 Dec 2015 12:42:00 +0200
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <550451752.42654491.1450307652676.JavaMail.zimbra@xxxxxxxxxx>
Organization: Red Hat
References: <5670184B.7000603@xxxxxxxxxx> <231833641.41636789.1450216493040.JavaMail.zimbra@xxxxxxxxxx> <56712D21.4090905@xxxxxxxxxx> <550451752.42654491.1450307652676.JavaMail.zimbra@xxxxxxxxxx>
Reply-to: myllynen@xxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
Hi,

On 2015-12-17 01:14, Nathan Scott wrote:
> ----- Original Message -----
>> [...]
>> output to match PCP not sar2xls convention). A warning is printed but
>> the generated archive looks to be just fine:
>>
>> Use of uninitialized value in subroutine entry at sheet2pcp line 571.
> 
> Hmm, that line seems to be the closing brace of a for loop, so not
> clear which variable that warning is referring to.  :(
> 
>> Since the actively developed Python XLSX modules might occasionally
>> change their output (e.g. a white-space or a bug fix change) then a
>> diff(1) based comparison might cause false alarms too often, this
>> sheet2pcp(1) test would actually sound like a good idea, we'd be testing
>> sheet2pcp(1) more as well in the process.
> 
> Yep - win, win.

below is updated xlsx support patch against the current git code; this
doesn't yet change anything regarding the import situation but with
this update we now have full pmrep-csv/sheet2pcp/pmrep-xlsx/sheet2pcp
roundtrip compatability.

The mapping file with sheet2pcp I used with the xlsx was something like:

<?xml version="1.0" encoding="UTF-8"?>
<sheet  heading="3" timezone="...." datefmt="YMD">
<metric ...>...</metric>
...
<datetime></datetime>
<data>...</data>
...
</sheet>

(With CSV it's the same except for heading="1".)

---
 src/pmrep/TODO     |   1 -
 src/pmrep/pmrep.1  |  10 +++++-
 src/pmrep/pmrep.py | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 109 insertions(+), 5 deletions(-)

diff --git a/src/pmrep/TODO b/src/pmrep/TODO
index e803386..519c4f6 100644
--- a/src/pmrep/TODO
+++ b/src/pmrep/TODO
@@ -14,7 +14,6 @@
 -  verify CSV format
 -  add JSON output support
 -  add XML output support (like sadf)
--  add XLS output support (like sar2xls)
 -  look for config in ./, ~/.pcp, ~/, /etc/pcp or so
 -  possibly add more command line switch sanity checking
 -  modularize code to allow creating custom output plugins
diff --git a/src/pmrep/pmrep.1 b/src/pmrep/pmrep.1
index 2e02602..f9c7b6f 100644
--- a/src/pmrep/pmrep.1
+++ b/src/pmrep/pmrep.1
@@ -287,7 +287,7 @@ string for formatting the timestamp.
 The format will be used with
 Python's
 .B datetime.strftime
-method which is mostly the same as that described in
+method (except with xlsx) which is mostly the same as that described in
 .BR strftime (3).
 An empty
 .I format
@@ -296,6 +296,8 @@ The default with stdout is
 .BR %H:%M:%S .
 The default with csv is
 .BR "%Y-%m-%d %H:%M:%S" .
+The default with xlsx is
+.BR "yyyy-mm-dd hh:mm:ss" .
 .TP
 .B \-F
 Specify the output file
@@ -381,6 +383,11 @@ Print metrics in CSV format (subject to formatting 
options).
 .I stdout
 Print metrics to stdout (format subject to formatting options).
 .TP 2
+.I xlsx
+Record metrics into an XLSX file for office/spreadsheet suites.
+Uses the \fBxlsxwriter\fR module. Requires
+.BR \-F .
+.TP 2
 .I zabbix
 Send metrics to a Zabbix server.
 See
@@ -704,6 +711,7 @@ as described in
 .BR pmstat (1),
 .BR pmval (1),
 .BR sar (1),
+.BR sheet2pcp (1),
 .BR __pmSpecLocalPMDA (3),
 .BR pmLoadDerivedConfig (3),
 .BR pmParseUnitsStr (3),
diff --git a/src/pmrep/pmrep.py b/src/pmrep/pmrep.py
index d262663..710e7df 100644
--- a/src/pmrep/pmrep.py
+++ b/src/pmrep/pmrep.py
@@ -52,6 +52,10 @@ try:
     import json
 except:
     import simplejson as json
+try:
+    import xlsxwriter
+except:
+    pass
 import socket
 import struct
 import time
@@ -76,6 +80,7 @@ CSVSEP  = ","
 CSVTIME = "%Y-%m-%d %H:%M:%S"
 OUTSEP  = "  "
 OUTTIME = "%H:%M:%S"
+XLSTIME = "yyyy-mm-dd hh:mm:ss"
 ZBXPORT = 10051
 ZBXPRFX = "pcp."
 NO_VAL  = "N/A"
@@ -86,6 +91,7 @@ VERSION = 1
 OUTPUT_ARCHIVE = "archive"
 OUTPUT_CSV     = "csv"
 OUTPUT_STDOUT  = "stdout"
+OUTPUT_XLS     = "xlsx"
 OUTPUT_ZABBIX  = "zabbix"
 
 class ZabbixMetric(object):
@@ -190,6 +196,7 @@ class PMReporter(object):
         self.outfile = None
         self.writer = None
         self.pmi = None
+        self.sheet = None
         self.derived = None
         self.header = 1
         self.unitinfo = 1
@@ -319,7 +326,7 @@ class PMReporter(object):
         opts.pmSetLongOptionSpecLocal()    # -K/--spec-local
         opts.pmSetLongOption("config", 1, "c", "FILE", "config file path")
         opts.pmSetLongOption("check", 0, "C", "", "check config and metrics 
and exit")
-        opts.pmSetLongOption("output", 1, "o", "OUTPUT", "output target: 
archive, csv, stdout (default), or zabbix")
+        opts.pmSetLongOption("output", 1, "o", "OUTPUT", "output target: 
archive, csv, stdout (default), xlsx, or zabbix")
         opts.pmSetLongOption("output-file", 1, "F", "OUTFILE", "output file")
         opts.pmSetLongOption("derived", 1, "e", "FILE|DFNT", "derived metrics 
definitions")
         #opts.pmSetLongOptionGuiMode()     # -g/--guimode # RHBZ#1289910
@@ -374,6 +381,8 @@ class PMReporter(object):
                 self.output = OUTPUT_CSV
             elif optarg == OUTPUT_STDOUT:
                 self.output = OUTPUT_STDOUT
+            elif optarg == OUTPUT_XLS:
+                self.output = OUTPUT_XLS
             elif optarg == OUTPUT_ZABBIX:
                 self.output = OUTPUT_ZABBIX
             else:
@@ -580,7 +589,15 @@ class PMReporter(object):
             self.source = "@" # PCPIntro(1), RHBZ#1289911
 
         if self.output == OUTPUT_ARCHIVE and not self.outfile:
-            sys.stderr.write("Archive must be defined with archive output.\n")
+            sys.stderr.write("Outfile must be defined with archive output.\n")
+            sys.exit(1)
+
+        if self.output == OUTPUT_XLS and "xlsxwriter" not in sys.modules:
+            sys.stderr.write("Failed to load xlsxwriter module.\n")
+            sys.exit(1)
+
+        if self.output == OUTPUT_XLS and not self.outfile:
+            sys.stderr.write("Output file must be defined with archive 
output.\n")
             sys.exit(1)
 
         if self.output == OUTPUT_ZABBIX and (not self.zabbix_server or \
@@ -796,6 +813,8 @@ class PMReporter(object):
         if self.timefmt == None:
             if self.output == OUTPUT_CSV:
                 self.timefmt = CSVTIME
+            elif self.output == OUTPUT_XLS:
+                self.timefmt = XLSTIME
             else:
                 self.timefmt = OUTTIME
         if not self.timefmt:
@@ -962,6 +981,7 @@ class PMReporter(object):
 
     def report(self, tstamp, values):
         """ Report the metric values """
+        dt = None
         if tstamp != None:
             ts = self.context.pmLocaltime(tstamp.tv_sec)
             us = int(tstamp.tv_usec)
@@ -975,6 +995,8 @@ class PMReporter(object):
             self.write_csv(tstamp, values)
         if self.output == OUTPUT_STDOUT:
             self.write_stdout(tstamp, values)
+        if self.output == OUTPUT_XLS:
+            self.write_xlsx(dt, values)
         if self.output == OUTPUT_ZABBIX:
             self.write_zabbix(tstamp, values)
 
@@ -982,6 +1004,7 @@ class PMReporter(object):
         """ Prepare generic stdout writer """
         if not self.writer:
             if self.output == OUTPUT_ARCHIVE or \
+               self.output == OUTPUT_XLS or \
                self.output == OUTPUT_ZABBIX or \
                self.outfile == None:
                 self.writer = sys.stdout
@@ -1083,7 +1106,7 @@ class PMReporter(object):
 
     def write_header(self):
         """ Write metrics header """
-        if self.output == OUTPUT_ARCHIVE:
+        if self.output == OUTPUT_ARCHIVE or self.output == OUTPUT_XLS:
             self.writer.write("Recording %d metrics to %s" % (len(self.pmids), 
self.outfile))
             if self.runtime != -1:
                 self.writer.write(":\n%s samples(s) with %.1f sec interval ~ 
%d sec duration.\n" % (self.samples, float(self.interval), self.runtime))
@@ -1304,6 +1327,77 @@ class PMReporter(object):
         nfmt = nfmt[:-l]
         self.writer.write(nfmt.format(*tuple(line)) + "\n")
 
+    def write_xlsx(self, timestamp, values):
+        """ Write results in XLSX format """
+        if timestamp == None and values == None:
+            # Complete and close
+            self.sheet.close()
+            self.sheet = None
+            return
+
+        # Current row
+        try:
+            self.row += 1
+        except:
+            self.row = 0
+
+        # Create the file and write the header
+        if not self.sheet:
+            col = 0
+            self.sheet = xlsxwriter.Workbook(self.outfile, {'constant_memory': 
True, 'in_memory': False, 'default_date_format': self.timefmt})
+            self.ws = self.sheet.add_worksheet("pmrep data")
+            fmt = self.sheet.add_format({'bold': True})
+            fmt.set_align('right')
+            self.ws.set_column(col, col, 20)
+            self.ws.write_string(self.row, col, "Time", fmt)
+            # Labels
+            for i, metric in enumerate(self.metrics):
+                ins = 1 if self.insts[i][0][0] == PM_IN_NULL else 
len(self.insts[i][0])
+                for j in range(ins):
+                    col += 1
+                    token = metric
+                    if self.insts[i][1][j]:
+                        token += "[" + str(self.insts[i][1][j]) + "]"
+                    self.ws.write_string(self.row, col, token, fmt)
+                    l = len(token) if self.metrics[metric][4] < len(token) 
else self.metrics[metric][4]
+                    self.ws.set_column(col, col, l + 5)
+            self.row += 1
+            # Units
+            col = 0
+            for i, metric in enumerate(self.metrics):
+                ins = 1 if self.insts[i][0][0] == PM_IN_NULL else 
len(self.insts[i][0])
+                for j in range(ins):
+                    col += 1
+                    unit = self.metrics[metric][2][0]
+                    self.ws.write_string(self.row, col, unit, fmt)
+            self.row += 1
+            # Add an empty line for readability
+            col = 0
+            fmt = self.sheet.add_format()
+            fmt.set_top(2)
+            self.ws.write_blank(self.row, col, None, fmt)
+            for i, metric in enumerate(self.metrics):
+                ins = 1 if self.insts[i][0][0] == PM_IN_NULL else 
len(self.insts[i][0])
+                for j in range(ins):
+                    col += 1
+                    self.ws.write_blank(self.row, col, None, fmt)
+            self.row += 1
+
+        # Add current values
+        col = 0
+        self.ws.write_datetime(self.row, col, timestamp)
+        for i, metric in enumerate(self.metrics):
+            ins = 1 if self.insts[i][0][0] == PM_IN_NULL else 
len(self.insts[i][0])
+            for j in range(ins):
+                col += 1
+                if str(list(values[i])[j][2]) != NO_VAL:
+                    if self.descs[i].contents.type == PM_TYPE_STRING:
+                        self.ws.write_string(self.row, col, 
str(values[i][j][2]))
+                    else:
+                        self.ws.write_number(self.row, col, 
list(values[i])[j][2])
+                else:
+                    self.ws.write_blank(self.row, col, None)
+
     def write_zabbix(self, timestamp, values):
         """ Write (send) metrics to a Zabbix server """
         if timestamp == None and values == None:
@@ -1345,6 +1439,9 @@ class PMReporter(object):
         if self.pmi:
             self.pmi.pmiEnd()
             self.pmi = None
+        if self.sheet:
+            self.sheet.close()
+            self.sheet = None
 
 if __name__ == '__main__':
     try:

Thanks,

-- 
Marko Myllynen

<Prev in Thread] Current Thread [Next in Thread>