
Re: [pcp] fibre adapter counters in pcp?

To: Robert verkerk <robert.verkerk@xxxxxxx>
Subject: Re: [pcp] fibre adapter counters in pcp?
From: Mark Goodwin <mgoodwin@xxxxxxxxxx>
Date: Wed, 18 Apr 2012 23:31:45 +1000
Cc: Nathan Scott <nathans@xxxxxxxxxx>, kenj@xxxxxxxxxxxxxxxx, Services <ds-tg@xxxxxxx>, pcp@xxxxxxxxxxx
In-reply-to: <4F8EB2BA.8040907@xxxxxxx>
References: <4F8D20CD.8020404@xxxxxxx> <4F8E094C.5090208@xxxxxxxxxx> <4F8E73E5.4070203@xxxxxxx> <CAAp5ZgNavBa3ZyOuNnAq+sn-yYDya9n1ysxurXaY0tC+6fNbXg@xxxxxxxxxxxxxx> <4F8EB2BA.8040907@xxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329 Thunderbird/11.0.1
On 04/18/2012 10:25 PM, Robert verkerk wrote:
-r--r--r-- 1 root root 4096 Apr 11 18:03 dumped_frames
-r--r--r-- 1 root root 4096 Apr 11 18:03 error_frames
-r--r--r-- 1 root root 4096 Apr 11 18:03 fcp_control_requests
-r--r--r-- 1 root root 4096 Apr 11 18:03 fcp_input_megabytes
-r--r--r-- 1 root root 4096 Apr 11 18:03 fcp_input_requests
-r--r--r-- 1 root root 4096 Apr 11 18:03 fcp_output_megabytes
-r--r--r-- 1 root root 4096 Apr 11 18:03 fcp_output_requests
-r--r--r-- 1 root root 4096 Apr 11 18:03 invalid_crc_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 invalid_tx_word_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 link_failure_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 lip_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 loss_of_signal_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 loss_of_sync_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 nos_count
-r--r--r-- 1 root root 4096 Apr 11 18:03 prim_seq_protocol_err_count
--w------- 1 root root 4096 Apr 11 18:03 reset_statistics
-r--r--r-- 1 root root 4096 Apr 11 18:03 rx_frames
-r--r--r-- 1 root root 4096 Apr 11 18:03 rx_words
-r--r--r-- 1 root root 4096 Apr 11 18:03 seconds_since_last_reset
-r--r--r-- 1 root root 4096 Apr 11 18:03 tx_frames
-r--r--r-- 1 root root 4096 Apr 11 18:03 tx_words

Ok, those stats are for a QLogic HBA with fairly recent firmware.
We can export {rx,tx}_{frames,words} and some of the error stats.
We could also use seconds_since_last_reset with some of the error
stats (error_frames and invalid_*) to come up with a meaningful Bit
Error Rate (BER) statistic - going above some threshold BER would
indicate FC transport issues (good for the pmie tool to monitor).
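
For illustration, a rough Python sketch of the rate calculation. This is
only a sketch under assumptions, not a proposed PMDA implementation: the
helper names are made up, the choice of which error counters to sum is
arbitrary, and it assumes each statistics file holds a single integer
(decimal or 0x-prefixed hex). It yields errors per second since the last
reset; a true BER would also need the number of bits transferred (e.g.
derived from rx_words), and that conversion is left open here.

```python
from pathlib import Path

# Error counters taken from the listing above; which ones belong in the
# sum is an assumption made for this sketch.
ERROR_COUNTERS = ("error_frames", "invalid_crc_count", "invalid_tx_word_count")


def parse_counter(text):
    """Parse one statistics file; accepts decimal or 0x-prefixed hex."""
    return int(text.strip(), 0)


def read_stats(stats_dir, names):
    """Read the named counter files from an HBA statistics directory."""
    return {n: parse_counter((Path(stats_dir) / n).read_text()) for n in names}


def errors_per_second(stats):
    """Sum the error counters and divide by seconds_since_last_reset."""
    seconds = stats["seconds_since_last_reset"]
    if seconds <= 0:
        # Counters were just reset (or the value is unusable): no rate yet.
        return 0.0
    return sum(stats[n] for n in ERROR_COUNTERS) / seconds
```

Usage would be something like
errors_per_second(read_stats(stats_dir, ERROR_COUNTERS + ("seconds_since_last_reset",)))
where stats_dir is wherever the listing above was taken from. A pmie rule
could then compare the result against a threshold.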

I have internal access to a system with a similar Qlogic HBA.
We have a system with an Emulex/lpfc HBA too. The Emulex stats
are similar, but the names are not all the same, and some counters
may have different units or dimensions. So we'll need to handle
each vendor separately and generalize the exported metric names.
Other vendors such as LSI/MegaRAID etc. will have to remain
unsupported initially, until we can get access to h/w.

Nathan (or was it Ken?) - last time we were looking at supporting FC
stats, we were undecided whether to do so by extending the Linux PMDA
or by developing a new PMDA. I'm for extending the Linux PMDA, since
these stats are exported by the Linux kernel (SCSI driver modules).
Thoughts?

Regards
-- Mark
