
Re: [pcp] PCP Network Latency PMDA

To: David Smith <dsmith@xxxxxxxxxx>, "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: [pcp] PCP Network Latency PMDA
From: William Cohen <wcohen@xxxxxxxxxx>
Date: Wed, 13 Aug 2014 14:37:23 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <53EB9262.104@xxxxxxxxxx>
References: <53A34A47.3060008@xxxxxxxxxx> <y0mk38caa4t.fsf@xxxxxxxx> <53A353C8.8030704@xxxxxxxxxx> <53A35C00.1070703@xxxxxxxxxx> <53EA6650.6040500@xxxxxxxxxx> <53EB7A09.7070503@xxxxxxxxxx> <53EB9262.104@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
On 08/13/2014 12:29 PM, William Cohen wrote:
> On 08/13/2014 10:45 AM, David Smith wrote:
>> On 08/12/2014 02:09 PM, William Cohen wrote:
>>> Hi All,
>>>
>>> I have been experimenting with the systemtap mmv support David Smith
>>> has been developing in the dsmith/mmv branch of systemtap.  The
>>> attached script is a work-in-progress to measure the time packets
>>> take from being placed in the queue to when they are actually
>>> transmitted.  The list of network devices needs to be passed into
>>> the systemtap script so that it can set up instances for each of
>>> the network devices.  There is probably a better way of getting the
>>> device names to the script, but this keeps things simple.
>>>
>>> The script creates two metrics for each device: the number of packets
>>> and the sum of the latency.  The systemtap script provides these basic
>>> metrics from the start so that the pmda can easily compute various
>>> rates on different time scales using the following formula for a
>>> device instance at two times (t1 and t2):
>>>
>>> current_latency/packet = (latency[t2]-latency[t1])/(count[t2]-count[t1])
>>>
>>>
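The formula above can be illustrated with a minimal sketch (made-up sample values for a single device; the variable names are mine, not from the script):

```python
# Hypothetical (count, latency_ns) samples for one device instance,
# taken at two times t1 and t2 from the cumulative counters.
count_t1, latency_t1 = 100, 250_000
count_t2, latency_t2 = 600, 1_650_000

# Average latency per packet over the interval [t1, t2]:
# (latency delta) / (packet-count delta)
latency_per_packet = (latency_t2 - latency_t1) / (count_t2 - count_t1)
print(latency_per_packet)  # 1_400_000 ns / 500 packets = 2800.0 ns
```

Because both metrics are cumulative counters, any consumer can pick its own sampling interval and still get a correct average for that window.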
>>> Rather than looking up the value identifiers with mmv_lookup_value()
>>> each time, the value identifiers are cached in a couple of associative
>>> arrays.  However, I don't know whether the lookup overhead is actually
>>> significant.
>>>
>>> Next step will be to write the mmv pmda to read this information and
>>> make it available.
>>>
>>> Any comments about the current code would be appreciated.
>>
>> Sigh.
>>
>> I've looked at your script. There is nothing really wrong with it in
>> itself; I'm now just wondering about the whole approach. The whole mmv
>> interface looks clunky when expressed this way in systemtap. Once we get
>> past the setup stuff in the begin probe, the update logic in do_update()
>> is just sad. (Once again, this isn't really a critique of your script,
>> just of the interface itself.) The way your script is forced to do
>> updates doesn't really work - the only way this stuff will work as
>> designed is if you actually keep the values in the metric values, not
>> copy stuff from systemtap variables into mmv values. Otherwise we're
>> compounding the problem of inconsistent values across metric values.
> 
> Hi David,
> 
> I had the same feeling that there is more setup needed to get the mmv 
> mapping working in systemtap than desired.  The copying of the data from 
> systemtap associative arrays into the mmv was done to allow use of the 
> "<<<" operator.  Writing directly into the metrics to eliminate the 
> periodic timer update seems like a cleaner approach.  The attached 
> version does that now.  However, mmv_lookup_value() fails if more than 
> one network device is passed in as an argument.
> 
> One major difference between userspace mmv and the systemtap mmv is that much 
> of the setup in userspace mmv is done by static creation of structures, 
> which is a pretty compact format.  In systemtap there need to be multiple 
> calls to create similar data structures, and these get pretty verbose.
> 
> Another issue I had when creating the script was the relatively static 
> creation of the mmv.  Systemtap scripts often don't know in advance what 
> devices will be used when the script runs; the script just makes a note 
> in an associative array of each device observed during the run.  That 
> enumeration of possible values at startup is something that systemtap 
> scripts usually don't do.
> 
>>
>> With enough work, we could add some systemtap translator support and
>> make this less clunky. However, I'm not sure that still solves the real
>> problem here. This real script (as opposed to the fake script I've been
>> using to develop) has really pointed out the problems here.
>>
>> Perhaps I'm being too pessimistic here.
>>
>> If I was looking at what you are really trying to do here, here's a
>> couple of ideas:
>>
>> - Use systemtap's procfs interface and write/modify a PCP PMDA to
>> collect data from it. The advantage here is that PCP can ask to read the
>> data at whatever interval it wants (1 second, 5 seconds, 1 hour, etc.)
>> and the data wouldn't be computed until then (and the data would always
>> be consistent).
> 
> The procfs approach was what I was thinking of previously.  The updates 
> to the mmv fields in the systemtap script are non-atomic, so there is 
> the potential for readers to see a mix of new and old values, which 
> wouldn't exist in the procfs approach.  I guess one of the problems in 
> the past with grokking proc information was the rather free-form text.  
> Maybe the systemtap procfs could output data in JSON, XML, or some 
> other easily machine-digestible format.
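A rough sketch of that procfs-plus-JSON idea (untested; assumes the xmit_count/xmit_latency globals from the attached script, and emits one JSON object per line rather than one document, to keep the probe simple):

```systemtap
global xmit_count, xmit_latency   # filled in by the transmit probes

# Readers open /proc/systemtap/<module>/net_xmit; the data is only
# formatted when the file is actually read, at whatever interval PCP wants.
probe procfs("net_xmit").read {
  out = ""
  foreach (dev in xmit_count)
    out .= sprintf("{\"device\": \"%s\", \"count\": %d, \"latency\": %d}\n",
                   dev, xmit_count[dev], xmit_latency[dev])
  $value = out
}
```

Since the whole snapshot is built inside one read handler, a PMDA reading the file would always see a consistent count/latency pair per device.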
> 
>>
>> - A full-blown event exporter (JSON?) from systemtap. I believe Nathan
>> has outlined this in the past.
>>
> 
> -Will

The script doesn't seem to work with multiple instances, but it does work for a 
single network device.  In one window I started the systemtap script:

$ stap -k -v -m mmmm ../../net_xmit_mmv.stp em1
Pass 1: parsed user script and 171 library script(s) using 
412408virt/228244res/6364shr/225120data kb, in 920usr/50sys/984real ms.
Pass 2: analyzed script: 7 probe(s), 12 function(s), 2 embed(s), 56 global(s) 
using 426424virt/243628res/7528shr/239136data kb, in 210usr/90sys/297real ms.
Pass 3: translated to C into "/tmp/stapbxXaao/mmmm_src.c" using 
426424virt/243628res/7528shr/239136data kb, in 10usr/0sys/2real ms.
Pass 4: compiled C into "mmmm.ko" in 1880usr/240sys/1937real ms.
Pass 5: starting run.
em1 0
argv[1] = em1, inst[em1] = 0
instance_count[em1] = 0
instance_latency[em1] = 1

Then in another window I ran mmvdump; towards the bottom of the output you can 
see the xmit_count and xmit_latency values:

$ ./mmvdump /proc/systemtap/mmmm/mmv 
MMV file   = /proc/systemtap/mmmm/mmv
Version    = 1
Generated  = 854907
TOC count  = 5
Cluster    = 43
Process    = 0
Flags      = 0x0

TOC[0]: offset 40, indoms offset 1704 (1 entries)
  [1/1704] 1 instances, starting at offset 1736
       shorttext=xmit device
       helptext=list of network transmit devices

TOC[1]: offset 56, instances offset 1736 (1 entries)
  [1/1736] instance = [0 or "em1"]

TOC[2]: toc offset 72, metrics offset 1816 (2 entries)
  [1/1816] xmit_count
       type=64-bit int (0x2), sem=counter (0x1), pad=0x0
       units=count
       indom=1
       shorttext=xmit count metric
       helptext=number of packets for xmit device
  [2/1920] xmit_latency
       type=64-bit int (0x2), sem=counter (0x1), pad=0x0
       units=nanosec
       indom=1
       shorttext=xmit latency metric
       helptext=sum of latency for xmit device

TOC[3]: offset 88, values offset 2104 (2 entries)
  [1/2104] xmit_count[0 or "em1"] = 1
  [2/2136] xmit_latency[0 or "em1"] = 4577

TOC[4]: offset 104, string offset 168 (6 entries)
  [1/168] xmit device
  [2/424] list of network transmit devices
  [3/680] xmit count metric
  [4/936] number of packets for xmit device
  [5/1192] xmit latency metric
  [6/1448] sum of latency for xmit device
[wcohen@santana mmv]$ ./mmvdump /proc/systemtap/mmmm/mmv 
MMV file   = /proc/systemtap/mmmm/mmv
Version    = 1
Generated  = 854907
TOC count  = 5
Cluster    = 43
Process    = 0
Flags      = 0x0

TOC[0]: offset 40, indoms offset 1704 (1 entries)
  [1/1704] 1 instances, starting at offset 1736
       shorttext=xmit device
       helptext=list of network transmit devices

TOC[1]: offset 56, instances offset 1736 (1 entries)
  [1/1736] instance = [0 or "em1"]

TOC[2]: toc offset 72, metrics offset 1816 (2 entries)
  [1/1816] xmit_count
       type=64-bit int (0x2), sem=counter (0x1), pad=0x0
       units=count
       indom=1
       shorttext=xmit count metric
       helptext=number of packets for xmit device
  [2/1920] xmit_latency
       type=64-bit int (0x2), sem=counter (0x1), pad=0x0
       units=nanosec
       indom=1
       shorttext=xmit latency metric
       helptext=sum of latency for xmit device

TOC[3]: offset 88, values offset 2104 (2 entries)
  [1/2104] xmit_count[0 or "em1"] = 532
  [2/2136] xmit_latency[0 or "em1"] = 1479791

TOC[4]: offset 104, string offset 168 (6 entries)
  [1/168] xmit device
  [2/424] list of network transmit devices
  [3/680] xmit count metric
  [4/936] number of packets for xmit device
  [5/1192] xmit latency metric
  [6/1448] sum of latency for xmit device
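As a quick sanity check, the latency-per-packet formula from earlier in the thread can be applied to the em1 values from these two dumps:

```python
# Cumulative counters for instance "em1" from the two mmvdump runs above.
count_t1, latency_t1 = 1, 4577          # first dump
count_t2, latency_t2 = 532, 1479791     # second dump

# Average latency per packet between the two dumps.
per_packet_ns = (latency_t2 - latency_t1) / (count_t2 - count_t1)
print(round(per_packet_ns))  # ~2778 ns per packet
```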


-Will
