Re: Perf data with recent tg3 patches

To: "David S.Miller" <davem@xxxxxxxxxxxxx>
Subject: Re: Perf data with recent tg3 patches
From: Arthur Kepner <akepner@xxxxxxx>
Date: Fri, 20 May 2005 14:52:35 -0700 (PDT)
Cc: mchan@xxxxxxxxxxxx, netdev@xxxxxxxxxxx
In-reply-to: <20050513.222007.78719997.davem@xxxxxxxxxxxxx>
References: <Pine.LNX.4.61.0505131648140.14917@xxxxxxxxxx> <20050513.175013.00786860.davem@xxxxxxxxxxxxx> <1116031159.6214.8.camel@rh4> <20050513.222007.78719997.davem@xxxxxxxxxxxxx>
Sender: netdev-bounce@xxxxxxxxxxx
Here are a couple more data points for the recent 
interrupt coalescing patches for the tg3 driver. 
Please see the attached graphs, which show how CPU 
utilization and the number of received packets per 
interrupt vary with link utilization. The workload 
is receive-as-fast-as-you-can over TCP, with a single 
sending process and a single receiving process.

The data in the graphs is labelled as follows:
----------------------------------------------
2.6.12-rc3: unmodified 2.6.12-rc3 

dflt coal : 2.6.12-rc3 + [1] + [2] + [3] + [4]
            using the default intr coalescence 
            values (rx-frames = rx-frames-irq = 5)

4x coal   : 2.6.12-rc3 + [1] + [2] + [3] + [4] + [5]
            using 4 times the default values 
            for rx-frames and rx-frames-irq

[1] http://marc.theaimsgroup.com/?l=linux-netdev&m=111446723510962&w=2
    (This is one of a series of 3 patches - the others can't be
     found in the archive. But they're all in 2.6.12-rc4.)
[2] http://marc.theaimsgroup.com/?l=linux-netdev&m=111567944730302&w=2
    ("tagged status" patch)
[3] http://marc.theaimsgroup.com/?l=linux-netdev&m=111586526522981&w=2
    ("hw coalescing infrastructure" patch)
[4] http://marc.theaimsgroup.com/?l=linux-netdev&m=111604846510646&w=2
    ("tagged status update")
[5] the patches below which allow "ethtool -[cC]" to work


Patch [4] almost entirely eliminates updates to the tag in 
the status block between the point where it is saved in 
tg3_poll() and the point where it is written back to the 
NIC in tg3_restart_ints(). Such updates still happen, but 
only a few times in a thousand, so they don't 
significantly affect the interrupt rate. 

I had to make a couple of changes to allow the coalescence 
parameters to be set and retrieved with ethtool. Those 
patches are at the end. 

When the default coalescence parameters are used (rx-frames, 
rx-frames-irq both set to 5) the maximum number of received 
packets per interrupt is ~4.2. Setting rx-frames and 
rx-frames-irq to 20 caused the maximum number of received
packets per interrupt to rise to ~19.6. Maximum CPU 
utilization went down from ~52% to ~35%. Very nice.


Fix typo in ethtool_set_coalesce()

Signed-off-by: Arthur Kepner <akepner@xxxxxxx>

--- linux.save/net/core/ethtool.c       2005-05-20 12:40:04.426385446 -0700
+++ linux/net/core/ethtool.c    2005-05-20 12:49:34.087515306 -0700
@@ -347,7 +347,7 @@ static int ethtool_set_coalesce(struct n
 {
        struct ethtool_coalesce coalesce;
 
-       if (!dev->ethtool_ops->get_coalesce)
+       if (!dev->ethtool_ops->set_coalesce)
                return -EOPNOTSUPP;
 
        if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
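
To illustrate why the one-word change above matters, here is a small 
user-space sketch of the two guards. The structures and function 
names are simplified stand-ins invented for this example, not the 
kernel's definitions. It models a driver that provides get_coalesce 
but not set_coalesce: the old guard lets the request through, so the 
function would presumably go on to call a NULL set_coalesce pointer, 
while the new guard rejects it with -EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct coalesce {
        unsigned int rx_max_frames;
};

struct ops {
        int (*get_coalesce)(struct coalesce *c);
        int (*set_coalesce)(struct coalesce *c);
};

/* Pre-patch guard: tests the wrong callback before a "set" request. */
int set_guard_before(const struct ops *ops)
{
        assert(ops != NULL);
        return ops->get_coalesce ? 0 : -EOPNOTSUPP;
}

/* Post-patch guard: tests the callback that will actually be invoked. */
int set_guard_after(const struct ops *ops)
{
        assert(ops != NULL);
        return ops->set_coalesce ? 0 : -EOPNOTSUPP;
}

/* A "get" callback for a driver that cannot "set". */
int demo_get(struct coalesce *c)
{
        c->rx_max_frames = 5;
        return 0;
}
```

With a struct ops that has only get_coalesce filled in, 
set_guard_before() returns 0 and set_guard_after() returns 
-EOPNOTSUPP.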


Changes to allow setting/getting coalescence parameters 
with tg3.

Signed-off-by: Arthur Kepner <akepner@xxxxxxx>

--- linux.save/drivers/net/tg3.c        2005-05-20 13:02:41.610865448 -0700
+++ linux/drivers/net/tg3.c     2005-05-20 13:11:36.467011288 -0700
@@ -5094,8 +5094,11 @@ static void tg3_set_bdinfo(struct tg3 *t
 }
 
 static void __tg3_set_rx_mode(struct net_device *);
-static void tg3_set_coalesce(struct tg3 *tp, struct ethtool_coalesce *ec)
+static int tg3_set_coalesce(struct net_device *dev, 
+                               struct ethtool_coalesce *ec)
 {
+       struct tg3 *tp = netdev_priv(dev);
+
        tw32(HOSTCC_RXCOL_TICKS, ec->rx_coalesce_usecs);
        tw32(HOSTCC_TXCOL_TICKS, ec->tx_coalesce_usecs);
        tw32(HOSTCC_RXMAX_FRAMES, ec->rx_max_coalesced_frames);
@@ -5114,6 +5117,9 @@ static void tg3_set_coalesce(struct tg3 
 
                tw32(HOSTCC_STAT_COAL_TICKS, val);
        }
+
+       memcpy(&tp->coal, ec, sizeof(tp->coal));
+       return 0;
 }
 
 /* tp->lock is held. */
@@ -5437,7 +5443,7 @@ static int tg3_reset_hw(struct tg3 *tp)
                udelay(10);
        }
 
-       tg3_set_coalesce(tp, &tp->coal);
+       tg3_set_coalesce(tp->dev, &tp->coal);
 
        /* set status block DMA address */
        tw32(HOSTCC_STATUS_BLK_HOST_ADDR + TG3_64BIT_REG_HIGH,
@@ -7302,6 +7308,8 @@ static int tg3_get_coalesce(struct net_d
        return 0;
 }
 
+static int tg3_set_coalesce(struct net_device *, struct ethtool_coalesce *);
+
 static struct ethtool_ops tg3_ethtool_ops = {
        .get_settings           = tg3_get_settings,
        .set_settings           = tg3_set_settings,
@@ -7335,6 +7343,7 @@ static struct ethtool_ops tg3_ethtool_op
        .get_stats_count        = tg3_get_stats_count,
        .get_ethtool_stats      = tg3_get_ethtool_stats,
        .get_coalesce           = tg3_get_coalesce,
+       .set_coalesce           = tg3_set_coalesce,
 };
 
 static void __devinit tg3_get_eeprom_size(struct tg3 *tp)
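
The reason for copying the new values into tp->coal is that 
tg3_reset_hw() reprograms the chip from tp->coal, so values set via 
ethtool would otherwise be lost at the next reset. The following 
user-space sketch models that behaviour. All types and names in it 
are hypothetical stand-ins, with a plain struct in place of the 
HOSTCC_* registers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct ethtool_coalesce. */
struct coal {
        unsigned int rx_max_frames;
        unsigned int rx_max_frames_irq;
};

/* Hypothetical stand-in for struct tg3. */
struct nic {
        struct coal hw;         /* models the HOSTCC_* registers */
        struct coal saved;      /* models tp->coal */
};

/* Program the "registers" and remember the request for later resets. */
int nic_set_coalesce(struct nic *n, const struct coal *c)
{
        assert(n != NULL && c != NULL);
        n->hw = *c;
        /*
         * Struct assignment rather than memcpy(): nic_reset() passes
         * &n->saved itself, and assignment is well defined when source
         * and destination are the same object.
         */
        n->saved = *c;
        return 0;
}

/* Models a hardware reset: registers are cleared, then restored. */
void nic_reset(struct nic *n)
{
        memset(&n->hw, 0, sizeof(n->hw));
        nic_set_coalesce(n, &n->saved);
}
```

After nic_set_coalesce() with both fields set to 20, a call to 
nic_reset() leaves the modelled registers at 20 rather than at their 
cleared value.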

--
Arthur

Attachment: cpu_vs_link.4.png
Description: CPU utilization

Attachment: packets_vs_link.4.png
Description: Received Pkts/Intr
