
Re: [Fwd: Re: possible bug x86 2.4.2 SMP in IP receive stack]

To: andrewm@xxxxxxxxxx, kuznet@xxxxxxxxxxxxx
Subject: Re: [Fwd: Re: possible bug x86 2.4.2 SMP in IP receive stack]
From: Bob Felderman <feldy@xxxxxxxx>
Date: Tue, 6 Mar 2001 10:54:22 -0800 (PST)
Cc: feldy@xxxxxxxx, netdev@xxxxxxxxxxx
Sender: owner-netdev@xxxxxxxxxxx
=> I am looking now. Probably it is some silly misprint in ip_fragment.c.
=> The problem with Bob's original report was that in the first lines
=> he reported an illegal kfree_skb with skb->list != NULL, called
=> from ip_rcv(). That can only be a bug in the driver, nothing more.
=> Actually, Bob, if you tell me that you found out why this happened,
=> my enthusiasm for reauditing ip_fragment.c will grow just fantastically. 8)

I have added spinlocks for the interrupt routine and the transmit side.
I don't see any more stability, and the oopses I sent yesterday were
generated with the spinlocked code.

If you don't see the skb->list error anymore, it is probably because of
the added spinlocks, but I'm certainly still seeing bad behavior.

I generate the problem with a udp test from netperf. I've attached my
script. Run it as

        udp_range <dest_host>

I also do this

echo "1048576" > /proc/sys/net/core/rmem_max
echo "1048576" > /proc/sys/net/core/wmem_max
echo "1048576" > /proc/sys/net/core/wmem_default
echo "1048576" > /proc/sys/net/core/rmem_default
echo "1048576" > /proc/sys/net/core/optmem_max
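
For reference, the five echoes above can be collapsed into one loop; a
sketch assuming the stock /proc/sys/net/core entry names (needs root):

```shell
#!/bin/sh
# Raise the core socket-buffer limits and defaults to 1 MB, as above.
for f in rmem_max wmem_max wmem_default rmem_default optmem_max; do
        echo "1048576" > /proc/sys/net/core/$f
done
```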

Our network is really fast. When the machine is stable I can sustain
a 1.5 Gigabit/sec udp stream. It might be possible to reproduce this
using 100Mbit ethernet, but it might require 1Gbit ethernet with jumbo
frames. I'm currently using a 9000-byte MTU. Our interrupt routine
delivers only a single ethernet packet to the higher levels for each
interrupt, so maybe that also stresses the IP fragmentation code.
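
As a rough measure of how hard the reassembly path gets hit, here is a
back-of-the-envelope fragment count; a sketch assuming a maximum-size
UDP datagram and a 20-byte IP header with no options:

```shell
#!/bin/sh
# Each non-final fragment carries (MTU - 20) bytes of IP payload,
# rounded down to a multiple of 8 (fragment offsets are in 8-byte units).
MTU=9000
UDP_PAYLOAD=65507                    # largest UDP payload over IPv4
PER_FRAG=`expr $MTU - 20`
PER_FRAG=`expr $PER_FRAG / 8 \* 8`   # 8976 bytes with a 9000-byte MTU
TOTAL=`expr $UDP_PAYLOAD + 8`        # plus the 8-byte UDP header
FRAGS=`expr $TOTAL + $PER_FRAG - 1`
FRAGS=`expr $FRAGS / $PER_FRAG`      # ceiling division
echo "$FRAGS fragments per datagram" # 8 with these numbers
```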

I don't see this problem on a linux-2.2 box and I don't see it when
I remove one of the processors from the receiver on my setup.

# udp_range
# generate a whole lot of numbers from netperf to see the effects
# of send size on thruput

# usage : udp_range hostname

if [ $# -ne 1 ]; then
        echo "try again, correctly -> udp_range hostname"
        exit 1
fi

# some params
REMHOST=$1
TIME=60         # seconds per data point; adjust to taste

# where is netperf
NETHOME=${NETHOME:-/usr/local/netperf}  # adjust to your netperf install

BUFSIZE="-s 2062144 -S 2062144"
#BUFSIZE="-s  2147484 -S 2147484"
#BUFSIZE="-s 1048576 -S 1048576"
#BUFSIZE="-s 524288 -S 524288"
#BUFSIZE="-s 262144 -S 262144"
#BUFSIZE="-s 131072 -S 131072"
#BUFSIZE="-s 65535 -S 65535"
#BUFSIZE="-s 49152 -S 49152"
#BUFSIZE="-s 49152 -S 131072"
#BUFSIZE="-S 65536"

# some stuff for the arithmetic
# we start at START, and then add ADD and divide by DIV each time through
# the loop. By changing these numbers we can halve each time, or decrease
# by a fixed amount.
# (These values are a guess; the posted script left them unset.)
START=65536
END=64
ADD=0
DIV=2
MESSAGE=$START
# Do we wish to measure CPU utilization?
# (Empty by default; set LOC_CPU=-c and/or REM_CPU=-C to enable.)
LOC_CPU=""
REM_CPU=""
# If we are measuring CPU utilization, then we can save beaucoup
# time by saving the results of the CPU calibration and passing
# them in during the real tests. So, we execute the new CPU "tests"
# of netperf and put the values into shell vars.
case $LOC_CPU in
\-c) LOC_RATE=`$NETHOME/netperf -t LOC_CPU`;;
*) LOC_RATE="";;
esac

case $REM_CPU in
\-C) REM_RATE=`$NETHOME/netperf -t REM_CPU`;;
*) REM_RATE="";;
esac

# after the first datapoint, we don't want more headers
# but we want one for the first one
NO_HDR=""

while [ $MESSAGE -ge $END ]; do
        $NETHOME/netperf $NO_HDR -p 9100 -l $TIME -H $REMHOST \
          $LOC_CPU $LOC_RATE $REM_CPU $REM_RATE \
          -t UDP_STREAM -- -m $MESSAGE $BUFSIZE
        NO_HDR="-P 0"
        MESSAGE=`expr $MESSAGE + $ADD`
        MESSAGE=`expr $MESSAGE / $DIV`
done
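
As a sanity check on the loop arithmetic, this standalone sketch prints
the message sizes the sweep visits, assuming it starts at 65536 bytes
and halves down to 64 (my guess; the post doesn't show the actual
START/END/ADD/DIV values):

```shell
#!/bin/sh
# Same recurrence as the loop above: add ADD, then divide by DIV.
MESSAGE=65536   # assumed START
END=64          # assumed END
ADD=0
DIV=2
while [ $MESSAGE -ge $END ]; do
        echo $MESSAGE
        MESSAGE=`expr $MESSAGE + $ADD`
        MESSAGE=`expr $MESSAGE / $DIV`
done
# prints 65536 32768 16384 ... 256 128 64, one size per line
```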
