David Fries wrote:
> 'net drop out' problem,
> There are two stages, reduced network and no network. For example
> when I do a `ping -s 15000 aerospace` ping from spacedout (troubled
> computer) to aerospace (another one), I'll get response times of
> either 4ms or 3000ms.
> When networking stops I don't get any packets received or interrupts,
> but I am showing RX overruns incrementing. When I ping from
> spacedout, spacedout shows an arp request going out, aerospace sees
> the arp request, but spacedout never sees the reply.
This is consistent with an interrupt controller failure. However, if
this were the case you should be seeing "NETDEV WATCHDOG: eth0: transmit
timed out" messages and "interrupt posted but not delivered" messages.
Are you sure you're not?
Another test: when spacedout is in this state, go to its console and
ping another machine. Watch /proc/interrupts to see if you're getting
tx interrupts.
If you are getting tx interrupts then perhaps the NIC is getting its
registers unprogrammed, or perhaps the multicast filter has gone silly.
Try `ifconfig eth0 promisc'.
Or try a new PCI slot.
Or a new power supply.
Or a new computer.
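
For the /proc/interrupts test above, here is a rough helper for taking
before/after readings. It is only a sketch: `irq_count' is a made-up name,
and it assumes the driver shows up as "eth0" in the last column. Note that
/proc/interrupts counts per IRQ line and does not separate tx from rx, so
the idea is simply to see whether the count moves at all while you ping
from the console.

```shell
# Hypothetical helper: print the total interrupt count for one device.
# $1: device name as shown in /proc/interrupts (e.g. eth0)
# $2: file to read (defaults to /proc/interrupts)
irq_count() {
    awk -v dev="$1" '
        # Plain substring match, so "eth0" would also match "eth00".
        index($0, dev) {
            total = 0
            # Sum the per-CPU columns that follow the "NN:" field.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                total += $i
            print total
            exit
        }
    ' "${2:-/proc/interrupts}"
}

# Example use on the troubled machine (not run here):
#   before=$(irq_count eth0)
#   ping -c 5 aerospace
#   after=$(irq_count eth0)
#   echo "eth0 interrupts during ping: $((after - before))"
```

If the difference stays at zero while the pings go out, interrupts from
the NIC are not reaching the CPU.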
BTW, I'm currently typing on a K6-2 machine (wildly overclocked - this
is my main workstation/router/firewall/server :)). It's running
2.4.0-test8-pre1 with a 3c905B. Solid as a rock. Different motherboard
> I'm not sure, I think it should work, but it would matter on your mount
OK, I was asking because this problem is related to IP fragmentation,
and I assume (perhaps wrongly) that if rsize and wsize are larger than
your MTU, there will be a lot of fragmented packets.
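
To put a number on that, here is a back-of-envelope fragment count. It is
a sketch under stated assumptions: `frag_count' is a made-up name, the
MTU defaults to 1500 (plain Ethernet), and the IPv4 header is taken as 20
bytes with no options. Every fragment but the last must carry a multiple
of 8 bytes of payload.

```shell
# Hypothetical helper: how many IPv4 fragments does one datagram need?
# $1: IP payload length in bytes (transport header + data)
# $2: MTU in bytes (defaults to 1500, an assumption)
frag_count() {
    len=$1
    mtu=${2:-1500}
    # Payload per fragment: MTU minus a 20-byte IP header, rounded
    # down to a multiple of 8.
    per_frag=$(( (mtu - 20) / 8 * 8 ))
    # Ceiling division.
    echo $(( (len + per_frag - 1) / per_frag ))
}

# Example use (not run here):
#   frag_count 15008   # ping -s 15000: 15000 data + 8-byte ICMP header
#   frag_count 8200    # an 8192-byte UDP payload + 8-byte UDP header
```

So a `ping -s 15000' turns into 11 fragments per packet on a 1500-byte
MTU, and an 8 KB UDP datagram into 6, before counting any RPC overhead.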
> > Are you able to provide a set of steps with which others can reproduce
> > this?
> 'net drop out'
> I'll just say no. AeroSpace is running SMP, spacedout is not SMP.
> AeroSpace is a dual Pentium MMX, Spacedout is a K6-2. They have
> basically identical network cards in them, 3c905B. I have swapped the
> network cards in the past and the problems follow the computer not the
> I would suggest try getting a FIC VA 503+ motherboard, K6-2 processor,
> 3c905B network card, go in X, have something rapidly updating the
> video card (rxvt doing `locate \*` worked fine), and send a ton of
> network data to the system at 100BaseT.
I just did that here:
ping -q -f -s 64 -l 100000 bix
This caused `bix' to take a short trip to an alternate universe, but it
recovered fine when I killed the ping.
> If you REALLY pulled my leg you might get me to put one of my Pentium
> processors in the system, but I would rather not do that.
Sorry, I think you need to start swapping hardware in spacedout. It's
> The new problem about 'unregister_netdevice: waiting ...' I can
> reproduce it by,
> insmod 3c59x
> ifconfig eth0 ...
> (on another console) ping -s 15000 -f aerospace
> ifconfig eth0 down; rmmod 3c59x
> That usually gives about two lines of 'unregister_netdevice...' before
> it is able to be removed.
That's normal. There are orphaned IP fragments floating about in your
kernel. They have a thirty second lifetime. When they have all expired
the module unload is allowed to proceed.
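
If you want to watch the fragments come and go, the IP reassembly
counters in /proc/net/snmp are one place to look. A rough sketch follows;
`ip_stat' is a made-up name, and I'm assuming your kernel exposes the
usual "Ip:" header/value line pair there. The fragment lifetime itself
should be readable from /proc/sys/net/ipv4/ipfrag_time on kernels that
have that sysctl.

```shell
# Hypothetical helper: print one counter from the "Ip:" lines.
# $1: counter name from the header line (e.g. ReasmFails)
# $2: file to read (defaults to /proc/net/snmp)
ip_stat() {
    awk -v want="$1" '
        # First "Ip:" line holds the column names.
        $1 == "Ip:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        # Second "Ip:" line holds the values, in the same order.
        $1 == "Ip:" {
            if (want in col) print $(col[want])
            exit
        }
    ' "${2:-/proc/net/snmp}"
}

# Example use (not run here):
#   ip_stat ReasmReqds    # fragments received needing reassembly
#   ip_stat ReasmFails    # reassemblies that failed or timed out
#   cat /proc/sys/net/ipv4/ipfrag_time
```

Running the flood ping, killing it, and reading ReasmFails again after
the lifetime has passed should show whether incomplete datagrams were
left behind.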
> Odd thing about the 'unregister_netdevice' problem is I was still able
> to unload the module until I inserted my ne2000 card and ifconfiged it
> I did,
> insmod 3c59x
> modprobe ne io=0x300 irq=111
> ifconfig eth0 ...
> ifconfig eth1 ...
> ifconfig eth0 down
> rmmod 3c59x
> and it kept giving the 'unregister_netdevice' message over and over until
> I rebooted.
I tried many combinations of this with an eepro100 and a 3c905C.
Everything worked fine. Sigh.
[ In reply to a later email ]
> What does rmmod and insmod do to the network card that vortex_down,
> vortex_up doesn't? Something is different.
All the stuff in vortex_probe1() is run at insmod-time only. It's
mainly driver data structure initialisation, but there's some hardware
initialisation as well.
David, if this problem is purely exhibited on `spacedout' then it's
quite possible that there are no software problems, although that
unregister_netdevice problem sure looks like software to me... My
recommendation is to start swapping out hardware. You get some
amazingly weird stuff happening if the hardware is dodgy.