On the 2.6.6 server machine:
ifconfig eth0 mtu 9000
gives an oops in the usb?
Unable to handle kernel paging request at virtual address 92a8292a
*pde = 00000000
Oops: 0000 [#1]
EIP: 0060:[<d1163305>] Not tainted
EFLAGS: 00010286 (2.6.6)
EIP is at usb_buffer_free+0x15/0x50 [usbcore]
eax: cea2ec00 ebx: c13665e8 ecx: 00000001 edx: 92a8290a
esi: c13665ec edi: cf0439dc ebp: cf58eef4 esp: c3535f44
ds: 007b es: 007b ss: 0068
Process usb (pid: 2744, threadinfo=c3534000 task=cf245370)
Stack: cba80d00 c13665e8 c13665ec cf0439dc d106e3a6 cea2ec00 00002000
0f636000 c13665e8 d106e4a9 c13665e8 cf122980 cffe0280 c01470d3
cf122980 cf122980 00000000 cf27f200 c3534000 c0145a19 cf122980
[<d106e3a6>] usblp_cleanup+0x46/0xb0 [usblp]
[<d106e4a9>] usblp_release+0x59/0x60 [usblp]
Code: 8b 4a 20 85 c9 74 07 8b 41 18 85 c0 75 04 83 c4 10 c3 8b 44
<6>usb 1-1: new full speed USB device using address 3
drivers/usb/class/usblp.c: usblp0: USB Bidirectional printer dev 3 if 0
alt 0 proto 2 vid 0x04B8 pid 0x0005
ifconfig: page allocation failure. order:3, mode:0x20
[<d110f262>] e1000_alloc_rx_buffers+0x62/0x100 [e1000]
[<d110c045>] e1000_up+0x45/0xb0 [e1000]
[<d110e4fc>] e1000_change_mtu+0x7c/0xd0 [e1000]
MemTotal: 256440 kB
MemFree: 2576 kB
Buffers: 18276 kB
Cached: 202048 kB
SwapCached: 0 kB
Active: 112492 kB
Inactive: 115324 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 256440 kB
LowFree: 2576 kB
SwapTotal: 522100 kB
SwapFree: 522100 kB
Dirty: 8 kB
Writeback: 0 kB
Mapped: 14856 kB
Slab: 16920 kB
Committed_AS: 20272 kB
PageTables: 368 kB
VmallocTotal: 770040 kB
VmallocUsed: 10656 kB
VmallocChunk: 759264 kB
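For scale, here is a rough sketch of what the failing request amounts to. It assumes the 4 KiB page size of i386, which is not stated above, and the helper name is made up for illustration. An order:3 request needs 2^3 physically contiguous pages, so MemFree being larger than the request does not by itself mean the allocation can succeed.

```python
PAGE_SIZE_KB = 4  # assumption: i386 with 4 KiB pages


def order_to_kb(order, page_size_kb=PAGE_SIZE_KB):
    """Size in kB of one buddy-allocator block of the given order."""
    return (1 << order) * page_size_kb


request_kb = order_to_kb(3)  # order:3 from the failure message
mem_free_kb = 2576           # MemFree from the meminfo dump above

print(f"order:3 request = {request_kb} kB, must be contiguous")
print(f"MemFree         = {mem_free_kb} kB in total, not necessarily contiguous")
```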
I have had similar failures on the stable box when it's been used for a while.
I ran
ifconfig eth1 mtu 9000
on the good machine and it gave me this:
Jun 18 16:33:08 haze kernel: printk: 1 messages suppressed.
Jun 18 16:33:08 haze kernel: ifconfig: page allocation failure. order:3,
Jun 18 16:33:08 haze kernel: [__alloc_pages+728/848]
Jun 18 16:33:08 haze kernel: [__get_free_pages+37/64]
Jun 18 16:33:08 haze kernel: [kmem_getpages+32/176] kmem_getpages+0x20/0xb0
Jun 18 16:33:08 haze kernel: [cache_grow+166/512] cache_grow+0xa6/0x200
Jun 18 16:33:08 haze kernel: [cache_alloc_refill+342/544]
Jun 18 16:33:08 haze kernel: [__kmalloc+116/128] __kmalloc+0x74/0x80
Jun 18 16:33:08 haze kernel: [alloc_skb+71/224] alloc_skb+0x47/0xe0
Jun 18 16:33:08 haze kernel: [pg0+945227150/1069572096]
Jun 18 16:33:08 haze kernel: [pg0+945213509/1069572096]
Jun 18 16:33:08 haze kernel: [pg0+945223248/1069572096]
Jun 18 16:33:08 haze kernel: [dev_set_mtu+121/144] dev_set_mtu+0x79/0x90
Jun 18 16:33:08 haze kernel: [dev_ioctl+501/640] dev_ioctl+0x1f5/0x280
Jun 18 16:33:08 haze kernel: [inet_ioctl+142/160] inet_ioctl+0x8e/0xa0
Jun 18 16:33:08 haze kernel: [sock_ioctl+233/656] sock_ioctl+0xe9/0x290
Jun 18 16:33:08 haze kernel: [sys_ioctl+239/608] sys_ioctl+0xef/0x260
Jun 18 16:33:08 haze kernel: [do_page_fault+0/1242] do_page_fault+0x0/0x4da
Jun 18 16:33:08 haze kernel: [syscall_call+7/11] syscall_call+0x7/0xb
root@haze:~ # cat /proc/meminfo
MemTotal: 1036868 kB
MemFree: 7564 kB
Buffers: 30720 kB
Cached: 756496 kB
SwapCached: 0 kB
Active: 553348 kB
Inactive: 362700 kB
HighTotal: 131056 kB
HighFree: 252 kB
LowTotal: 905812 kB
LowFree: 7312 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 179532 kB
Slab: 105264 kB
Committed_AS: 298092 kB
PageTables: 1504 kB
VmallocTotal: 114680 kB
VmallocUsed: 2112 kB
VmallocChunk: 112376 kB
I could reproduce this by alternating between mtu 1500 and mtu 9000.
Somehow the distro hadn't mkswap'ed the swap so I added swap and the
problem went away.
If I swapoff, then every time I set the mtu to 9000 I get the page
allocation failure.
I don't think this should happen but I'm not sure if I *must* have swap?
Also I did this whilst the interface was up (it let me).
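One way to check whether fragmentation (rather than a plain shortage of memory) is behind this would be to look at /proc/buddyinfo, which lists, per zone, the number of free blocks at each order. Below is a minimal sketch of reading it. The sample text is made up for illustration and is not taken from either machine, and the function name is hypothetical.

```python
def free_blocks_at_or_above(buddyinfo_text, min_order):
    """Return {zone: number of free blocks of order >= min_order}."""
    result = {}
    for line in buddyinfo_text.splitlines():
        fields = line.split()
        # Expected shape: "Node 0, zone Normal c0 c1 c2 ..."
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        zone = fields[3]
        counts = [int(c) for c in fields[4:]]  # list index == order
        result[zone] = sum(counts[min_order:])
    return result


# Hypothetical sample: plenty of small blocks, nothing at order 3 or above.
SAMPLE = """\
Node 0, zone      DMA      5      3      1      0      0      0      0      0      0      0      0
Node 0, zone   Normal    410     96     12      0      0      0      0      0      0      0      0
"""

print(free_blocks_at_or_above(SAMPLE, 3))
```

On a live box, the text would come from open("/proc/buddyinfo").read() instead of the sample. A zero count at order 3 and above would be consistent with the failure, though it would not prove the cause.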
Venkatesan, Ganesh wrote:
Did not mean to get off the list. For some reason, my subscription to
netdev is not working (even after re-subscribing). So, I grabbed your
message off of the archive.
I am trying to recreate your failure scenario in our lab. In the
meantime, please send me any new information you have on this issue.
Network/Storage Division, Hillsboro, OR
From: David Greaves [mailto:david@xxxxxxxxxxxx]
Sent: Friday, June 18, 2004 5:52 AM
To: Jens Laas
Cc: Stephen Hemminger; netdev@xxxxxxxxxxx; Venkatesan, Ganesh
Subject: Re: 2.6.6 e1000 NETDEV WATCHDOG: eth0: transmit timed out+
I booted into XP and the card works there - so it doesn't look like a
simple hardware incompatibility.
[I've got no real way to test the performance but cygwin's wget against
apache1.3 on the linux box returns about 25M/s initially and then 15M/s
sustained for 500Mb]
Jens Laas wrote:
I'm speaking with Ganesh Venkatesan at Intel about it. Ganesh, you
went off list - do you want to include Jens or maybe go back on-list?
If others run into this problem I'm sure they'll appreciate it if it's on
the list.
Since we have no idea what causes this (AFAIK) it may be a more
general problem than the device driver.
I tend to agree - but I wasn't sure if this was the place and I'll do as
I'm told ;)
A simple failure case for me is: 'ping -s 1500 '
This doesn't cause the timeout but doesn't succeed either.
ping -f with standard packet size succeeds (slow rate though) and
I don't see the ping problems at all. Unless you try to ping when the
interface has "hung"?
<sigh> thought that might be helpful.
Ping with -s and -f seems to allow me to trigger errors and it seems a
lot more debug-able than scp or nfs :)
No, all tests are run when it's reset and 'clean'.
From here on down it's 2.6.7 with Stephen's recent delay scheduler patch.
This changed the behaviour.
This is strange unless you are actually using the delay scheduler?
The default is sch_generic (that is, pfifo), which does not exhibit the
problems corrected by the patch.
I'll go back and double check in case I cocked up...
(I noticed the e1000 module rebuild but you're right that's incidental)
I've rebuilt the kernel and modules with and w/o patch and rebooted a
few times and I can't reproduce that effect - sorry for the red herring.
So after I reverted Stephen's patch, the results I reported are still
reproducible w/o the patch.
10592 packets transmitted, 10591 packets received, 0% packet loss
round-trip min/avg/max = 5.4/5.5/83.5 ms
Increasing Transmit Descriptors to 4096 avoids the "No buffer space
available" error with packet sizes up to -s65468 (still 100% failure though)
Increasing nr of buffers is not a way to fix the problem.
agreed - however in my ignorance of the deep behaviour I'm reporting
things that affect behaviour in ways I don't expect.
I expected it to take longer to run out of buffers - that didn't happen
(Anyway, on retesting I find that this was wrong - I suspect the
interface was down and I didn't notice)
I had hoped to hear something about this from Scott.
I'm happy to hear from anyone - I don't have *that* long until my RMA
option expires and I don't fancy keeping them as ornaments!