Alexey,
Bob is consistently getting these oopses when running netperf over
UDP on 2.4.2. He's using Myrinet hardware and drivers, which run at
very high speed: over 100 Mbytes/sec. I believe he's hitting
out-of-memory conditions.
Sometimes he also gets assertion failures from ip_frag_destroy:
`del_timer == 0'.
Can you think of anything which would cause this to happen
in an out-of-memory situation?
BTW - the drivers he's using aren't in the stock kernel. So
it's possible that this is a driver problem which we haven't
seen before. I had a look at the driver he's using and the
Rx path seems OK.
Thanks.
-------- Original Message --------
Subject: Re: possible bug x86 2.4.2 SMP in IP receive stack
Date: Mon, 5 Mar 2001 14:47:34 -0800 (PST)
From: Bob Felderman <feldy@xxxxxxxx>
To: Andrew Morton <andrewm@xxxxxxxxxx>
CC: Bob Felderman <feldy@xxxxxxxx>
> Bob Felderman wrote:
> >
> > I'll get an oops dump as soon as I get back into the office.
> > Any simple way to get the oops into the log so I don't
> > have to copy it down?
> > .
>
This is with a stock linux-2.4.2 kernel.
ksymoops 0.7c on i686 2.4.2. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.2/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
Warning (compare_maps): mismatch on symbol __module_author , gm says d0888e40,
sbin/gm says d088aaa0. Ignoring sbin/gm entry
Warning (compare_maps): mismatch on symbol __module_description , gm says
d0888e5f, sbin/gm says d088aabf. Ignoring sbin/gm entry
Warning (compare_maps): mismatch on symbol __module_parm_gm_net_copy_threshold
, gm says d0888ebc, sbin/gm says d088ab1c. Ignoring sbin/gm entry
Warning (compare_maps): mismatch on symbol __module_parm_gmip_hw_checksum , gm
says d0888ea4, sbin/gm says d088ab04. Ignoring sbin/gm entry
Unable to handle kernel NULL pointer dereference at virtual address 00000004
c011bd45
*pde = 00000000
Oops: 0002
CPU: 1
EIP: 0010:[<c011bd45>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: cff05b84 ebx: 00000212 ecx: cff05b84 edx: 00000000
esi: 00000000 edi: 00000068 ebp: c02ed840 esp: c1449e54
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c1449000)
Stack: 00000000 cff05b60 00000068 c01f63eb cff05b84 0006bc85 c23b8040 3c82c9c7
c23b8040 00001d5d 00000068 c01f64c2 0000001a cff05b60 c95e8011 c01f6af9
0000001a c23b8040 c95e80e0 c95e80e0 c95e80e0 c95e80e0 c23b8040 3782c9c7
Call Trace: [<c01f63eb>] [<c01f64c2>] [<c01f6af9>] [<c01f5c50>] [<c01f6011>]
[<c01ed73e>] [<c01193
[<c010a99a>] [<c01071c0>] [<c01071c0>] [<c010909c>] [<c01071c0>]
[<c01071c0>] [<c0100018>]
[<c0107252>] [<c01193aa>] [<c010a99a>]
Code: 89 4a 04 89 11 89 41 04 89 08 c6 05 9c 98 28 c0 01 53 9d 89
>>EIP; c011bd45 <mod_timer+d1/ec> <=====
Trace; c01f63eb <ip_frag_intern+a3/ec>
Trace; c01f64c2 <ip_frag_create+8e/a4>
Trace; c01f6af9 <ip_defrag+cd/184>
Trace; c01f5c50 <ip_local_deliver+1c/114>
Trace; c01f6011 <ip_rcv+2c9/338>
Trace; c01ed73e <net_rx_action+17e/278>
Trace; c010a99a <do_IRQ+da/ec>
Trace; c01071c0 <default_idle+0/34>
Trace; c01071c0 <default_idle+0/34>
Trace; c010909c <ret_from_intr+0/20>
Trace; c01071c0 <default_idle+0/34>
Trace; c01071c0 <default_idle+0/34>
Trace; c0100018 <startup_32+18/cb>
Trace; c0107252 <cpu_idle+3e/54>
Trace; c01193aa <do_softirq+5a/88>
Trace; c010a99a <do_IRQ+da/ec>
Code; c011bd45 <mod_timer+d1/ec>
00000000 <_EIP>:
Code; c011bd45 <mod_timer+d1/ec> <=====
0: 89 4a 04 mov %ecx,0x4(%edx) <=====
Code; c011bd48 <mod_timer+d4/ec>
3: 89 11 mov %edx,(%ecx)
Code; c011bd4a <mod_timer+d6/ec>
5: 89 41 04 mov %eax,0x4(%ecx)
Code; c011bd4d <mod_timer+d9/ec>
8: 89 08 mov %ecx,(%eax)
Code; c011bd4f <mod_timer+db/ec>
a: c6 05 9c 98 28 c0 01 movb $0x1,0xc028989c
Code; c011bd56 <mod_timer+e2/ec>
11: 53 push %ebx
Code; c011bd57 <mod_timer+e3/ec>
12: 9d popf
Code; c011bd58 <mod_timer+e4/ec>
13: 89 00 mov %eax,(%eax)
Kernel panic: Aiee, killing interrupt handler!
5 warnings issued. Results may not be reliable.
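As a reading aid for the disassembly above: the four mov instructions at
mod_timer+d1 look like the four pointer stores of a doubly-linked list
insert, with %ecx as the new entry, %eax as its predecessor and %edx as
its successor. With edx = 00000000 the first store writes to offset 4 of
a NULL pointer, which would match the reported fault address 00000004.
The register dump also shows eax and ecx holding the same value
(cff05b84). The sketch below is a userspace illustration only, assuming
that mapping. checked_list_add() and demo() are hypothetical names and
are not kernel functions, and the NULL check is there to make the
failure visible, not to suggest a fix.

```c
#include <assert.h>
#include <stddef.h>

/* Two-pointer list node: next first, then prev. On a 32-bit machine
 * prev sits at offset 4, the offset used by the faulting store. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Hypothetical checked insert of new_node between prev and next.
 * Returns -1 instead of dereferencing a NULL neighbour. */
static int checked_list_add(struct list_head *new_node,
                            struct list_head *prev,
                            struct list_head *next)
{
    if (prev == NULL || next == NULL)
        return -1;              /* the oops case: next (edx) was NULL */

    next->prev = new_node;      /* mov %ecx,0x4(%edx)  <- faulting store */
    new_node->next = next;      /* mov %edx,(%ecx)                       */
    new_node->prev = prev;      /* mov %eax,0x4(%ecx)                    */
    prev->next = new_node;      /* mov %ecx,(%eax)                       */
    return 0;
}

/* Returns 1 if both cases behave as described in the comments. */
static int demo(void)
{
    struct list_head head = { &head, &head };   /* empty circular list */
    struct list_head timer = { NULL, NULL };
    struct list_head stale = { NULL, NULL };    /* unlinked entry */

    /* Normal case: insert after head in a well-formed list. */
    if (checked_list_add(&timer, &head, head.next) != 0)
        return 0;
    if (head.next != &timer || head.prev != &timer)
        return 0;
    if (timer.next != &head || timer.prev != &head)
        return 0;

    /* Oops-like case: predecessor and new entry are the same node
     * (eax == ecx) and its next pointer is NULL (edx == 0). */
    if (checked_list_add(&stale, &stale, stale.next) != -1)
        return 0;

    return 1;
}
```

Calling demo() exercises both the well-formed insert and the case that
resembles the register state in the oops. How the list got into that
state in the kernel is not something this sketch can show.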