Kernel oops on new mailserver

To: XFS mailing list <linux-xfs@xxxxxxxxxxx>
Subject: Kernel oops on new mailserver
From: Paul Schutte <paul@xxxxxxxxxxx>
Date: Mon, 18 Mar 2002 22:42:24 +0200
Sender: owner-linux-xfs@xxxxxxxxxxx
Hi,

I have set up a new mailserver, consisting of the following:

Dell PE2550
1G RAM
2 x 1.133GHz CPUs
4 x 18GB Seagate Cheetahs in RAID 10

kernel 2.4.18 from cvs checked out on March 14 2002.
gcc version egcs-2.91.66 19990314 (egcs-1.1.2 release)
ksymoops 2.4.3
Debian 3.0 (woody)
JFS 1.0.15 was also patched into the kernel, but no JFS partitions were
mounted at the time.
I also compiled a kernel with no other filesystem in it, not even
ext2, and still hit the problem.
I didn't have a terminal connected at that time, so I don't have an oops
from a kernel where XFS was the only filesystem.

The kernel has oopsed several times recently.
I caught one oops, but couldn't get stack traces.
I caught another one, this time with traces on both CPUs.
The e100 ethernet module from Intel was loaded during the first oops.
I reverted to the stock eepro100 module in the hope that it might
solve the problem.
The second oops occurred with the eepro100 module loaded.
No other modules were loaded.
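
Since both oopses below are decoded with ksymoops, one way to compare them is to pull the symbol names out of the `Trace;` lines and see which entries appear in only one of them. This is a minimal sketch in Python, assuming the ksymoops output is available as plain text. The helper name `trace_symbols` and the shortened sample inputs are illustrative only and not part of ksymoops.

```python
import re

# Matches ksymoops lines such as:
#   Trace; c0136efc <fsync_inode_data_buffers+bc/1a0>
TRACE_RE = re.compile(
    r"^Trace;\s+([0-9a-f]+)\s+<([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>"
)

def trace_symbols(text):
    """Return (symbol, offset) pairs from ksymoops 'Trace;' lines, in order."""
    pairs = []
    for line in text.splitlines():
        m = TRACE_RE.match(line.strip())
        if m:
            pairs.append((m.group(2), int(m.group(3), 16)))
    return pairs

# Shortened excerpts in the same format as the decoded traces (illustrative).
first = """Trace; c013734a <__refile_buffer+56/60>
Trace; c023f084 <pagebuf_flush+18/2c>
Trace; c0241d58 <fs_flush_pages+28/34>"""
second = """Trace; c023f084 <pagebuf_flush+18/2c>
Trace; c0241d58 <fs_flush_pages+28/34>"""

# Entries present in the first trace but not in the second.
second_pairs = trace_symbols(second)
only_first = [p for p in trace_symbols(first) if p not in second_pairs]
print(only_first)
```

Entries that show up in only one trace may simply be stale values left on the stack, so a difference found this way is a hint and not proof of a different code path.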


First oops:

kernel BUG at ll_rw_blk.c:978!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c026eec0>]    Tainted: P
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 0000001f   ebx: 00000805   ecx: c0436068   edx: 00002e5a
esi: 00000000   edi: c87e5d10   ebp: c87e5ce8   esp: c87e5cb4
ds: 0018   es: 0018   ss: 0018
Process modprobe (pid: 3891, stackpage=c87e5000)
Stack: c0362282 000003d2 cac4f6a0 cace26e0 c87e5d34 cace26e0 00000040 07eb9067
       c87e5cec c012420b 00001000 00000200 00000200 c87e5ef8 c0136efc 00000001
       00000001 c87e5d10 cac4f680 00000001 c043d920 c87e5d34 00000000 cace26e0
Call Trace: [<c012420b>] [<c0136efc>] [<c01e7d61>] [<c01e7d61>] [<c0231594>]
   [<c013734a>] [<c013736d>] [<c023fe80>] [<c0240212>] [<c023f001>] [<c02408e9>]
   [<c024091b>] [<c0231583>] [<c0231583>] [<c021d214>] [<c023a31d>] [<c0245a65>]
   [<c023f085>] [<c0241d59>] [<c02361e2>] [<c0241926>] [<c01366b8>] [<c010720f>]
Code: 0f 0b 83 c4 08 8d 76 00 83 c7 04 46 3b 75 0c 7c bb 8b 4d 08

>>EIP; c026eec0 <ll_rw_block+8c/1f4>   <=====
Trace; c012420a <do_anonymous_page+f2/11c>
Trace; c0136efc <fsync_inode_data_buffers+bc/1a0>
Trace; c01e7d60 <xfs_acl_iaccess+28/84>
Trace; c01e7d60 <xfs_acl_iaccess+28/84>
Trace; c0231594 <xfs_trans_unlocked_item+34/40>
Trace; c013734a <__refile_buffer+56/60>
Trace; c013736c <refile_buffer+18/24>
Trace; c023fe80 <set_buffer_dirty_uptodate+34/48>
Trace; c0240212 <__pb_block_commit_write_async+2e/4c>
Trace; c023f000 <pagebuf_commit_write+48/b4>
Trace; c02408e8 <pagebuf_generic_file_write+2a8/314>
Trace; c024091a <pagebuf_generic_file_write+2da/314>
Trace; c0231582 <xfs_trans_unlocked_item+22/40>
Trace; c0231582 <xfs_trans_unlocked_item+22/40>
Trace; c021d214 <xfs_iunlock+4c/58>
Trace; c023a31c <xfs_rwunlock+30/68>
Trace; c0245a64 <xfs_write+48c/49c>
Trace; c023f084 <pagebuf_flush+18/2c>
Trace; c0241d58 <fs_flush_pages+28/34>
Trace; c02361e2 <xfs_fsync+ea/288>
Trace; c0241926 <linvfs_fsync+42/50>
Trace; c01366b8 <sys_fdatasync+68/b4>
Trace; c010720e <system_call+2e/34>
Code;  c026eec0 <ll_rw_block+8c/1f4>
00000000 <_EIP>:
Code;  c026eec0 <ll_rw_block+8c/1f4>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c026eec2 <ll_rw_block+8e/1f4>
   2:   83 c4 08                  add    $0x8,%esp
Code;  c026eec4 <ll_rw_block+90/1f4>
   5:   8d 76 00                  lea    0x0(%esi),%esi
Code;  c026eec8 <ll_rw_block+94/1f4>
   8:   83 c7 04                  add    $0x4,%edi
Code;  c026eeca <ll_rw_block+96/1f4>
   b:   46                        inc    %esi
Code;  c026eecc <ll_rw_block+98/1f4>
   c:   3b 75 0c                  cmp    0xc(%ebp),%esi
Code;  c026eece <ll_rw_block+9a/1f4>
   f:   7c bb                     jl     ffffffcc <_EIP+0xffffffcc> c026ee8c <ll_rw_block+58/1f4>
Code;  c026eed0 <ll_rw_block+9c/1f4>
  11:   8b 4d 08                  mov    0x8(%ebp),%ecx

Entering kdb (current=0xc87e4000, pid 3891) on processor 0 Oops: invalid operand



Second oops:



kernel BUG at ll_rw_blk.c:978!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c026eec0>]    Tainted: P
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 0000001f   ebx: 00000805   ecx: c0436068   edx: 00003217
esi: 00000000   edi: c87c7d10   ebp: c87c7ce8   esp: c87c7cb4
ds: 0018   es: 0018   ss: 0018
Process modprobe (pid: 4414, stackpage=c87c7000)
Stack: c0362282 000003d2 c3f4e4c0 c26f47a0 c87c7d34 c26f47a0 00000040 0343a067
       c87c7cec c012420b 00001000 00000200 00000200 c87c7ef8 c0136efc 00000001
       00000001 c87c7d10 c3f4e4a0 00000001 c043d920 c87c7d34 00000000 c26f47a0
Call Trace: [<c012420b>] [<c0136efc>] [<c01e7d61>] [<c01e7d61>] [<c0231594>]
   [<c023fe72>] [<c0240212>] [<c023f001>] [<c02408e9>] [<c024091b>] [<c0231583>]
   [<c021d214>] [<c023f085>] [<c0241d59>] [<c02361e2>] [<c0241926>] [<c01366b8>]
   [<c010720f>]
Code: 0f 0b 83 c4 08 8d 76 00 83 c7 04 46 3b 75 0c 7c bb 8b 4d 08

>>EIP; c026eec0 <ll_rw_block+8c/1f4>   <=====
Trace; c012420a <do_anonymous_page+f2/11c>
Trace; c0136efc <fsync_inode_data_buffers+bc/1a0>
Trace; c01e7d60 <xfs_acl_iaccess+28/84>
Trace; c01e7d60 <xfs_acl_iaccess+28/84>
Trace; c0231594 <xfs_trans_unlocked_item+34/40>
Trace; c023fe72 <set_buffer_dirty_uptodate+26/48>
Trace; c0240212 <__pb_block_commit_write_async+2e/4c>
Trace; c023f000 <pagebuf_commit_write+48/b4>
Trace; c02408e8 <pagebuf_generic_file_write+2a8/314>
Trace; c024091a <pagebuf_generic_file_write+2da/314>
Trace; c0231582 <xfs_trans_unlocked_item+22/40>
Trace; c021d214 <xfs_iunlock+4c/58>
Trace; c023f084 <pagebuf_flush+18/2c>
Trace; c0241d58 <fs_flush_pages+28/34>
Trace; c02361e2 <xfs_fsync+ea/288>
Trace; c0241926 <linvfs_fsync+42/50>
Trace; c01366b8 <sys_fdatasync+68/b4>
Trace; c010720e <system_call+2e/34>
Code;  c026eec0 <ll_rw_block+8c/1f4>
00000000 <_EIP>:
Code;  c026eec0 <ll_rw_block+8c/1f4>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c026eec2 <ll_rw_block+8e/1f4>
   2:   83 c4 08                  add    $0x8,%esp
Code;  c026eec4 <ll_rw_block+90/1f4>
   5:   8d 76 00                  lea    0x0(%esi),%esi
Code;  c026eec8 <ll_rw_block+94/1f4>
   8:   83 c7 04                  add    $0x4,%edi
Code;  c026eeca <ll_rw_block+96/1f4>
   b:   46                        inc    %esi
Code;  c026eecc <ll_rw_block+98/1f4>
   c:   3b 75 0c                  cmp    0xc(%ebp),%esi
Code;  c026eece <ll_rw_block+9a/1f4>
   f:   7c bb                     jl     ffffffcc <_EIP+0xffffffcc> c026ee8c <ll_rw_block+58/1f4>
Code;  c026eed0 <ll_rw_block+9c/1f4>
  11:   8b 4d 08                  mov    0x8(%ebp),%ecx

Entering kdb (current=0xc87c6000, pid 4414) on processor 0 Oops: invalid operand
due to oops @ 0xc026eec0
eax = 0x0000001f ebx = 0x00000805 ecx = 0xc0436068 edx = 0x00003217
esi = 0x00000000 edi = 0xc87c7d10 esp = 0xc87c7cb4 eip = 0xc026eec0
ebp = 0xc87c7ce8 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010202
xds = 0x00000018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc87c7c80
[0]kdb> bt
    EBP       EIP         Function(args)
0xc87c7ce8 0xc026eec0 ll_rw_block+0x8c (0x1, 0x1, 0xc87c7d10, 0xc3f4e4a0, 0x1)
                               kernel .text 0xc0100000 0xc026ee34 0xc026f028
0xc87c7ef8 0xc0136efc fsync_inode_data_buffers+0xbc (0xc3f4e4a0, 0xc3f4e554, 0x0)
                               kernel .text 0xc0100000 0xc0136e40 0xc0136fe0
0xc87c7f0c 0xc023f085 pagebuf_flush+0x19 (0xc3f4e4a0, 0x0, 0x0, 0x0, 0xc3f4e5c4)
                               kernel .text 0xc0100000 0xc023f06c 0xc023f098
0xc87c7f28 0xc0241d59 fs_flush_pages+0x29 (0xce7a7748, 0x0, 0x0, 0xffffffff, 0xffffffff)
                               kernel .text 0xc0100000 0xc0241d30 0xc0241d64
0xc87c7f64 0xc02361e2 xfs_fsync+0xea (0xce7a7748, 0x5, 0x0, 0x0, 0x0)
                               kernel .text 0xc0100000 0xc02360f8 0xc0236380
0xc87c7f90 0xc0241926 linvfs_fsync+0x42 (0xcc608240, 0xcddd2ec0, 0x1, 0xc3f4e554, 0xc87c6000)
                               kernel .text 0xc0100000 0xc02418e4 0xc0241934
0xc87c7fbc 0xc01366b8 sys_fdatasync+0x68 (0x0, 0x8063258, 0xbfffeca0, 0x8063258, 0x4013b6e0)
                               kernel .text 0xc0100000 0xc0136650 0xc0136704
           0xc010720f system_call+0x2f
                               kernel .text 0xc0100000 0xc01071e0 0xc0107214
[0]kdb> cpu
Currently on cpu 0
Available cpus: 0, 1
[0]kdb> cpu 1

Entering kdb (current=0xcf6f6000, pid 160) on processor 1 due to cpu switch
[1]kdb> bt
    EBP       EIP         Function(args)
0xcf6f7e38 0xc0124522 pte_alloc+0xe (0xcf986660, 0xcfe90dc0, 0xcf6f6000)
                               kernel .text 0xc0100000 0xc0124514 0xc01245f4
0xcf6f7e4c 0xc012445b handle_mm_fault+0x3b (0xcfe90dc0, 0xcf986660, 0x804dca0, 0x1, 0xcf6f6000)
                               kernel .text 0xc0100000 0xc0124420 0xc01244e0
0xcf6f7f10 0xc0111ad7 do_page_fault+0x1af (0xc14e3080, 0xc1414000, 0xf6f8000, 0xcd29cf40, 0xc04c2900)
                               kernel .text 0xc0100000 0xc0111928 0xc0111df3
0xcf6f7f04 0xc030ae0a unix_dgram_sendmsg+0x3c6 (0xcf6f7f20, 0x2, 0x0, 0x804dca0, 0x31f6)
                               kernel .text 0xc0100000 0xc030aa44 0xc030ae74
           0xc01072f8 error_code+0x34
                               kernel .text 0xc0100000 0xc01072c4 0xc0107300
Interrupt registers:
eax = 0x00000000 ebx = 0xcf6f7f20 ecx = 0x00000002 edx = 0x00000000
esi = 0x0804dca0 edi = 0x000031f6 esp = 0x00000010 eip = 0x00000018
ebp = 0xc04e6ca0 xss = 0x00010246 xcs = 0xffffffff eflags = 0xc01157f7
xds = 0xcf6f7f7c xes = 0x0000313c origeax = 0x00000018 &regs = 0xcf6f7f18
Interrupt from user space, end of kernel trace
[1]kdb> lsmod
Module                  Size  modstruct     Used by
eepro100               20152  0xd087f000     1


Please let me know if there is anything else that I should check.

Paul Schutte

