Hi,
I got the following oops on an Athlon-based machine running
2.4.22-xfs (SGI XFS snapshot 2.4.22-2003-09-03_04:09_UTC, with no debug
enabled):
Unable to handle kernel NULL pointer dereference at virtual address 00000004
c0131f6b
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c0131f6b>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010046
eax: 00000003 ebx: c184fa74 ecx: e3dc6020 edx: 00000004
esi: 01a0a8ff edi: efd2b110 ebp: efd2b118 esp: efc77f10
ds: 0018 es: 0018 ss: 0018
Process xfssyncd (pid: 15, stackpage=efc77000)
Stack: 00000282 ed128678 efd2b110 c013185e c184fa74 e3dc6bb0 ed1287d8 c01a3c8e
c184fa74 e3dc6bb0 c01a205a ed1287d8 00000000 ed1287d8 c01c01ab ed1287d8
e3dc6bb0 ed1287d8 c01c02be ed1287d8 00000001 00000002 00000000 00000071
Call Trace: [<c013185e>] [<c01a3c8e>] [<c01a205a>] [<c01c01ab>]
[<c01c02be>] [<c01b986c>] [<c01b8ec9>] [<c01ce954>] [<c01cdd67>]
[<c01057ab>] [<c01cdcc0>]
Code: 89 02 89 50 04 c7 01 00 00 00 00 8b 43 10 8d 53 10 89 48 04
>>EIP; c0131f6b <kmem_cache_free_one+8b/a9> <=====
>>ebx; c184fa74 <_end+1487d8c/30465398>
>>ecx; e3dc6020 <_end+239fe338/30465398>
>>edi; efd2b110 <_end+2f963428/30465398>
>>ebp; efd2b118 <_end+2f963430/30465398>
>>esp; efc77f10 <_end+2f8b0228/30465398>
Trace; c013185e <kmem_cache_free+1e/30>
Trace; c01a3c8e <xfs_inode_item_destroy+1e/30>
Trace; c01a205a <xfs_idestroy+7a/b0>
Trace; c01c01ab <xfs_finish_reclaim+bb/f0>
Trace; c01c02be <xfs_finish_reclaim_all+de/f0>
Trace; c01b986c <xfs_syncsub+6c/320>
Trace; c01b8ec9 <xfs_sync+29/30>
Trace; c01ce954 <vfs_sync+34/40>
Trace; c01cdd67 <syncd+a7/d0>
Trace; c01057ab <arch_kernel_thread+2b/40>
Trace; c01cdcc0 <syncd+0/d0>
Code; c0131f6b <kmem_cache_free_one+8b/a9>
00000000 <_EIP>:
Code; c0131f6b <kmem_cache_free_one+8b/a9> <=====
0: 89 02 mov %eax,(%edx) <=====
Code; c0131f6d <kmem_cache_free_one+8d/a9>
2: 89 50 04 mov %edx,0x4(%eax)
Code; c0131f70 <kmem_cache_free_one+90/a9>
5: c7 01 00 00 00 00 movl $0x0,(%ecx)
Code; c0131f76 <kmem_cache_free_one+96/a9>
b: 8b 43 10 mov 0x10(%ebx),%eax
Code; c0131f79 <kmem_cache_free_one+99/a9>
e: 8d 53 10 lea 0x10(%ebx),%edx
Code; c0131f7c <kmem_cache_free_one+9c/a9>
11: 89 48 04 mov %ecx,0x4(%eax)
I already got similar oopses when running 2.4.20-xfs; the code
path wasn't the same, but it failed in kmem_cache_free_one too. It
doesn't happen on any regular basis, and I haven't been able to
reproduce any of the oopses so far.
I don't know what the machine was doing when this oops happened. Judging
by the hour, it could have been during an apt-get upgrade (I only
discovered the oops yesterday, when I logged on to the machine).
With 2.4.20-xfs, I got similar oopses twice: once while rm -rf'ing a
kernel tree on a software RAID 1 array, and once while running rsync
on a kernel tree on the same array. For the oops above, I'm pretty
sure this array isn't involved, as it's only used for storage and I
haven't done anything involving it for the past two weeks.
Could this be a bug in the XFS code, or is it more likely a
hardware problem? (Everything otherwise runs fine, even under heavy
load.)
Thanks,
JB.
--
Julien BLACHE <http://www.jblache.org>
<jb@xxxxxxxxxxx>