
Consistent oops on dual Athlon boards



Hi All,

Long-time XFS user and mailing list lurker.

Over the past few months, I've been able to consistently crash
XFS-enabled kernels with a 35 GB rsync (or any high disk I/O) to a dual
AMD server. I'd have reported it earlier, but I've been learning how to
properly debug and troubleshoot the kernel, which I hope I'm now doing
correctly.

Anyway, here is the most recent oops, on 2.4.21 with XFS 1.3 pre4:

---------
Unable to handle kernel paging request at virtual address 7461687b
c01d3430
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c01d3430>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010a83
eax: 00000000   ebx: dedf5640   ecx: 74616863   edx: 72757465
esi: c15aff18   edi: c15aff14   ebp: c15aff10   esp: c15afeec
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 5, stackpage=c15af000)
Stack: c124e974 d44dae20 00003479 c03003d4 c01d35c5 c124e974 c15aff10 c15aff14
        c15aff18 00000000 00000001 00000000 c124e974 000001d0 c013b611 c124e974
        000001d0 00000000 c124e974 c0130c27 c124e974 000001d0 c15ae000 000001fc
Call Trace:    [<c01d35c5>] [<c013b611>] [<c0130c27>] [<c0130e61>] [<c0130ed6>]
   [<c0130ff4>] [<c0131059>] [<c013116d>] [<c01310e0>] [<c0105000>] [<c010590b>]
   [<c01310e0>]
Code: 8b 51 18 89 d0 83 e0 11 48 74 25 f6 c6 10 74 12 c7 06 01 00

 >>EIP; c01d3430 <count_page_state+30/70>   <=====
Trace; c01d35c5 <linvfs_release_page+35/90>
Trace; c013b611 <try_to_release_page+51/70>
Trace; c0130c27 <shrink_cache+2b7/390>
Trace; c0130e61 <shrink_caches+61/a0>
Trace; c0130ed6 <try_to_free_pages_zone+36/60>
Trace; c0130ff4 <kswapd_balance_pgdat+54/a0>
Trace; c0131059 <kswapd_balance+19/30>
Trace; c013116d <kswapd+8d/b0>
Trace; c01310e0 <kswapd+0/b0>
Trace; c0105000 <_stext+0/0>
Trace; c010590b <arch_kernel_thread+2b/40>
Trace; c01310e0 <kswapd+0/b0>
Code;  c01d3430 <count_page_state+30/70>
00000000 <_EIP>:
Code;  c01d3430 <count_page_state+30/70>   <=====
    0:   8b 51 18                  mov    0x18(%ecx),%edx   <=====
Code;  c01d3433 <count_page_state+33/70>
    3:   89 d0                     mov    %edx,%eax
Code;  c01d3435 <count_page_state+35/70>
    5:   83 e0 11                  and    $0x11,%eax
Code;  c01d3438 <count_page_state+38/70>
    8:   48                        dec    %eax
Code;  c01d3439 <count_page_state+39/70>
    9:   74 25                     je     30 <_EIP+0x30> c01d3460 <count_page_state+60/70>
Code;  c01d343b <count_page_state+3b/70>
    b:   f6 c6 10                  test   $0x10,%dh
Code;  c01d343e <count_page_state+3e/70>
    e:   74 12                     je     22 <_EIP+0x22> c01d3452 <count_page_state+52/70>
Code;  c01d3440 <count_page_state+40/70>
   10:   c7 06 01 00 00 00         movl   $0x1,(%esi)

----------
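
As a quick sanity check on the dump above, here is a minimal sketch that only does arithmetic on values already shown in the oops. The faulting instruction is `mov 0x18(%ecx),%edx`, so the bad address should be ecx + 0x18, and the ecx and edx values can be read as little-endian bytes to see whether they look like text rather than pointers. The helper names `fault_address` and `reg_as_ascii` are made up for illustration, and the result does not by itself say anything definite about the cause.

```python
import struct


def fault_address(base_reg: int, displacement: int) -> int:
    """Effective address of `mov disp(%reg), ...` on 32-bit x86 (wraps at 2**32)."""
    return (base_reg + displacement) & 0xFFFFFFFF


def reg_as_ascii(value: int) -> str:
    """Show a 32-bit register as its four little-endian bytes, '.' if unprintable."""
    raw = struct.pack("<I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Register values copied from the oops above.
ecx = 0x74616863
edx = 0x72757465

# Should match the "virtual address" reported on the first line of the oops.
print(hex(fault_address(ecx, 0x18)))

# Printable output here would suggest the registers hold string data, not pointers.
print(reg_as_ascii(ecx), reg_as_ascii(edx))
```

If both registers decode as printable characters, that may hint that the pointer being dereferenced was overwritten with string data, though that is only a guess from the register contents.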

A similar problem first appeared back in January on a Tyan Thunder
K7X (S2468ugn) with 2.4.18 and XFS 1.2. I've tried every kernel release
with every possible XFS release, and the situation remains. I've since
replaced both CPUs and all the memory (and memtested the new memory), and
put in a new motherboard from a different manufacturer (an Asus dual AMD
board).

I've also been running 2.6.0-test1 and test2 with XFS enabled and am 
still able to get the kernel panic. When not under high load, the 
machine runs fine, and when I run a non-XFS kernel with an ext3 FS on 
the same hardware, everything works fine as well.

I could give more detail, but hopefully this is enough to get the ball
rolling. If I can provide more info, or oops reports from other
kernel/XFS versions, let me know...

Thanks for a great product!

Dan
http://five2one.org/