Hi All,
Long time XFS user and mailing list lurker.
Over the past few months, I've been able to consistently crash
XFS-enabled kernels with a 35 GB rsync (or any heavy disk I/O) to a dual
AMD server. I'd have reported it earlier, but I have been learning how to
properly debug and troubleshoot the kernel, which I'm hopefully doing now.
Anyway, here is the most recent oops, on 2.4.21 with pre4 of XFS 1.3:
---------
Unable to handle kernel paging request at virtual address 7461687b
c01d3430
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01d3430>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010a83
eax: 00000000 ebx: dedf5640 ecx: 74616863 edx: 72757465
esi: c15aff18 edi: c15aff14 ebp: c15aff10 esp: c15afeec
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 5, stackpage=c15af000)
Stack: c124e974 d44dae20 00003479 c03003d4 c01d35c5 c124e974 c15aff10 c15aff14
       c15aff18 00000000 00000001 00000000 c124e974 000001d0 c013b611 c124e974
       000001d0 00000000 c124e974 c0130c27 c124e974 000001d0 c15ae000 000001fc
Call Trace: [<c01d35c5>] [<c013b611>] [<c0130c27>] [<c0130e61>] [<c0130ed6>]
  [<c0130ff4>] [<c0131059>] [<c013116d>] [<c01310e0>] [<c0105000>] [<c010590b>]
  [<c01310e0>]
Code: 8b 51 18 89 d0 83 e0 11 48 74 25 f6 c6 10 74 12 c7 06 01 00
>>EIP; c01d3430 <count_page_state+30/70> <=====
Trace; c01d35c5 <linvfs_release_page+35/90>
Trace; c013b611 <try_to_release_page+51/70>
Trace; c0130c27 <shrink_cache+2b7/390>
Trace; c0130e61 <shrink_caches+61/a0>
Trace; c0130ed6 <try_to_free_pages_zone+36/60>
Trace; c0130ff4 <kswapd_balance_pgdat+54/a0>
Trace; c0131059 <kswapd_balance+19/30>
Trace; c013116d <kswapd+8d/b0>
Trace; c01310e0 <kswapd+0/b0>
Trace; c0105000 <_stext+0/0>
Trace; c010590b <arch_kernel_thread+2b/40>
Trace; c01310e0 <kswapd+0/b0>
Code; c01d3430 <count_page_state+30/70>
00000000 <_EIP>:
Code; c01d3430 <count_page_state+30/70> <=====
0: 8b 51 18 mov 0x18(%ecx),%edx <=====
Code; c01d3433 <count_page_state+33/70>
3: 89 d0 mov %edx,%eax
Code; c01d3435 <count_page_state+35/70>
5: 83 e0 11 and $0x11,%eax
Code; c01d3438 <count_page_state+38/70>
8: 48 dec %eax
Code; c01d3439 <count_page_state+39/70>
9: 74 25 je 30 <_EIP+0x30> c01d3460 <count_page_state+60/70>
Code; c01d343b <count_page_state+3b/70>
b: f6 c6 10 test $0x10,%dh
Code; c01d343e <count_page_state+3e/70>
e: 74 12 je 22 <_EIP+0x22> c01d3452 <count_page_state+52/70>
Code; c01d3440 <count_page_state+40/70>
10: c7 06 01 00 00 00 movl $0x1,(%esi)
----------
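One observation on the dump above, offered as a reading of the numbers rather than a diagnosis: the faulting instruction is `mov 0x18(%ecx),%edx`, and ecx (74616863) plus 0x18 is exactly the bad address 7461687b. Both ecx and edx also happen to decode as printable ASCII, which may hint that the pointer was overwritten with string data. A minimal Python sketch of that check follows; the helper name `reg_as_ascii` is made up for illustration, and the register values are copied from the oops.

```python
import struct

def reg_as_ascii(value):
    """Render a 32-bit register value as the 4 bytes it would occupy
    in little-endian (i386) memory, with '.' for non-printable bytes."""
    raw = struct.pack("<I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# Values copied from the oops above.
ecx = 0x74616863
edx = 0x72757465
fault_addr = 0x7461687b

# The faulting instruction dereferences ecx + 0x18.
assert ecx + 0x18 == fault_addr

print("ecx as text:", reg_as_ascii(ecx))
print("edx as text:", reg_as_ascii(edx))
```

If the same text-like values show up in oopses from other kernel/XFS combinations, that might help narrow down what is scribbling over the page's buffer list.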
Issues similar to the above started back in January on a Tyan Thunder
K7X (S2468ugn) with 2.4.18 and XFS 1.2. I've tried every kernel release
with every possible XFS release, and the problem remains. I've since
replaced both CPUs, all the memory (and memtested the new sticks), and
put in a new motherboard from a different manufacturer (an Asus dual AMD board).
I've also been running 2.6.0-test1 and test2 with XFS enabled and can
still trigger the kernel panic. When not under high load, the
machine runs fine, and when I run a non-XFS kernel with an ext3 FS on
the same hardware, everything works fine as well.
I could give more detail, but hopefully this is enough to get the ball
rolling. If I can provide more info, or oops reports from other
kernel/XFS versions, let me know...
Thanks for a great product!
Dan
http://five2one.org/