Hi,
I have a box that had been running RH 7.1 + XFS for a year without
a single problem. Recently, I put three 120G Maxtor 4G120J6 drives on
the onboard Promise controller (pdc202xx) and set up software RAID 5.
And wow, have the problems started! I went from running a 2.4.3-something
kernel to a custom-compiled 2.4.18 with the SGI patch dated March 3.
Still no luck.
It seems that these problems start occurring under heavy NFS load.
Any thoughts? I have been trying to follow the ongoing NFS +
XFS + RAID 5 discussions, but I am not sure which kernel + patches
I should be running.
My ksymoops is below.
Thanks! Daryl
Other info that may be helpful.
# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:08.0 VGA compatible controller: Trident Microsystems Blade 3D PCI/AGP (rev 3a)
00:0c.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0f.0 RAID bus controller: Promise Technology, Inc. 20265 (rev 02)
# free
             total       used       free     shared    buffers     cached
Mem:        900644     897436       3208          0          0     842932
-/+ buffers/cache:      54504     846140
Swap:      1028152          0    1028152
# cat /proc/ide/pdc202xx
                                PDC20265 Chipset.
------------------------------- General Status ---------------------------------
Burst Mode                           : enabled
Host Mode                            : Normal
Bus Clocking                         : 33 PCI Internal
IO pad select                        : 10 mA
Status Polling Period                : 0
Interrupt Check Status Polling Delay : 2
--------------- Primary Channel ---------------- Secondary Channel -------------
                enabled                          enabled
66 Clocking     enabled                          enabled
           Mode MASTER                      Mode MASTER
                FIFO Empty                       ????????????
--------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
DMA enabled:    no               yes             yes               yes
DMA Mode:       NOTSET           UDMA 4          UDMA 4            UDMA 4
PIO Mode:       NOTSET           PIO ?           PIO ?             PIO ?
Here is my ksymoops from this morning's crash.
>>EIP; c013f736 <iput+26/1a0> <=====
Trace; c013d70c <prune_dcache+cc/130>
Trace; c0126e78 <kmem_find_general_cachep+cf8/1760>
Trace; c013da00 <shrink_dcache_parent+50/60>
Trace; c0127037 <kmem_find_general_cachep+eb7/1760>
Trace; c012708c <kmem_find_general_cachep+f0c/1760>
Trace; c0127141 <kmem_find_general_cachep+fc1/1760>
Trace; c01271b6 <kmem_find_general_cachep+1036/1760>
Trace; c0127311 <kmem_find_general_cachep+1191/1760>
Trace; c0127270 <kmem_find_general_cachep+10f0/1760>
Trace; c0105000 <gdt+4dcc/4f4c>
Trace; c0105536 <kernel_thread+26/1d0>
Trace; c0127270 <kmem_find_general_cachep+10f0/1760>
Code; c013f736 <iput+26/1a0>
00000000 <_EIP>:
Code; c013f736 <iput+26/1a0> <=====
0: 8b 46 20 mov 0x20(%esi),%eax <=====
Code; c013f739 <iput+29/1a0>
3: 85 c0 test %eax,%eax
Code; c013f73b <iput+2b/1a0>
5: 0f 45 f8 cmovne %eax,%edi
Code; c013f73e <iput+2e/1a0>
8: 85 ff test %edi,%edi
Code; c013f740 <iput+30/1a0>
a: 74 0b je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
Code; c013f742 <iput+32/1a0>
c: 8b 47 10 mov 0x10(%edi),%eax
Code; c013f745 <iput+35/1a0>
f: 85 c0 test %eax,%eax
Code; c013f747 <iput+37/1a0>
11: 74 04 je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
Code; c013f749 <iput+39/1a0>
13: 53 push %ebx
And then the oops that immediately followed:
>>EIP; c013f736 <iput+26/1a0> <=====
Trace; c013d70c <prune_dcache+cc/130>
Trace; c0126e78 <kmem_find_general_cachep+cf8/1760>
Trace; c013da00 <shrink_dcache_parent+50/60>
Trace; c0127037 <kmem_find_general_cachep+eb7/1760>
Trace; c012708c <kmem_find_general_cachep+f0c/1760>
Trace; c0127951 <_alloc_pages+71/1c0>
Trace; c0127bbb <__alloc_pages+11b/180>
Trace; c0127c30 <__get_free_pages+10/20>
Trace; c0139d73 <__pollwait+33/1040>
Trace; c025137e <ip_cmsg_recv+199e/182c0>
Trace; c023abdf <sock_recvmsg+3df/640>
Trace; c0139fcb <__pollwait+28b/1040>
Trace; c013a43c <__pollwait+6fc/1040>
Trace; c0106cfb <__up_wakeup+110f/23e4>
Code; c013f736 <iput+26/1a0>
00000000 <_EIP>:
Code; c013f736 <iput+26/1a0> <=====
0: 8b 46 20 mov 0x20(%esi),%eax <=====
Code; c013f739 <iput+29/1a0>
3: 85 c0 test %eax,%eax
Code; c013f73b <iput+2b/1a0>
5: 0f 45 f8 cmovne %eax,%edi
Code; c013f73e <iput+2e/1a0>
8: 85 ff test %edi,%edi
Code; c013f740 <iput+30/1a0>
a: 74 0b je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
Code; c013f742 <iput+32/1a0>
c: 8b 47 10 mov 0x10(%edi),%eax
Code; c013f745 <iput+35/1a0>
f: 85 c0 test %eax,%eax
Code; c013f747 <iput+37/1a0>
11: 74 04 je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
Code; c013f749 <iput+39/1a0>
13: 53 push %ebx
--
/**
* Daryl Herzmann (akrherz@xxxxxxxxxxx)
* Program Assistant -- Iowa Environmental Mesonet
* http://mesonet.agron.iastate.edu
*/