Daryl,
We tried setting up something similar in our lab.
We ended up with a big mess as well.
We decided that what we really needed to do was ignore that Promise Controller
and go get a 3ware controller.
They are very reasonably priced. The 3ware controller is supported in the base
kernel without any patches.
It is also certified by Red Hat.
To be honest, we have not actually done this yet, but all indications are that
this is the way to go.
Greg
>> Hi,
>> I have a box that had been running RH 7.1 + XFS for a year without
>> a single problem. Recently, I put 3 120G Maxtor 4G120J6 drives on the
>> onboard Promise Controller (pdc202xx) and did software RAID 5. And wow,
>> have the problems started! I went from running a 2.4.3 something kernel
>> to a custom compiled 2.4.18 w/ SGI patch dated March 3. Still no luck.
>> It seems that these problems start occurring under heavy NFS load.
>> Any thoughts? I have been trying to follow the ongoing NFS +
>> XFS + RAID 5 discussions, but I am not sure where I should be at
>> regarding kernel + patches.
>> My ksymoops is below.
>> Thanks! Daryl
>> Other info that may be helpful.
>> # lspci
>> 00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
>> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
>> 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
>> 00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
>> 00:08.0 VGA compatible controller: Trident Microsystems Blade 3D PCI/AGP (rev 3a)
>> 00:0c.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
>> 00:0f.0 RAID bus controller: Promise Technology, Inc. 20265 (rev 02)
>> # free
>>              total       used       free     shared    buffers     cached
>> Mem:        900644     897436       3208          0          0     842932
>> -/+ buffers/cache:      54504     846140
>> Swap:      1028152          0    1028152
>> # cat /proc/ide/pdc202xx
>> PDC20265 Chipset.
>> ------------------------------- General Status ---------------------------------
>> Burst Mode : enabled
>> Host Mode : Normal
>> Bus Clocking : 33 PCI Internal
>> IO pad select : 10 mA
>> Status Polling Period : 0
>> Interrupt Check Status Polling Delay : 2
>> --------------- Primary Channel ---------------- Secondary Channel -------------
>> enabled enabled
>> 66 Clocking enabled enabled
>> Mode MASTER Mode MASTER
>> FIFO Empty ????????????
>> --------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
>> DMA enabled:    no               yes             yes               yes
>> DMA Mode:       NOTSET           UDMA 4          UDMA 4            UDMA 4
>> PIO Mode:       NOTSET           PIO ?           PIO ?             PIO ?
>> Here is my ksymoops from this morning's crash.
>> >>EIP; c013f736 <iput+26/1a0> <=====
>> Trace; c013d70c <prune_dcache+cc/130>
>> Trace; c0126e78 <kmem_find_general_cachep+cf8/1760>
>> Trace; c013da00 <shrink_dcache_parent+50/60>
>> Trace; c0127037 <kmem_find_general_cachep+eb7/1760>
>> Trace; c012708c <kmem_find_general_cachep+f0c/1760>
>> Trace; c0127141 <kmem_find_general_cachep+fc1/1760>
>> Trace; c01271b6 <kmem_find_general_cachep+1036/1760>
>> Trace; c0127311 <kmem_find_general_cachep+1191/1760>
>> Trace; c0127270 <kmem_find_general_cachep+10f0/1760>
>> Trace; c0105000 <gdt+4dcc/4f4c>
>> Trace; c0105536 <kernel_thread+26/1d0>
>> Trace; c0127270 <kmem_find_general_cachep+10f0/1760>
>> Code; c013f736 <iput+26/1a0>
>> 00000000 <_EIP>:
>> Code; c013f736 <iput+26/1a0> <=====
>> 0: 8b 46 20 mov 0x20(%esi),%eax <=====
>> Code; c013f739 <iput+29/1a0>
>> 3: 85 c0 test %eax,%eax
>> Code; c013f73b <iput+2b/1a0>
>> 5: 0f 45 f8 cmovne %eax,%edi
>> Code; c013f73e <iput+2e/1a0>
>> 8: 85 ff test %edi,%edi
>> Code; c013f740 <iput+30/1a0>
>> a: 74 0b je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
>> Code; c013f742 <iput+32/1a0>
>> c: 8b 47 10 mov 0x10(%edi),%eax
>> Code; c013f745 <iput+35/1a0>
>> f: 85 c0 test %eax,%eax
>> Code; c013f747 <iput+37/1a0>
>> 11: 74 04 je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
>> Code; c013f749 <iput+39/1a0>
>> 13: 53 push %ebx
>> And then the oops that immediately followed:
>> >>EIP; c013f736 <iput+26/1a0> <=====
>> Trace; c013d70c <prune_dcache+cc/130>
>> Trace; c0126e78 <kmem_find_general_cachep+cf8/1760>
>> Trace; c013da00 <shrink_dcache_parent+50/60>
>> Trace; c0127037 <kmem_find_general_cachep+eb7/1760>
>> Trace; c012708c <kmem_find_general_cachep+f0c/1760>
>> Trace; c0127951 <_alloc_pages+71/1c0>
>> Trace; c0127bbb <__alloc_pages+11b/180>
>> Trace; c0127c30 <__get_free_pages+10/20>
>> Trace; c0139d73 <__pollwait+33/1040>
>> Trace; c025137e <ip_cmsg_recv+199e/182c0>
>> Trace; c023abdf <sock_recvmsg+3df/640>
>> Trace; c0139fcb <__pollwait+28b/1040>
>> Trace; c013a43c <__pollwait+6fc/1040>
>> Trace; c0106cfb <__up_wakeup+110f/23e4>
>> Code; c013f736 <iput+26/1a0>
>> 00000000 <_EIP>:
>> Code; c013f736 <iput+26/1a0> <=====
>> 0: 8b 46 20 mov 0x20(%esi),%eax <=====
>> Code; c013f739 <iput+29/1a0>
>> 3: 85 c0 test %eax,%eax
>> Code; c013f73b <iput+2b/1a0>
>> 5: 0f 45 f8 cmovne %eax,%edi
>> Code; c013f73e <iput+2e/1a0>
>> 8: 85 ff test %edi,%edi
>> Code; c013f740 <iput+30/1a0>
>> a: 74 0b je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
>> Code; c013f742 <iput+32/1a0>
>> c: 8b 47 10 mov 0x10(%edi),%eax
>> Code; c013f745 <iput+35/1a0>
>> f: 85 c0 test %eax,%eax
>> Code; c013f747 <iput+37/1a0>
>> 11: 74 04 je 17 <_EIP+0x17> c013f74d <iput+3d/1a0>
>> Code; c013f749 <iput+39/1a0>
>> 13: 53 push %ebx
>> --
>> /**
>> * Daryl Herzmann (akrherz@xxxxxxxxxxx)
>> * Program Assistant -- Iowa Environmental Mesonet
>> * http://mesonet.agron.iastate.edu
>> */
Greg Freemyer
Internet Engineer
Deployment and Integration Specialist
The Norcross Group
www.NorcrossGroup.com