
Re: Oops 2.4.18 + RAID5

To: Daryl Herzmann <akrherz@xxxxxxxxxxx>
Subject: Re: Oops 2.4.18 + RAID5
From: Simon Matter <simon.matter@xxxxxxxxxxxxxxxx>
Date: Wed, 10 Apr 2002 20:18:55 +0200
Cc: linux-xfs@xxxxxxxxxxx
References: <Pine.LNX.4.42.0204101201051.15039-100000@akrherz.agron.iastate.edu>
Sender: owner-linux-xfs@xxxxxxxxxxx
I've been running a similar server for almost a year now.
It's RH 7.1 + XFS, Promise Ultra100TX2 with 4 Quantum 15G drives,
software RAID5. I have a second server, RH 7.2 + XFS, Promise
Ultra100TX2 with 4 IBM 60G drives, software RAID5.

No problems so far, apart from 3 of 8 IBM drives dying, but that's
another story.
Can you try a current RH-based kernel with XFS? At least for me it has
always worked very well.

ftp://oss.sgi.com/projects/xfs/download/testing/xfs-1.1/kernel_rpms/2.4.9-31-RH/

This is from the small server:

[root@xxl pub]# cat /proc/ide/pdc202xx
 
                                PDC20268 TX2 Chipset.
------------------------------- General Status ---------------------------------
Burst Mode                           : enabled
Host Mode                            : Tri-Stated
Bus Clocking                         : 100 External
IO pad select                        : 10 mA
Status Polling Period                : 15
Interrupt Check Status Polling Delay : 15
--------------- Primary Channel ---------------- Secondary Channel -------------
                enabled                          enabled
66 Clocking     enabled                          enabled
           Mode MASTER                      Mode MASTER
--------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
DMA enabled:    yes              yes             yes               yes
--------------- Cannot Decode HOST ---------------
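When comparing two boxes like these, it can help to pull the "DMA enabled:" row out of the pdc202xx output with a script. The following is only a minimal sketch: the function name and the slot labels are my own, and it assumes the row always lists the four positions in the order primary drive0, primary drive1, secondary drive0, secondary drive1, as in the output shown in this thread. A "no" is not necessarily a fault; in the output quoted further down, the first position reports "no" with DMA and PIO modes NOTSET, which may simply be an empty slot.

```python
# Hypothetical helper: map each channel/drive slot to its "DMA enabled" flag.
# Slot order is an assumption taken from the column headers in the output above.
SLOTS = [
    "primary/drive0",
    "primary/drive1",
    "secondary/drive0",
    "secondary/drive1",
]


def dma_flags(text):
    """Return {slot: bool} from the text of /proc/ide/pdc202xx, or {} if absent."""
    for line in text.splitlines():
        # Tolerate mail-style quoting ("> ") in front of the row.
        line = line.lstrip("> ")
        if line.startswith("DMA enabled:"):
            values = line.split(":", 1)[1].split()
            return dict(zip(SLOTS, (v == "yes" for v in values)))
    return {}
```

Calling `dma_flags(open("/proc/ide/pdc202xx").read())` on a machine with this driver should return one boolean per slot; any slot that is populated but reports False would be worth a closer look.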

[root@xxl pub]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3] (rev 04)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C586/A/B PCI-to-ISA [Apollo VP] (rev 47)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.3 Host bridge: VIA Technologies, Inc. VT82C586B ACPI (rev 10)
00:08.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 64)
00:09.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:0a.0 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 4d68 (rev 01)
00:0b.0 SCSI storage controller: Adaptec AIC-7881U
01:00.0 VGA compatible controller: S3 Inc. Savage 4 (rev 03)

[root@xxl pub]# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md8 : active raid5 hde10[0] hdf7[2] hdg10[1] hdh7[3]
      31406976 blocks level 5, 128k chunk, algorithm 0 [4/4] [UUUU]
 
md7 : active raid1 hdf6[0] hdh6[1]
      1024000 blocks [2/2] [UU]
 
md5 : active raid1 hdf5[0] hdh5[1]
      3072256 blocks [2/2] [UU]
 
md0 : active raid1 hde1[0] hdf1[1] hdg1[2] hdh1[3]
      102720 blocks [4/4] [UUUU]
 
md6 : active raid1 hde9[0] hdg9[1]
      1024000 blocks [2/2] [UU]
 
md4 : active raid1 hde8[0] hdg8[1]
      511936 blocks [2/2] [UU]
 
md3 : active raid1 hde7[0] hdg7[1]
      511936 blocks [2/2] [UU]
 
md2 : active raid1 hde6[0] hdg6[1]
      1755328 blocks [2/2] [UU]
 
md1 : active raid1 hde5[0] hdg5[1]
      292672 blocks [2/2] [UU]
 
unused devices: <none>
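
For anyone who wants to check arrays like these from a script rather than by eye, here is a minimal sketch that parses /proc/mdstat text in the 2.4-era layout shown above. The function names are my own, and the regular expressions are assumptions based only on this output; other kernel versions may format the file differently. In the `[4/4] [UUUU]` field, the first number is how many devices the array should have, the second is how many are in use, and an underscore in place of a U marks a missing member.

```python
import re

# Patterns are assumptions derived from the /proc/mdstat output shown above.
ARRAY_LINE = re.compile(r"^(md\d+) : active (\w+) (.*)$")
STATUS_LINE = re.compile(r"(\d+) blocks.*\[(\d+)/(\d+)\] \[([U_]+)\]")


def parse_mdstat(text):
    """Return {array: info} with level, devices, blocks and degraded flag."""
    arrays = {}
    current = None
    for line in text.splitlines():
        head = ARRAY_LINE.match(line)
        if head:
            current = head.group(1)
            arrays[current] = {
                "level": head.group(2),
                # "hde10[0]" -> "hde10"
                "devices": re.findall(r"(\w+)\[\d+\]", head.group(3)),
            }
            continue
        status = STATUS_LINE.search(line)
        if status and current:
            expected, active = int(status.group(2)), int(status.group(3))
            arrays[current].update(
                blocks=int(status.group(1)),
                expected=expected,
                active=active,
                degraded=active < expected or "_" in status.group(4),
            )
            current = None
    return arrays


def raid5_member_blocks(info):
    """RAID5 keeps one member's worth of parity: usable = (n - 1) * member."""
    return info["blocks"] // (info["expected"] - 1)
```

For md8 above, 31406976 blocks across 4 members works out to 10468992 blocks per member, with one member's worth of space going to parity.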



Daryl Herzmann wrote:
> 
> Hi,
>         I have a box that had been running RH 7.1 + XFS for a year without
> a single problem.  Recently, I put 3 120G Maxtor 4G120J6 drives on the
> onboard Promise controller (pdc202xx) and set up software RAID 5.  And wow,
> have the problems started!  I went from running a 2.4.3-something kernel
> to a custom-compiled 2.4.18 w/ the SGI patch dated March 3.  Still no luck.
> The problems seem to start occurring under heavy NFS load.
> 
>         Any thoughts?  I have been trying to follow the ongoing NFS +
> XFS + RAID 5 discussions, but I am not sure where I should be at
> regarding kernel + patches.
> 
> My ksymoops is below.
> 
> Thanks! Daryl
> 
> Other info that may be helpful.
> 
> # lspci
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
> 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
> 00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
> 00:08.0 VGA compatible controller: Trident Microsystems Blade 3D PCI/AGP (rev 3a)
> 00:0c.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
> 00:0f.0 RAID bus controller: Promise Technology, Inc. 20265 (rev 02)
> 
> # free
>              total       used       free     shared    buffers     cached
> Mem:        900644     897436       3208          0          0     842932
> -/+ buffers/cache:      54504     846140
> Swap:      1028152          0    1028152
> 
> # cat /proc/ide/pdc202xx
> 
>                                 PDC20265 Chipset.
> ------------------------------- General Status ---------------------------------
> Burst Mode                           : enabled
> Host Mode                            : Normal
> Bus Clocking                         : 33 PCI Internal
> IO pad select                        : 10 mA
> Status Polling Period                : 0
> Interrupt Check Status Polling Delay : 2
> --------------- Primary Channel ---------------- Secondary Channel -------------
>                 enabled                          enabled
> 66 Clocking     enabled                          enabled
>            Mode MASTER                      Mode MASTER
>                 FIFO Empty                       ????????????
> --------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
> DMA enabled:    no               yes             yes               yes
> DMA Mode:       NOTSET           UDMA 4          UDMA 4            UDMA 4
> PIO Mode:       NOTSET            PIO ?           PIO ?            PIO ?
> 
> Here is my ksymoops from this morning's crash.
> 
> >>EIP; c013f736 <iput+26/1a0>   <=====
> Trace; c013d70c <prune_dcache+cc/130>
> Trace; c0126e78 <kmem_find_general_cachep+cf8/1760>
> Trace; c013da00 <shrink_dcache_parent+50/60>
> Trace; c0127037 <kmem_find_general_cachep+eb7/1760>
> Trace; c012708c <kmem_find_general_cachep+f0c/1760>
> Trace; c0127141 <kmem_find_general_cachep+fc1/1760>
> Trace; c01271b6 <kmem_find_general_cachep+1036/1760>
> Trace; c0127311 <kmem_find_general_cachep+1191/1760>
> Trace; c0127270 <kmem_find_general_cachep+10f0/1760>
> Trace; c0105000 <gdt+4dcc/4f4c>
> Trace; c0105536 <kernel_thread+26/1d0>
> Trace; c0127270 <kmem_find_general_cachep+10f0/1760>
> Code;  c013f736 <iput+26/1a0>
> 00000000 <_EIP>:
> Code;  c013f736 <iput+26/1a0>   <=====
>    0:   8b 46 20                  mov    0x20(%esi),%eax   <=====
> Code;  c013f739 <iput+29/1a0>
>    3:   85 c0                     test   %eax,%eax
> Code;  c013f73b <iput+2b/1a0>
>    5:   0f 45 f8                  cmovne %eax,%edi
> Code;  c013f73e <iput+2e/1a0>
>    8:   85 ff                     test   %edi,%edi
> Code;  c013f740 <iput+30/1a0>
>    a:   74 0b                     je     17 <_EIP+0x17> c013f74d <iput+3d/1a0>
> Code;  c013f742 <iput+32/1a0>
>    c:   8b 47 10                  mov    0x10(%edi),%eax
> Code;  c013f745 <iput+35/1a0>
>    f:   85 c0                     test   %eax,%eax
> Code;  c013f747 <iput+37/1a0>
>   11:   74 04                     je     17 <_EIP+0x17> c013f74d <iput+3d/1a0>
> Code;  c013f749 <iput+39/1a0>
>   13:   53                        push   %ebx
> 
> and then the oops that immediately followed:
> 
> >>EIP; c013f736 <iput+26/1a0>   <=====
> Trace; c013d70c <prune_dcache+cc/130>
> Trace; c0126e78 <kmem_find_general_cachep+cf8/1760>
> Trace; c013da00 <shrink_dcache_parent+50/60>
> Trace; c0127037 <kmem_find_general_cachep+eb7/1760>
> Trace; c012708c <kmem_find_general_cachep+f0c/1760>
> Trace; c0127951 <_alloc_pages+71/1c0>
> Trace; c0127bbb <__alloc_pages+11b/180>
> Trace; c0127c30 <__get_free_pages+10/20>
> Trace; c0139d73 <__pollwait+33/1040>
> Trace; c025137e <ip_cmsg_recv+199e/182c0>
> Trace; c023abdf <sock_recvmsg+3df/640>
> Trace; c0139fcb <__pollwait+28b/1040>
> Trace; c013a43c <__pollwait+6fc/1040>
> Trace; c0106cfb <__up_wakeup+110f/23e4>
> Code;  c013f736 <iput+26/1a0>
> 00000000 <_EIP>:
> Code;  c013f736 <iput+26/1a0>   <=====
>    0:   8b 46 20                  mov    0x20(%esi),%eax   <=====
> Code;  c013f739 <iput+29/1a0>
>    3:   85 c0                     test   %eax,%eax
> Code;  c013f73b <iput+2b/1a0>
>    5:   0f 45 f8                  cmovne %eax,%edi
> Code;  c013f73e <iput+2e/1a0>
>    8:   85 ff                     test   %edi,%edi
> Code;  c013f740 <iput+30/1a0>
>    a:   74 0b                     je     17 <_EIP+0x17> c013f74d <iput+3d/1a0>
> Code;  c013f742 <iput+32/1a0>
>    c:   8b 47 10                  mov    0x10(%edi),%eax
> Code;  c013f745 <iput+35/1a0>
>    f:   85 c0                     test   %eax,%eax
> Code;  c013f747 <iput+37/1a0>
>   11:   74 04                     je     17 <_EIP+0x17> c013f74d <iput+3d/1a0>
> Code;  c013f749 <iput+39/1a0>
>   13:   53                        push   %ebx
> 
> --
> /**
>  * Daryl Herzmann (akrherz@xxxxxxxxxxx)
>  * Program Assistant -- Iowa Environmental Mesonet
>  * http://mesonet.agron.iastate.edu
>  */
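
Both oopses quoted above report the same EIP, c013f736 <iput+26/1a0>, reached through prune_dcache. When there are more than two reports to compare, a small script can do this instead of reading them side by side. The following is a minimal sketch only: the function names are my own, and the regular expression assumes the ">>EIP;" and "Trace;" line layout seen in the ksymoops output in this thread.

```python
import re

# Matches ">>EIP; c013f736 <iput+26/1a0>" and "Trace; c013d70c <prune_dcache+cc/130>".
# Searching anywhere in the line lets mail-quoted ("> ") output through unchanged.
ENTRY = re.compile(
    r"(>>EIP|Trace);\s+([0-9a-f]{8})\s+<(\w+)\+([0-9a-f]+)/([0-9a-f]+)>"
)


def parse_oops(text):
    """Return (eip, trace); each entry is a (symbol, offset) tuple."""
    eip, trace = None, []
    for line in text.splitlines():
        m = ENTRY.search(line)
        if not m:
            continue
        kind, _addr, symbol, offset, _size = m.groups()
        entry = (symbol, int(offset, 16))
        if kind == ">>EIP":
            eip = entry
        else:
            trace.append(entry)
    return eip, trace


def same_fault(oops_a, oops_b):
    """True if both reports fault at the same symbol+offset via the same caller."""
    eip_a, trace_a = parse_oops(oops_a)
    eip_b, trace_b = parse_oops(oops_b)
    return eip_a is not None and eip_a == eip_b and trace_a[:1] == trace_b[:1]
```

Feeding the two reports from this message to `same_fault` compares only the faulting instruction and its immediate caller; the deeper frames differ between the two traces, so they are deliberately left out of the comparison.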


