Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 21 May 2003 14:03:47 -0700 (PDT)
Received: from rj.sgi.com (rj.SGI.COM [192.82.208.96]) by oss.sgi.com (8.12.9/8.12.9) with SMTP id h4LL3A2x022039 for ; Wed, 21 May 2003 14:03:13 -0700
Received: from ledzep.americas.sgi.com (ledzep.americas.sgi.com [192.48.203.134]) by rj.sgi.com (8.12.9/8.12.2/linux-outbound_gateway-1.2) with ESMTP id h4LL34E0024723 for ; Wed, 21 May 2003 14:03:04 -0700
Received: from daisy-e236.americas.sgi.com (daisy-e236.americas.sgi.com [128.162.236.214]) by ledzep.americas.sgi.com (8.12.9/americas-smart-nospam1.1) with ESMTP id h4LL32a224405374; Wed, 21 May 2003 16:03:02 -0500 (CDT)
Received: from jen.americas.sgi.com (jen.americas.sgi.com [128.162.232.100]) by daisy-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id h4LL33Rn82127415; Wed, 21 May 2003 16:03:03 -0500 (CDT)
Received: by jen.americas.sgi.com (8.11.6/SGI-client-1.7) id h4LL32A23325; Wed, 21 May 2003 16:03:02 -0500
Subject: Re: ooops report 2.4.20-xfs-CVS-2003-02-21_06:00_UTC
From: Steve Lord
To: "Jeffrey E. Hundstad"
Cc: linux-xfs@oss.sgi.com
In-Reply-To: <3ECBB68E.2030306@mnsu.edu>
References: <3ECBB68E.2030306@mnsu.edu>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Organization:
Message-Id: <1053550982.21472.0.camel@jen.americas.sgi.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.4
Date: 21 May 2003 16:03:02 -0500
X-archive-position: 4101
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: lord@sgi.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 4523
Lines: 127

On Wed, 2003-05-21 at 12:25, Jeffrey E. Hundstad wrote:
> Hello,

There have been major changes in the sync code in the last few weeks
(after the code you are running). Please try a new kernel from CVS.

Steve

>
> We had a machine hang on us twice since upgrading from 2.4.18-xfs. It
> ran for months without any problems.
> We upgraded to
> 2.4.20-xfs-CVS-2003-02-21_06:00_UTC and patched it for the ptrace
> vulnerability. When the machine hangs it has a VERY high load average
> (~200), and as soon as you issue a disk I/O request of any kind your
> process hangs. The hang has been happening during a cpio backup of the
> system.
>
> This last time we caught some/most of an oops. I'll include the
> ksymoops output and the raw input. I think we may be able to glean more
> information from the report, but I'm not really familiar with the oops
> output. Let me know if I can do anything to make the report better.
>
> Here's some info about the computer.
>
> CPU: Pentium III (Coppermine) x 2 x 667 MHz
> Compiler: gcc version 2.95.4 20011002 (Debian prerelease)
> Drives: software raid1 - 2xMaxtor 4G160J8-ide drives (with the write
> cache turned off)
>
> ksymoops'ed report:
>
> ksymoops 2.4.9 on i686 2.4.20-xfs. Options used
> -V (default)
> -k /var/log/ksymoops/20030521062552.ksyms (specified)
> -l /var/log/ksymoops/20030521062552.modules (specified)
> -o /lib/modules/2.4.20-xfs (specified)
> -m /boot/System.map-2.4.20-xfs (specified)
>
> Code: 39 70 30 0f 85 9d 00 00 00 39 78 34 0f 85 94 00 00 00 8b 50
> Using defaults from ksymoops -t elf32-i386 -a i386
>
>
> Code; 00000000 Before first symbol
> 00000000 <_EIP>:
> Code; 00000000 Before first symbol
> 0: 39 70 30 cmp %esi,0x30(%eax)
> Code; 00000003 Before first symbol
> 3: 0f 85 9d 00 00 00 jne a6 <_EIP+0xa6> 000000a6 Before first symbol
> Code; 00000009 Before first symbol
> 9: 39 78 34 cmp %edi,0x34(%eax)
> Code; 0000000c Before first symbol
> c: 0f 85 94 00 00 00 jne a6 <_EIP+0xa6> 000000a6 Before first symbol
> Code; 00000012 Before first symbol
> 12: 8b 50 00 mov 0x0(%eax),%edx
>
> <1> unable to handle kernel paging request at virtual address a4446e2b
> c01ce450
> *pde = 00000000
> CPU: 0
> EIP: 0010:[] Not tainted
> EFLAGS: 00010286
> Stack: ded82c00 ded82c48 00000000
> 00000000
> d6054840
> Call Trace: [] [] []
> [] []
> [] []
> Code: 83 7b 14 00 0f 84 79 07 00 00 8b 6b 1c 85 ed 0f 85 db 00 00
>
>
> >>EIP; c01ce450 <=====
>
> Trace; c01d3900
> Trace; c01ce315
> Trace; c01e0c73
> Trace; c013d5da
> Trace; c013c7b0
> Trace; c013ca9a
> Trace; c0107134
>
> Code; c01ce450
> 00000000 <_EIP>:
> Code; c01ce450 <=====
> 0: 83 7b 14 00 cmpl $0x0,0x14(%ebx) <=====
> Code; c01ce454
> 4: 0f 84 79 07 00 00 je 783 <_EIP+0x783> c01cebd3
>
> Code; c01ce45a
> a: 8b 6b 1c mov 0x1c(%ebx),%ebp
> Code; c01ce45d
> d: 85 ed test %ebp,%ebp
> Code; c01ce45f
> f: 0f 85 db 00 00 00 jne f0 <_EIP+0xf0> c01ce540
>
>
>
> Raw oops, hand-copied from screen:
>
> Code: 39 70 30 0f 85 9d 00 00 00 39 78 34 0f 85 94 00 00 00 8b 50
> <1> unable to handle kernel paging request at virtual address a4446e2b
> printing eip:
> c01ce450
> *pde = 00000000
> Ooops: 0000
> st sg appletalk eepro100 mii lvm-mod raid5 xor raid1 raid0 linear md aic7xxx
> CPU: 0
> EIP: 0010:[] Not tainted
> EFLAGS: 00010286
> eax:
> es:
> ds:
> Process kupdated (pid: 7 stackpage = c1621000
> Stack: ded82c00 ded82c48 00000000
> 00000000
> d6054840
> Call Trace: [] [] []
> [] []
> [] []
> Code: 83 7b 14 00 0f 84 79 07 00 00 8b 6b 1c 85 ed 0f 85 db 00 00

-- 
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: lord@sgi.com
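
The Trace; lines in the quoted report carry only bare addresses, with no symbol names. Below is a minimal sketch of how such addresses could be mapped back to the nearest preceding symbol. It assumes a System.map in the usual "address type name" layout, like the file passed to ksymoops with -m. The map entries and symbol names in it are hypothetical placeholders, not values from this machine; only the two addresses come from the report.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue  # first field is not a hex address
        symbols.append((addr, parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) of the nearest symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies before the first symbol
    base, name = symbols[i]
    return name, addr - base


# Hypothetical map entries, for illustration only (not from the real kernel).
SAMPLE_MAP = """\
c0100000 T example_start
c01ce000 T example_func_a
c01d3000 T example_func_b
""".splitlines()

# The EIP and the first Trace; address from the report.
TRACE = [0xC01CE450, 0xC01D3900]

symbols = load_system_map(SAMPLE_MAP)
for addr in TRACE:
    print(hex(addr), resolve(symbols, addr))
```

With the real /boot/System.map-2.4.20-xfs read in place of SAMPLE_MAP, the same lookup would give a symbol+offset pair for each address. Addresses inside loadable modules are not covered by System.map and would need the saved ksyms file instead.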