From owner-lkcd@oss.sgi.com Tue Apr 18 12:27:38 2000 Received: by oss.sgi.com id ; Tue, 18 Apr 2000 12:27:29 -0700 Received: from zmamail04.zma.compaq.com ([161.114.64.104]:57349 "HELO zmamail04.zma.compaq.com") by oss.sgi.com with SMTP id ; Tue, 18 Apr 2000 12:27:17 -0700 Received: by zmamail04.zma.compaq.com (Postfix, from userid 12345) id A4B4716A; Tue, 18 Apr 2000 15:27:11 -0400 (EDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by zmamail04.zma.compaq.com (Postfix) with SMTP id CF4ED163; Tue, 18 Apr 2000 15:27:10 -0400 (EDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA09265; Tue, 18 Apr 2000 13:27:08 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA20693; Tue, 18 Apr 2000 13:27:08 -0600 Received: by compaq.com (sSMTP sendmail emulation); Tue, 18 Apr 2000 13:22:51 -0600 Content-Length: 1202 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 Date: Tue, 18 Apr 2000 13:22:51 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: lkcd@oss.sgi.com Subject: Alpha lkcd port: bzero problem Cc: axp-list@redhat.com, clinux@zk3.dec.com Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I have made my attempt at integrating the Alpha debugger from gdb-4.1.8 into lcrash. However, now when I run lcrash (directly at the command line or via a real crash), I get this in gdb: Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash map = /boot/System.map, vmdump = /dev/mem, outfile = stdout Please wait... Program received signal SIGSEGV, Segmentation fault. __bzero () at ../sysdeps/alpha/bzero.S:107 107 ../sysdeps/alpha/bzero.S: No such file or directory. 
Current language: auto; currently asm
(gdb) where
#0 __bzero () at ../sysdeps/alpha/bzero.S:107
#1 0x12002b7e8 in alloc_klib () at klib.c:191
#2 0x12002b8f0 in kl_init_klib (map=0x120164c30 "/boot/System.map",
    vmdump=0x120164f30 "/dev/mem", namelist=0x120164d30 "",
    flags=0) at klib.c:220
#3 0x1200029b0 in main (argc=1, argv=0x11ffffc28) at main.c:182

>From klib.c, alloc_klib:

static klib_t *
alloc_klib()
{
        klib_t *klp;

/*      if (klp = (klib_t *)malloc(sizeof(klib_t))) { */
                bzero(klp, sizeof(klib_t));
/*      }*/
        return(klp);
}

Does bzero work correctly on Alpha? Or are my arguments invalid?

--
http://www.bigfoot.com/~brihall
Linux Consultant

From owner-lkcd@oss.sgi.com Tue Apr 18 12:51:18 2000
Received: by oss.sgi.com id ; Tue, 18 Apr 2000 12:50:59 -0700
Received: from ztxmail04.ztx.compaq.com ([161.114.1.208]:45062 "HELO ztxmail04.ztx.compaq.com") by oss.sgi.com with SMTP id ; Tue, 18 Apr 2000 12:50:35 -0700
Received: by ztxmail04.ztx.compaq.com (Postfix, from userid 12345) id 1D01A14F; Tue, 18 Apr 2000 14:50:29 -0500 (CDT)
Received: from exctay-gh02.tay.cpqcorp.net (exctay-gh02.tay.cpqcorp.net [16.103.129.52]) by ztxmail04.ztx.compaq.com (Postfix) with ESMTP id D64B92FB; Tue, 18 Apr 2000 14:50:28 -0500 (CDT)
Received: by exctay-gh02.tay.cpqcorp.net with Internet Mail Service (5.5.2650.21) id <2X9NK22G>; Tue, 18 Apr 2000 15:50:28 -0400
Message-ID: <9996FB0C6AB3D111B9FB0000F81E38A20731EB66@lkgexc1.tay.dec.com>
From: "Evans, Tom"
To: "Hall, Brian"
Cc: axp-list@redhat.com, clinux@zk3.dec.com, "'lkcd@oss.sgi.com'"
Subject: RE: Alpha lkcd port: bzero problem
Date: Tue, 18 Apr 2000 15:50:26 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain; charset="iso-8859-1"
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

Umm your arguments are invalid.
>From klib.c, allock_klib: static klib_t * alloc_klib() { klib_t *klp; /* if (klp = (klib_t *)malloc(sizeof(klib_t))) { */ bzero(klp, sizeof(klib_t)); /* }*/ return(klp); } klp is a stack allocated pointer to a klib_t type, but that klib_t isn't allocated in the context of alloc_klib and was not passed into alloc_klib directly as an argument. So, the klp points to junk (it wasn't initialized and will just be garbage that was on the stack, on other platforms, that stack junk might just happen to be the desired klib_t, but that would just be an artifact of the calling sequence). I'm reasonably sure that bzero is fine, uncommenting the line before and after would probably help because then you would actually have a klib_t to initialize. ...tom -----Original Message----- From: Hall, Brian Sent: Tuesday, April 18, 2000 3:23 PM To: lkcd@oss.sgi.com Cc: axp-list@redhat.com; clinux@zk3.dec.com Subject: Alpha lkcd port: bzero problem I have made my attempt at integrating the Alpha debugger from gdb-4.1.8 into lcrash. However, now when I run lcrash (directly at the command line or via a real crash), I get this in gdb: Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash map = /boot/System.map, vmdump = /dev/mem, outfile = stdout Please wait... Program received signal SIGSEGV, Segmentation fault. __bzero () at ../sysdeps/alpha/bzero.S:107 107 ../sysdeps/alpha/bzero.S: No such file or directory. Current language: auto; currently asm (gdb) where #0 __bzero () at ../sysdeps/alpha/bzero.S:107 #1 0x12002b7e8 in alloc_klib () at klib.c:191 #2 0x12002b8f0 in kl_init_klib (map=0x120164c30 "/boot/System.map", vmdump=0x120164f30 "/dev/mem", namelist=0x120164d30 "", flags=0) at klib.c:220 #3 0x1200029b0 in main (argc=1, argv=0x11ffffc28) at main.c:182 >From klib.c, allock_klib: static klib_t * alloc_klib() { klib_t *klp; /* if (klp = (klib_t *)malloc(sizeof(klib_t))) { */ bzero(klp, sizeof(klib_t)); /* }*/ return(klp); } Does bzero work correctly on Alpha? 
Or are my arguments invalid?

--
http://www.bigfoot.com/~brihall
Linux Consultant

**********************************************************
To unsubscribe from this list, send mail to
majordomo@zk3.dec.com with the following text in the
*body* (*not* the subject line) of the letter:
unsubscribe clinux
**********************************************************

From owner-lkcd@oss.sgi.com Tue Apr 18 13:16:09 2000
Received: by oss.sgi.com id ; Tue, 18 Apr 2000 13:15:59 -0700
Received: from zmamail04.zma.compaq.com ([161.114.64.104]:48141 "HELO zmamail04.zma.compaq.com") by oss.sgi.com with SMTP id ; Tue, 18 Apr 2000 13:07:38 -0700
Received: by zmamail04.zma.compaq.com (Postfix, from userid 12345) id B4A8F27C; Tue, 18 Apr 2000 16:07:32 -0400 (EDT)
Received: from fluid.mro.dec.com (fluid.mro.dec.com [16.25.16.129]) by zmamail04.zma.compaq.com (Postfix) with ESMTP id 89C9E21B; Tue, 18 Apr 2000 16:07:32 -0400 (EDT)
Received: from toulouse by fluid.mro.dec.com (8.8.8/1.1.8.2/19Nov96-0448PM) id QAA23642; Tue, 18 Apr 2000 16:07:32 -0400 (EDT)
From: "Jerry Feldman"
Organization: eInfrastructure Partner Engineering
To: "'lkcd@oss.sgi.com'" , "Hall, Brian"
Date: Tue, 18 Apr 2000 16:07:58 -0400
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: RE: Alpha lkcd port: bzero problem
Cc: clinux@zk3.dec.com, "'lkcd@oss.sgi.com'"
Message-ID: <38FC885E.23644.1FC481C5@localhost>
In-reply-to: <9996FB0C6AB3D111B9FB0000F81E38A20731EB66@lkgexc1.tay.dec.com>
X-mailer: Pegasus Mail for Win32 (v3.12c)
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

The problem here is with the call to bzero, not with bzero itself. As Tom
stated, the pointer klp is uninitialized, so bzero attempts to write zeros
to whatever klp happens to point to, for a length of sizeof(klib_t). As Tom
states in his reply, uncommenting the lines before and after will improve
the situation, because klp will then point at memory returned by malloc.

On 18 Apr 2000, at 15:50, Evans, Tom wrote:

>
> Umm your arguments are invalid.
> From klib.c, allock_klib: > > static klib_t * > alloc_klib() > { > klib_t *klp; > > /* if (klp = (klib_t *)malloc(sizeof(klib_t))) { */ > bzero(klp, sizeof(klib_t)); > /* }*/ > return(klp); > } > > > klp is a stack allocated pointer to a klib_t type, but that klib_t > isn't allocated in the context of alloc_klib and was not passed > into alloc_klib directly as an argument. > > So, the klp points to junk (it wasn't initialized > and will just be garbage that was on the stack, on other platforms, > that stack junk might just happen to be the desired klib_t, > but that would just be an artifact of the calling sequence). > > I'm reasonably sure that bzero is fine, uncommenting the line before and > after > would probably help because then you would actually have a klib_t to > initialize. > > ...tom > > -----Original Message----- > From: Hall, Brian > Sent: Tuesday, April 18, 2000 3:23 PM > To: lkcd@oss.sgi.com > Cc: axp-list@redhat.com; clinux@zk3.dec.com > Subject: Alpha lkcd port: bzero problem > > > I have made my attempt at integrating the Alpha debugger from gdb-4.1.8 into > lcrash. However, now when I run lcrash (directly at the command line or via > a > real crash), I get this in gdb: > > Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash > map = /boot/System.map, vmdump = /dev/mem, outfile = stdout > > Please wait... > Program received signal SIGSEGV, Segmentation fault. > __bzero () at ../sysdeps/alpha/bzero.S:107 > 107 ../sysdeps/alpha/bzero.S: No such file or directory. 
> Current language: auto; currently asm > (gdb) where > #0 __bzero () at ../sysdeps/alpha/bzero.S:107 > #1 0x12002b7e8 in alloc_klib () at klib.c:191 > #2 0x12002b8f0 in kl_init_klib (map=0x120164c30 "/boot/System.map", > vmdump=0x120164f30 "/dev/mem", namelist=0x120164d30 "", > flags=0) at klib.c:220 > #3 0x1200029b0 in main (argc=1, argv=0x11ffffc28) at main.c:182 > > > > From klib.c, allock_klib: > > static klib_t * > alloc_klib() > { > klib_t *klp; > > /* if (klp = (klib_t *)malloc(sizeof(klib_t))) { */ > bzero(klp, sizeof(klib_t)); > /* }*/ > return(klp); > } > > Does bzero work correctly on Alpha? Or are my arguments invalid? > > > -- > http://www.bigfoot.com/~brihall > Linux Consultant > > ********************************************************** > To unsubscribe from this list, send mail to > majordomo@zk3.dec.com with the following text in the > *body* (*not* the subject line) of the letter: > unsubscribe clinux > ********************************************************** > > ********************************************************** > To unsubscribe from this list, send mail to > majordomo@zk3.dec.com with the following text in the > *body* (*not* the subject line) of the letter: > unsubscribe clinux > ********************************************************** -- Jerry Feldman Contractor, eInfrastructure Partner Engineering 508-467-4315 http://www.testdrive.compaq.com/linux/ Compaq Computer Corp. 200 Forest Street MRO1-3/F1 Marlboro, Ma. 
01752 From owner-lkcd@oss.sgi.com Wed Apr 19 14:18:24 2000 Received: by oss.sgi.com id ; Wed, 19 Apr 2000 14:18:15 -0700 Received: from ztxmail04.ztx.compaq.com ([161.114.1.208]:56327 "HELO ztxmail04.ztx.compaq.com") by oss.sgi.com with SMTP id ; Wed, 19 Apr 2000 14:17:55 -0700 Received: by ztxmail04.ztx.compaq.com (Postfix, from userid 12345) id 1D945289; Wed, 19 Apr 2000 16:17:46 -0500 (CDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by ztxmail04.ztx.compaq.com (Postfix) with SMTP id A39A52D3 for ; Wed, 19 Apr 2000 16:17:45 -0500 (CDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA16280; Wed, 19 Apr 2000 15:17:45 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA25866; Wed, 19 Apr 2000 15:17:44 -0600 Received: by compaq.com (sSMTP sendmail emulation); Wed, 19 Apr 2000 15:13:20 -0600 Content-Length: 1346 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 Date: Wed, 19 Apr 2000 15:13:20 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: lkcd@oss.sgi.com Subject: alloc problems with Alpha lcrash Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I'm trying to get lcrash to work on Alpha, and I'm having some trouble with it segfaulting. It appears that the page allocation is failing, and I'm not sure exactly why. I do get a lot of warnings in alloc.c ("warning: cast from pointer to integer of different size"), but since this is for hash functions I'm not sure that matters (?) 
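A note on the warning quoted above: on Alpha, pointers are 64 bits wide while int is 32 bits, so "cast from pointer to integer of different size" means the upper half of an address is being thrown away. That is harmless if the integer is only ever used as a hash key, but not if it is later turned back into a pointer. The sketch below simulates the narrowing with fixed-width integers so that it behaves the same on any host. The function names and the sample address are hypothetical and are not taken from alloc.c, and whether this is what actually breaks lcrash is an assumption that the thread does not confirm.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: mimics storing a 64-bit address in a 32-bit integer
 * and converting it back. An unsigned type is used here so that no sign
 * extension is involved; only the low 32 bits of the address survive. */
uint64_t via_32bit_int(uint64_t addr)
{
    uint32_t narrowed = (uint32_t)addr;  /* upper 32 bits are dropped */
    return (uint64_t)narrowed;           /* the "cast back to pointer" */
}

/* Hypothetical helper: mimics the same round trip through an integer that
 * is as wide as the pointer (unsigned long on Alpha, or uintptr_t). */
uint64_t via_pointer_sized_int(uint64_t addr)
{
    uint64_t same_width = addr;          /* nothing is lost */
    return same_width;
}

/* Prints both round trips for one made-up 64-bit address. */
void show_round_trips(void)
{
    uint64_t addr = UINT64_C(0x120300000);  /* illustrative value only */

    printf("original:       0x%" PRIx64 "\n", addr);
    printf("via 32-bit int: 0x%" PRIx64 "\n", via_32bit_int(addr));
    printf("via 64-bit int: 0x%" PRIx64 "\n", via_pointer_sized_int(addr));

    assert(via_pointer_sized_int(addr) == addr);  /* full width round-trips */
    assert(via_32bit_int(addr) != addr);          /* narrowing does not */
}
```

Calling show_round_trips() from main prints 0x120300000, 0x20300000 and 0x120300000 in that order. In the traceback that follows, new=0x20300000 sits well below the 0x12... addresses of the other arguments, which is the kind of value such a truncation would produce, though that is only a guess.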
gdb traceback: (gdb) where #0 0x12001b110 in enqueue (list=0x1202fc188, new=0x20300000) at alloc.c:56 #1 0x12001c130 in get_page (index=4) at alloc.c:435 #2 0x12001caac in alloc_block (size=80, flag=2, ra=0x1e) at alloc.c:692 #3 0x1200038c8 in kl_block_alloc_func (size=80, flag=2, ra=0x1e) at util.c:279 #4 0x12002be4c in _kl_alloc_block (size=80, flags=2, ra=0x1e) at kl_alloc.c:22 #5 0x120003d6c in register_cmds (cmds=0x120144a70) at command.c:17 #6 0x120002b20 in main (argc=3, argv=0x11ffffba8) at main.c:200 When starting lcrash, I do get the "Please wait..." before the segfault. I am running my lcrash executable with "./lcrash /var/log/vmdump/map.0 /var/log/vmdump/vmdump.0", where the files were generated by a previous crash. Is it OK to test this way? Crashing the system and rebooting takes about 20 minutes. I am assuming the vmdump.0 is valid data, it is about the right size, and the header info is correct. map.0 looks good. 
-- http://www.bigfoot.com/~brihall Linux Consultant From owner-lkcd@oss.sgi.com Tue Apr 25 01:04:05 2000 Received: by oss.sgi.com id ; Tue, 25 Apr 2000 01:03:55 -0700 Received: from diplodocus.quadrics.com ([194.202.174.2]:29946 "EHLO diplodocus.quadrics.com") by oss.sgi.com with ESMTP id ; Tue, 25 Apr 2000 01:03:38 -0700 Received: from quadrics.com (positron [194.202.174.10]) by diplodocus.quadrics.com (8.9.3/8.9.3) with ESMTP id JAA09176; Tue, 25 Apr 2000 09:03:14 +0100 (BST) Message-ID: <39055145.4669786@quadrics.com> Date: Tue, 25 Apr 2000 09:03:17 +0100 From: David Addison Organization: Quadrics Supercomputers World X-Mailer: Mozilla 4.72 [en] (X11; I; SunOS 5.6 sun4u) X-Accept-Language: en MIME-Version: 1.0 To: Brian Hall CC: lkcd@oss.sgi.com, axp-list@redhat.com, clinux@zk3.dec.com Subject: Re: Alpha lkcd port: bzero problem References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Brian, I have been using the kernel coredump and crash utilities provided under GPL by http://www.missioncriticallinux.com/ it too is based on gdb and it works pretty well and can run against live kernels or dumps. I think it also works against lkcd produced dump files (x86 ??). It seems a shame to duplicate effort on the Alpha and also to have multiple crash analysis tools. Hopefully Mission Critical Linux and SGI can work together on this ? Yours, Addy. Brian Hall wrote: > I have made my attempt at integrating the Alpha debugger from gdb-4.1.8 into > lcrash. 
However, now when I run lcrash (directly at the command line or via a > real crash), I get this in gdb: > From owner-lkcd@oss.sgi.com Wed Apr 26 08:28:16 2000 Received: by oss.sgi.com id ; Wed, 26 Apr 2000 08:28:07 -0700 Received: from zmamail04.zma.compaq.com ([161.114.64.104]:3855 "HELO zmamail04.zma.compaq.com") by oss.sgi.com with SMTP id ; Wed, 26 Apr 2000 08:27:55 -0700 Received: by zmamail04.zma.compaq.com (Postfix, from userid 12345) id 8D655502; Wed, 26 Apr 2000 11:27:49 -0400 (EDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by zmamail04.zma.compaq.com (Postfix) with SMTP id E58837F7; Wed, 26 Apr 2000 11:27:48 -0400 (EDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA09453; Wed, 26 Apr 2000 09:27:48 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA01100; Wed, 26 Apr 2000 09:27:47 -0600 Received: by compaq.com (sSMTP sendmail emulation); Wed, 26 Apr 2000 09:22:38 -0600 Content-Length: 416 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 Date: Wed, 26 Apr 2000 09:22:38 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: "Matt D. Robinson" , lkcd@oss.sgi.com Subject: lkcd status? Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing There hasn't been much traffic on the LKCD list lately, and it has been awhile since the last new release. I'm still working on getting lcrash working on Alpha, but I saw no replies to my last couple of questions. Is the project not being worked on as much now, or...? I understand that most (if not all) of the developers working on this also have other tasks. 
-- http://www.bigfoot.com/~brihall Linux Consultant From owner-lkcd@oss.sgi.com Wed Apr 26 09:23:36 2000 Received: by oss.sgi.com id ; Wed, 26 Apr 2000 09:23:27 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:49495 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Wed, 26 Apr 2000 09:23:05 -0700 Received: from nodin.corp.sgi.com (nodin.corp.sgi.com [192.26.51.193]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id JAA15193 for ; Wed, 26 Apr 2000 09:18:19 -0700 (PDT) mail_from (tjm@sgi.com) Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by nodin.corp.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) via ESMTP id JAA90524 for ; Wed, 26 Apr 2000 09:22:34 -0700 (PDT) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id JAA60413; Wed, 26 Apr 2000 09:19:57 -0700 (PDT) Message-ID: <3907172C.3C7CC593@sgi.com> Date: Wed, 26 Apr 2000 09:19:56 -0700 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: Brian Hall CC: "Matt D. Robinson" , lkcd@oss.sgi.com Subject: Re: lkcd status? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Brian Hall wrote: > > There hasn't been much traffic on the LKCD list lately, and it has been awhile > since the last new release. I'm still working on getting lcrash working on > Alpha, but I saw no replies to my last couple of questions. Is the project not > being worked on as much now, or...? I understand that most (if not all) of the > developers working on this also have other tasks. Hi Brian, I have been out of town (and off line) for the last week and a half. I'll have to go back and see what questions you refer to. As for current status, Matt and I ARE a bit behind in posting a status message. 
We have made a number of changes and are preparing a new patch (for 2.2.14). We have been spending most of our time (Matt in the kernel and myself with lcrash) porting LKCD to 2.3.x. I know that from an lcrash point of view, you can grab the latest source (for 2.2 and 2.3) from SourceForge (project lkcd). With the 2.3 version of lcrash, you can run lcrash against a live system and can now generate a compressed dump from live system memory -- without crashing the system (I'm going to back port this feature to 2.2). Matt can provide status on where he stands with regard to the kernel porting effort. Pushing the source to SourceForge is part of our efforts to make the project more open for participation by "outside" developers (which is good, because Matt falls into that category now). :] Yes, we all have other stuff we need to work on. I can assure you though, that this project is a very high priority for myself and is factored into the supportability of SGI's Linux offering (again, Matt can speak for himself). As for the work you are doing with respect to an Alpha port, I think that it's very important for this functionality to be supported on a wide range of architectures. What you are doing is very important to this effort. If there is any help that I can provide, please let me know (I'm sure Matt feels the same). 
Thanks, Tom From owner-lkcd@oss.sgi.com Wed Apr 26 11:06:57 2000 Received: by oss.sgi.com id ; Wed, 26 Apr 2000 11:06:38 -0700 Received: from ztxmail03.ztx.compaq.com ([161.114.1.207]:20755 "HELO ztxmail03.ztx.compaq.com") by oss.sgi.com with SMTP id ; Wed, 26 Apr 2000 11:06:20 -0700 Received: by ztxmail03.ztx.compaq.com (Postfix, from userid 12345) id 04CFB27B; Wed, 26 Apr 2000 13:06:14 -0500 (CDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by ztxmail03.ztx.compaq.com (Postfix) with SMTP id 946396E for ; Wed, 26 Apr 2000 13:06:13 -0500 (CDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA09679; Wed, 26 Apr 2000 12:06:12 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA01953; Wed, 26 Apr 2000 12:06:07 -0600 Received: by compaq.com (sSMTP sendmail emulation); Wed, 26 Apr 2000 12:00:58 -0600 Content-Length: 1078 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 Date: Wed, 26 Apr 2000 12:00:57 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: lkcd@oss.sgi.com Subject: dump header memory size conflict? Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I am trying to debug Alpha lcrash, and I noticed a possible problem in my header data. 
Shouldn't this always be true of the dump_header structure data:

(dh_memory_size == dh_page_size * dh_num_pages)

Using a modified version of a program to dump the header contents, I get:

Dump Header (version 2):
    Magic number: 0xa8190173618f23ed
    PAGE_SIZE = 8192
    Dump header size: 728
    Dump level: 4
    Panic string: User created crash dump
    Uname info: Linux dhcp96-180.cxo.dec.com 2.2.13 #10 Tue Feb 8 16:06:15 MST 2000 alpha gldulab
    Address of current task: fffffc0004e28000
    Physical memory: (all sizes in bytes)
        Start: 0xfffffc0000000000
        End: 0xfffffc0006000000
        Size: 0x5af2000
    Number of pages in dump: 12288
    Total memory used by pages: 0x6000000
    Time of dump: Tue Apr 18 15:08:24 2000

Note this shows total memory used by pages (12288 * 8192 = 0x6000000) >
memory size (0x5af2000) ?!? I am wondering if this could be why lcrash keeps
dying because it can't access some of the memory. This machine is an Alpha
AS200 with 96MB RAM.

--
http://www.bigfoot.com/~brihall
Linux Consultant

From owner-lkcd@oss.sgi.com Wed Apr 26 11:20:28 2000
Received: by oss.sgi.com id ; Wed, 26 Apr 2000 11:20:09 -0700
Received: from mail.turbolinux.com ([38.170.88.25]:37637 "EHLO mail.turbolinux.com") by oss.sgi.com with ESMTP id ; Wed, 26 Apr 2000 11:20:03 -0700
Received: from localhost (yakker@localhost) by mail.turbolinux.com (8.9.3/8.9.3) with ESMTP id LAA27439; Wed, 26 Apr 2000 11:19:51 -0700
Date: Wed, 26 Apr 2000 11:19:51 -0700 (PDT)
From: "Matt D. Robinson"
To: Brian Hall
cc: lkcd@oss.sgi.com
Subject: Re: lkcd status?
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

On Wed, 26 Apr 2000, Brian Hall wrote:
|>There hasn't been much traffic on the LKCD list lately, and it has been awhile
|>since the last new release. I'm still working on getting lcrash working on
|>Alpha, but I saw no replies to my last couple of questions. Is the project not
|>being worked on as much now, or...?
I understand that most (if not all) of the |>developers working on this also have other tasks. It's still under works, Brian. In fact, Tom and I are meeting tomorrow to test out some of the 2.3 code (I don't work for SGI anymore, I'm over at TurboLinux, which has probably led to some delays, as I'm getting ramped up on new products). I didn't see the questions, as my mailing address wasn't correct on oss.sgi.com (but is now, as of yesterday, when Nancy let me know what was going on). In any case, if you'll forward your questions to me, I'll be happy to look into the issues with you. --Matt From owner-lkcd@oss.sgi.com Wed Apr 26 13:37:29 2000 Received: by oss.sgi.com id ; Wed, 26 Apr 2000 13:37:19 -0700 Received: from ztxmail05.ztx.compaq.com ([161.114.1.209]:5896 "HELO ztxmail05.ztx.compaq.com") by oss.sgi.com with SMTP id ; Wed, 26 Apr 2000 13:37:11 -0700 Received: by ztxmail05.ztx.compaq.com (Postfix, from userid 12345) id D70FDA2A; Wed, 26 Apr 2000 15:37:00 -0500 (CDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by ztxmail05.ztx.compaq.com (Postfix) with SMTP id 3327C8AE; Wed, 26 Apr 2000 15:37:00 -0500 (CDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA10813; Wed, 26 Apr 2000 14:36:59 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA13566; Wed, 26 Apr 2000 14:36:49 -0600 Received: by compaq.com (sSMTP sendmail emulation); Wed, 26 Apr 2000 14:31:38 -0600 Content-Length: 2264 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 Date: Wed, 26 Apr 2000 14:31:38 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: lkcd@oss.sgi.com Subject: Alpha lcrash initialization problem - can't access memory Cc: "Matt D. 
Robinson"
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

OK, Alpha lcrash is dying before it gets very far. Any ideas why it can't
access the memory in question?

I can see where not being able to access memory being asked for causes a
segfault, but why the report about "alloc.c: No such file or directory" ?

[root@dhcp96-180 lcrash]# gdb ./lcrash

(gdb) run map.0 vmdump.0 -d 1
Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash map.0 vmdump.0 -d 1
map = map.0, vmdump = vmdump.0, outfile = stdout

Please wait...................

Program received signal SIGSEGV, Segmentation fault.
0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57
57 alloc.c: No such file or directory.
(gdb) where full
#0 0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57
        head = (element_t *) 0x0
#1 0x12001c154 in get_page (index=4) at alloc.c:438
        i = 0
        b = (block_t *) 0x20300000
        page = (void *) 0x12001c624
        p = (page_t *) 0x1202fc1a0
#2 0x12001cacc in alloc_block (size=80, flag=2, ra=0x1e) at alloc.c:695
        i = 4
        j = 1
        blk = (void *) 0xfffffffff7f7ffdb
        p = (page_t *) 0x11ffffad0
        b = (block_t *) 0x0
#3 0x1200038c8 in kl_block_alloc_func (size=80, flag=2, ra=0x1e) at util.c:279
        b = (void *) 0x12002be14
#4 0x12002be6c in _kl_alloc_block (size=80, flags=2, ra=0x1e) at kl_alloc.c:22
        blk = (void *) 0x120003d4c
#5 0x120003d6c in register_cmds (cmds=0x120144aa8) at command.c:17
        i = 0
        ret = 1
        max_depth = 539896352
        cmd_rec = (cmd_rec_t *) 0x0
#6 0x120002b20 in main (argc=5, argv=0x11ffffbb8) at main.c:200
        i = 5
        c = 512
        errflg = 0
(gdb) p *list
$1 = (element_t *) 0x0
(gdb) p head
$2 = (element_t *) 0x0
(gdb) p new
$3 = (element_t *) 0x20300000
(gdb) p (head = *list)
$4 = (element_t *) 0x0
(gdb) p new
$5 = (element_t *) 0x20300000
(gdb) p new->next
Cannot access memory at address 0x20300000.
(gdb) p new->prev
Cannot access memory at address 0x20300008.
(gdb) p *list = new $6 = (element_t *) 0x20300000 (gdb) p new->next Cannot access memory at address 0x20300000. (gdb) p new->prev Cannot access memory at address 0x20300008. -- http://www.bigfoot.com/~brihall Linux Consultant From owner-lkcd@oss.sgi.com Wed Apr 26 15:14:39 2000 Received: by oss.sgi.com id ; Wed, 26 Apr 2000 15:14:20 -0700 Received: from mail.turbolinux.com ([38.170.88.25]:20493 "EHLO mail.turbolinux.com") by oss.sgi.com with ESMTP id ; Wed, 26 Apr 2000 15:13:53 -0700 Received: from localhost (yakker@localhost) by mail.turbolinux.com (8.9.3/8.9.3) with ESMTP id PAA07138; Wed, 26 Apr 2000 15:13:41 -0700 Date: Wed, 26 Apr 2000 15:13:39 -0700 (PDT) From: "Matt D. Robinson" To: Brian Hall cc: lkcd@oss.sgi.com Subject: Re: Alpha lcrash initialization problem - can't access memory In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Wed, 26 Apr 2000, Brian Hall wrote: |>OK, Alpha lcrash is dying before it gets very far. Any ideas why it can't |>access the memory in question? |> |>I can see where not being able to access memory being asked for causes a |>segfault, but why the report about "alloc.c: No such file or directory" ? This is probably due to 'gdb's understanding of where alloc.c is found. Looking at this stack trace, have you removed any commands as of late that wouldn't have been used? Looks like something may be wrong with the commands structure. Can you dump out the table? --Matt |>[root@dhcp96-180 lcrash]# gdb ./lcrash |> |>(gdb) run map.0 vmdump.0 -d 1 |>Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash |>map.0 vmdump.0 -d 1 |>map = map.0, vmdump = vmdump.0, outfile = stdout |> |>Please wait................... |> |>Program received signal SIGSEGV, Segmentation fault. |>0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 |>57 alloc.c: No such file or directory. 
|>(gdb) where full |>#0 0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 |> head = (element_t *) 0x0 |>#1 0x12001c154 in get_page (index=4) at alloc.c:438 |> i = 0 |> b = (block_t *) 0x20300000 |> page = (void *) 0x12001c624 |> p = (page_t *) 0x1202fc1a0 |>#2 0x12001cacc in alloc_block (size=80, flag=2, ra=0x1e) at alloc.c:695 |> i = 4 |> j = 1 |> blk = (void *) 0xfffffffff7f7ffdb |> p = (page_t *) 0x11ffffad0 |> b = (block_t *) 0x0 |>#3 0x1200038c8 in kl_block_alloc_func (size=80, flag=2, ra=0x1e) at util.c:279 |> b = (void *) 0x12002be14 |>#4 0x12002be6c in _kl_alloc_block (size=80, flags=2, ra=0x1e) at kl_alloc.c:22 |> blk = (void *) 0x120003d4c |>#5 0x120003d6c in register_cmds (cmds=0x120144aa8) at command.c:17 |> i = 0 |> ret = 1 |> max_depth = 539896352 |> cmd_rec = (cmd_rec_t *) 0x0 |>#6 0x120002b20 in main (argc=5, argv=0x11ffffbb8) at main.c:200 |> i = 5 |> c = 512 |> errflg = 0 |>(gdb) p *list |>$1 = (element_t *) 0x0 |>(gdb) p head |>$2 = (element_t *) 0x0 |>(gdb) p new |>$3 = (element_t *) 0x20300000 |>(gdb) p (head = *list) |>$4 = (element_t *) 0x0 |>(gdb) p new |>$5 = (element_t *) 0x20300000 |>(gdb) p new->next |>Cannot access memory at address 0x20300000. |>(gdb) p new->prev |>Cannot access memory at address 0x20300008. |>(gdb) p *list = new |>$6 = (element_t *) 0x20300000 |>(gdb) p new->next |>Cannot access memory at address 0x20300000. |>(gdb) p new->prev |>Cannot access memory at address 0x20300008. 
|> |>-- |>http://www.bigfoot.com/~brihall |>Linux Consultant |> From owner-lkcd@oss.sgi.com Thu Apr 27 07:16:49 2000 Received: by oss.sgi.com id ; Thu, 27 Apr 2000 07:16:39 -0700 Received: from ztxmail03.ztx.compaq.com ([161.114.1.207]:53001 "HELO ztxmail03.ztx.compaq.com") by oss.sgi.com with SMTP id ; Thu, 27 Apr 2000 07:16:15 -0700 Received: by ztxmail03.ztx.compaq.com (Postfix, from userid 12345) id 868C72A0; Thu, 27 Apr 2000 09:16:09 -0500 (CDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by ztxmail03.ztx.compaq.com (Postfix) with SMTP id AC612150; Thu, 27 Apr 2000 09:16:08 -0500 (CDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA13690; Thu, 27 Apr 2000 08:16:07 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA08869; Thu, 27 Apr 2000 08:16:07 -0600 Received: by compaq.com (sSMTP sendmail emulation); Thu, 27 Apr 2000 08:10:52 -0600 Content-Length: 11476 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="_=XFMail.1.4.4.Linux:20000427081051:8960=_" In-Reply-To: Date: Thu, 27 Apr 2000 08:10:51 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: "Matt D. Robinson" Subject: Re: Alpha lcrash initialization problem - can't access memory Cc: lkcd@oss.sgi.com Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing This message is in MIME format --_=XFMail.1.4.4.Linux:20000427081051:8960=_ Content-Type: text/plain; charset=us-ascii Dump of cmdset[] is attached. Appears to match the cmdset table in cmds.c exactly. I haven't altered the list of commands in cmds.c at all. I suspect the problem is I have missed something in replacing the i386 stuff with Alpha functions. I haven't changed all the function names that start with "i386", etc but that is just cosmetic. On 26-Apr-2000 Matt D. 
Robinson wrote: > On Wed, 26 Apr 2000, Brian Hall wrote: >|>OK, Alpha lcrash is dying before it gets very far. Any ideas why it can't >|>access the memory in question? >|> >|>I can see where not being able to access memory being asked for causes a >|>segfault, but why the report about "alloc.c: No such file or directory" ? > > This is probably due to 'gdb's understanding of where alloc.c is > found. > > Looking at this stack trace, have you removed any commands as of late > that wouldn't have been used? Looks like something may be wrong with > the commands structure. Can you dump out the table? > > --Matt > >|>[root@dhcp96-180 lcrash]# gdb ./lcrash >|> >|>(gdb) run map.0 vmdump.0 -d 1 >|>Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash >|>map.0 vmdump.0 -d 1 >|>map = map.0, vmdump = vmdump.0, outfile = stdout >|> >|>Please wait................... >|> >|>Program received signal SIGSEGV, Segmentation fault. >|>0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 >|>57 alloc.c: No such file or directory. 
>|>(gdb) where full >|>#0 0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 >|> head = (element_t *) 0x0 >|>#1 0x12001c154 in get_page (index=4) at alloc.c:438 >|> i = 0 >|> b = (block_t *) 0x20300000 >|> page = (void *) 0x12001c624 >|> p = (page_t *) 0x1202fc1a0 >|>#2 0x12001cacc in alloc_block (size=80, flag=2, ra=0x1e) at alloc.c:695 >|> i = 4 >|> j = 1 >|> blk = (void *) 0xfffffffff7f7ffdb >|> p = (page_t *) 0x11ffffad0 >|> b = (block_t *) 0x0 >|>#3 0x1200038c8 in kl_block_alloc_func (size=80, flag=2, ra=0x1e) at >|>#util.c:279 >|> b = (void *) 0x12002be14 >|>#4 0x12002be6c in _kl_alloc_block (size=80, flags=2, ra=0x1e) at >|>#kl_alloc.c:22 >|> blk = (void *) 0x120003d4c >|>#5 0x120003d6c in register_cmds (cmds=0x120144aa8) at command.c:17 >|> i = 0 >|> ret = 1 >|> max_depth = 539896352 >|> cmd_rec = (cmd_rec_t *) 0x0 >|>#6 0x120002b20 in main (argc=5, argv=0x11ffffbb8) at main.c:200 >|> i = 5 >|> c = 512 >|> errflg = 0 >|>(gdb) p *list >|>$1 = (element_t *) 0x0 >|>(gdb) p head >|>$2 = (element_t *) 0x0 >|>(gdb) p new >|>$3 = (element_t *) 0x20300000 >|>(gdb) p (head = *list) >|>$4 = (element_t *) 0x0 >|>(gdb) p new >|>$5 = (element_t *) 0x20300000 >|>(gdb) p new->next >|>Cannot access memory at address 0x20300000. >|>(gdb) p new->prev >|>Cannot access memory at address 0x20300008. >|>(gdb) p *list = new >|>$6 = (element_t *) 0x20300000 >|>(gdb) p new->next >|>Cannot access memory at address 0x20300000. >|>(gdb) p new->prev >|>Cannot access memory at address 0x20300008. 
>|> |>-- |>http://www.bigfoot.com/~brihall |>Linux Consultant |>

--
http://www.bigfoot.com/~brihall
Linux Consultant

--_=XFMail.1.4.4.Linux:20000427081051:8960=_
Content-Disposition: attachment; filename="cmd.table"
Content-Transfer-Encoding: quoted-printable
Content-Description: cmd.table
Content-Type: application/octet-stream; name=cmd.table; SizeOnDisk=7057

$10 = {cmd = 0x120035e26 "addtypes", real_cmd = 0x0, cmdfunc = 0x1200060a0, cmdparse = 0x120006240, cmdhelp = 0x1200061a0, cmdusage = 0x120006140}
$11 = {cmd = 0x120035e21 "base", real_cmd = 0x0, cmdfunc = 0x120005de0, cmdparse = 0x120006020, cmdhelp = 0x120005f80, cmdusage = 0x120005f20}
$12 = {cmd = 0x120035e19 "deftask", real_cmd = 0x0, cmdfunc = 0x1200062c0, cmdparse = 0x120006560, cmdhelp = 0x1200064c0, cmdusage = 0x120006460}
$13 = {cmd = 0x120035e16 "dt", real_cmd = 0x120035e19 "deftask", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$14 = {cmd = 0x120035e12 "dis", real_cmd = 0x0, cmdfunc = 0x1200065e0, cmdparse = 0x120006e60, cmdhelp = 0x120006dc0, cmdusage = 0x120006d60}
$15 = {cmd = 0x120035e0f "id", real_cmd = 0x120035e12 "dis", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$16 = {cmd = 0x120035e0a "dump", real_cmd = 0x0, cmdfunc = 0x120006f20, cmdparse = 0x120007240, cmdhelp = 0x1200071a0, cmdusage = 0x120007140}
$17 = {cmd = 0x120035e07 "od", real_cmd = 0x120035e0a "dump", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$18 = {cmd = 0x120035e04 "md", real_cmd = 0x120035e0a "dump", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$19 = {cmd = 0x120035dfc "findsym", real_cmd = 0x0, cmdfunc = 0x1200073e0, cmdparse = 0x120007760, cmdhelp = 0x1200076c0, cmdusage = 0x120007660}
$20 = {cmd = 0x120035df7 "fsym", real_cmd = 0x120035dfc "findsym", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$21 = {cmd = 0x120035df2 "help", real_cmd = 0x0, cmdfunc = 0x120007d00, cmdparse = 0x120007f60, cmdhelp = 0x120007ec0, cmdusage = 0x120007e60}
$22 = {cmd = 0x120035df0 "?", real_cmd = 0x120035df2 "help", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$23 = {cmd = 0x120035de8 "history", real_cmd = 0x0, cmdfunc = 0, cmdparse = 0, cmdhelp = 0x12000be80, cmdusage = 0x12000be20}
$24 = {cmd = 0x120035de6 "h", real_cmd = 0x120035de8 "history", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$25 = {cmd = 0x120035dde "mktrace", real_cmd = 0x0, cmdfunc = 0x120007fe0, cmdparse = 0x120008ae0, cmdhelp = 0x120008a40, cmdusage = 0x1200089e0}
$26 = {cmd = 0x120035ddb "mt", real_cmd = 0x120035dde "mktrace", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$27 = {cmd = 0x120035dd6 "mmap", real_cmd = 0x0, cmdfunc = 0x120008bc0, cmdparse = 0x120008f40, cmdhelp = 0x120008ea0, cmdusage = 0x120008e40}
$28 = {cmd = 0x120035dcd "namelist", real_cmd = 0x0, cmdfunc = 0x12000d140, cmdparse = 0x12000d420, cmdhelp = 0x12000d380, cmdusage = 0x12000d320}
$29 = {cmd = 0x120035dc6 "nmlist", real_cmd = 0x120035dcd "namelist", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$30 = {cmd = 0x120035dc1 "page", real_cmd = 0x0, cmdfunc = 0x120008fc0, cmdparse = 0x120009520, cmdhelp = 0x120009480, cmdusage = 0x120009420}
$31 = {cmd = 0x120035dbc "quit", real_cmd = 0x0, cmdfunc = 0x1200095a0, cmdparse = 0x1200097e0, cmdhelp = 0x120009740, cmdusage = 0x1200096e0}
$32 = {cmd = 0x120035dba "q", real_cmd = 0x120035dbc "quit", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$33 = {cmd = 0x120035db7 "q!", real_cmd = 0x120035dbc "quit", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$34 = {cmd = 0x120035db0 "report", real_cmd = 0x0, cmdfunc = 0x12000bf20, cmdparse = 0x12000c080, cmdhelp = 0x12000bfe0, cmdusage = 0x12000bf80}
$35 = {cmd = 0x120035dab "stat", real_cmd = 0x0, cmdfunc = 0x120009d00, cmdparse = 0x120009fa0, cmdhelp = 0x120009f00, cmdusage = 0x120009ea0}
$36 = {cmd = 0x120035da4 "strace", real_cmd = 0x0, cmdfunc = 0x12000a020, cmdparse = 0x12000a5a0, cmdhelp = 0x12000a500, cmdusage = 0x12000a4a0}
$37 = {cmd = 0x120035d9d "symbol", real_cmd = 0x0, cmdfunc = 0x12000a620, cmdparse = 0x12000aa00, cmdhelp = 0x12000a960, cmdusage = 0x12000a900}
$38 = {cmd = 0x120035d99 "sym", real_cmd = 0x120035d9d "symbol", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$39 = {cmd = 0x120035d94 "task", real_cmd = 0x0, cmdfunc = 0x12000aa80, cmdparse = 0x12000aea0, cmdhelp = 0x12000ae00, cmdusage = 0x12000ada0}
$40 = {cmd = 0x120035d8e "trace", real_cmd = 0x0, cmdfunc = 0x12000af20, cmdparse = 0x12000b740, cmdhelp = 0x12000b6a0, cmdusage = 0x12000b640}
$41 = {cmd = 0x120035d8c "t", real_cmd = 0x120035d8e "trace", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$42 = {cmd = 0x120035d89 "bt", real_cmd = 0x120035d8e "trace", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$43 = {cmd = 0x120035d86 "ps", real_cmd = 0x120035d94 "task", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$44 = {cmd = 0x120035d80 "ptype", real_cmd = 0x0, cmdfunc = 0x12000c100, cmdparse = 0x12000c4c0, cmdhelp = 0x12000c420, cmdusage = 0x12000c3c0}
$45 = {cmd = 0x120035d7e "p", real_cmd = 0x120035d80 "ptype", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$46 = {cmd = 0x120035d7b "px", real_cmd = 0x120035d80 "ptype", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$47 = {cmd = 0x120035d78 "po", real_cmd = 0x120035d80 "ptype", cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}
$48 = {cmd = 0x120035d71 "sizeof", real_cmd = 0x0, cmdfunc = 0x12000c540, cmdparse = 0x12000c8a0, cmdhelp = 0x12000c800, cmdusage = 0x12000c7a0}
$49 = {cmd = 0x120035d6c "vtop", real_cmd = 0x0, cmdfunc = 0x12000b960, cmdparse = 0x12000bd20, cmdhelp = 0x12000bc80, cmdusage = 0x12000bc20}
$50 = {cmd = 0x120035d67 "walk", real_cmd = 0x0, cmdfunc = 0x12000d4a0, cmdparse = 0x12000da60, cmdhelp = 0x12000d9c0, cmdusage = 0x12000d960}
$51 = {cmd = 0x120035d60 "whatis", real_cmd = 0x0, cmdfunc = 0x12000c920, cmdparse = 0x12000d0c0, cmdhelp = 0x12000d020, cmdusage = 0x12000cfc0}
$52 = {cmd = 0x0, real_cmd = 0x0, cmdfunc = 0, cmdparse = 0, cmdhelp = 0, cmdusage = 0}

--_=XFMail.1.4.4.Linux:20000427081051:8960=_--
End of MIME message

From owner-lkcd@oss.sgi.com Thu Apr 27 11:07:32 2000 Received: by oss.sgi.com id ; Thu, 27 Apr 2000 11:07:12 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:54290 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Thu, 27 Apr 2000 11:07:06 -0700 Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id LAA10452 for ; Thu, 27 Apr 2000 11:02:20 -0700 (PDT) mail_from (tjm@sgi.com) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id LAA62090; Thu, 27 Apr 2000 11:05:34 -0700 (PDT) Message-ID: <3908816E.42B3B16B@sgi.com> Date: Thu, 27 Apr 2000 11:05:34 -0700 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: Brian Hall CC: "Matt D.
Robinson" , lkcd@oss.sgi.com Subject: Re: Alpha lcrash initialization problem - can't access memory References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

Hi Brian,

From the location of the failure (the SEGV), it sounds as if the block allocator might not have been initialized properly. The alloc.c module you refer to is actually lib/liballoc/alloc.c. It's a local block allocator that allows us to track blocks that might need to be freed in the event of a longjmp call (because someone hit Ctrl-C during command output). Make sure you are calling init_liballoc() during your initialization in the main() function. Also make sure you are passing the correct parameters to the function (the parameter list changed a while back).

The other possibility is that you are walking off the end of a memory block and trashing memory behind it. I say this because, at the point where you blow up, you are dealing with the liballoc control structure memory (which contains a bad pointer).

Let me know what you find,

Tom

Brian Hall wrote:
>
> Dump of cmdset[] is attached. Appears to match the cmdset table in cmds.c
> exactly.
>
> I haven't altered the list of commands in cmds.c at all. I suspect the problem
> is I have missed something in replacing the i386 stuff with Alpha functions. I
> haven't changed all the function names that start with "i386", etc but that is
> just cosmetic.
>
> On 26-Apr-2000 Matt D. Robinson wrote:
> > On Wed, 26 Apr 2000, Brian Hall wrote:
> >|>OK, Alpha lcrash is dying before it gets very far. Any ideas why it can't
> >|>access the memory in question?
> >|>
> >|>I can see where not being able to access memory being asked for causes a
> >|>segfault, but why the report about "alloc.c: No such file or directory" ?
> >
> > This is probably due to 'gdb's understanding of where alloc.c is
> > found.
> > > > Looking at this stack trace, have you removed any commands as of late > > that wouldn't have been used? Looks like something may be wrong with > > the commands structure. Can you dump out the table? > > > > --Matt > > > >|>[root@dhcp96-180 lcrash]# gdb ./lcrash > >|> > >|>(gdb) run map.0 vmdump.0 -d 1 > >|>Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash > >|>map.0 vmdump.0 -d 1 > >|>map = map.0, vmdump = vmdump.0, outfile = stdout > >|> > >|>Please wait................... > >|> > >|>Program received signal SIGSEGV, Segmentation fault. > >|>0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 > >|>57 alloc.c: No such file or directory. > >|>(gdb) where full > >|>#0 0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 > >|> head = (element_t *) 0x0 > >|>#1 0x12001c154 in get_page (index=4) at alloc.c:438 > >|> i = 0 > >|> b = (block_t *) 0x20300000 > >|> page = (void *) 0x12001c624 > >|> p = (page_t *) 0x1202fc1a0 > >|>#2 0x12001cacc in alloc_block (size=80, flag=2, ra=0x1e) at alloc.c:695 > >|> i = 4 > >|> j = 1 > >|> blk = (void *) 0xfffffffff7f7ffdb > >|> p = (page_t *) 0x11ffffad0 > >|> b = (block_t *) 0x0 > >|>#3 0x1200038c8 in kl_block_alloc_func (size=80, flag=2, ra=0x1e) at > >|>#util.c:279 > >|> b = (void *) 0x12002be14 > >|>#4 0x12002be6c in _kl_alloc_block (size=80, flags=2, ra=0x1e) at > >|>#kl_alloc.c:22 > >|> blk = (void *) 0x120003d4c > >|>#5 0x120003d6c in register_cmds (cmds=0x120144aa8) at command.c:17 > >|> i = 0 > >|> ret = 1 > >|> max_depth = 539896352 > >|> cmd_rec = (cmd_rec_t *) 0x0 > >|>#6 0x120002b20 in main (argc=5, argv=0x11ffffbb8) at main.c:200 > >|> i = 5 > >|> c = 512 > >|> errflg = 0 > >|>(gdb) p *list > >|>$1 = (element_t *) 0x0 > >|>(gdb) p head > >|>$2 = (element_t *) 0x0 > >|>(gdb) p new > >|>$3 = (element_t *) 0x20300000 > >|>(gdb) p (head = *list) > >|>$4 = (element_t *) 0x0 > >|>(gdb) p new > >|>$5 = (element_t *) 0x20300000 > >|>(gdb) p new->next > 
>|>Cannot access memory at address 0x20300000. > >|>(gdb) p new->prev > >|>Cannot access memory at address 0x20300008. > >|>(gdb) p *list = new > >|>$6 = (element_t *) 0x20300000 > >|>(gdb) p new->next > >|>Cannot access memory at address 0x20300000. > >|>(gdb) p new->prev > >|>Cannot access memory at address 0x20300008. > >|> > >|>-- > >|>http://www.bigfoot.com/~brihall > >|>Linux Consultant > >|> > > -- > http://www.bigfoot.com/~brihall > Linux Consultant > > -------------------------------------------------------------------------------- > Name: cmd.table > cmd.table Type: unspecified type (application/octet-stream) > Encoding: quoted-printable > Description: cmd.table From owner-lkcd@oss.sgi.com Thu Apr 27 12:00:32 2000 Received: by oss.sgi.com id ; Thu, 27 Apr 2000 12:00:23 -0700 Received: from ztxmail03.ztx.compaq.com ([161.114.1.207]:52488 "HELO ztxmail03.ztx.compaq.com") by oss.sgi.com with SMTP id ; Thu, 27 Apr 2000 12:00:08 -0700 Received: by ztxmail03.ztx.compaq.com (Postfix, from userid 12345) id 4D4F3459; Thu, 27 Apr 2000 14:00:02 -0500 (CDT) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by ztxmail03.ztx.compaq.com (Postfix) with SMTP id 56A1C19B; Thu, 27 Apr 2000 14:00:01 -0500 (CDT) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA14973; Thu, 27 Apr 2000 13:00:00 -0600 Received: from dhcp32-218.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA27794; Thu, 27 Apr 2000 12:59:59 -0600 Received: by compaq.com (sSMTP sendmail emulation); Thu, 27 Apr 2000 12:54:43 -0600 Content-Length: 6224 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 In-Reply-To: <3908816E.42B3B16B@sgi.com> Date: Thu, 27 Apr 2000 12:54:42 -0600 (MDT) Reply-To: Brian Hall From: Brian Hall To: Tom Morano Subject: Re: Alpha lcrash initialization problem - can't access memory Cc: 
lkcd@oss.sgi.com, Matt D.Robinson Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

I haven't changed anything in main(). After the command options are parsed, around main.c:198 (it dies in register_cmds()):

    init_liballoc(0, 0, 0);
    kl_init_kern_info();
    register_cmds(cmdset);
    arch_init(ofp);

Are you saying that init_liballoc() needs different arguments now? I followed the call sequence down for init_liballoc, and it appeared that values other than zero were assigned along the way. Changing to init_liballoc(100,100,100) had no effect (same traceback on the segfault). Upping that to 1000 didn't help.

I added "#define ALLOC_DEBUG 1" in main.c and set lcrash_debug=1. I didn't get any extra output as I had hoped. Guess I'm not getting far enough...

I set a breakpoint at alloc.c:437. The comment here is "Break page into blocks and queue them on freelist". The freelist (p->blklist) is NULL at this point, and the enqueue function dies, although it seems like enqueue was written to handle this case? I believe this is the first time the freelist is added to. I'm not sure I'm getting far enough to walk off the end of the list; it seems to die on _creating_ the list.

Breakpoint 1, get_page (index=4) at alloc.c:437
437 alloc.c: No such file or directory.
(gdb) p i
$1 = 0
(gdb) p *p
$2 = {next = 0x0, prev = 0x0, addr = 0x2000071f000, blklist = 0x0, blksz = 128, nblocks = 32, nfree = 32, state = 2, index = 4, hash = 0x0}
(gdb) p (p->addr + i * p->blksz)
$3 = (void *) 0x2000071f000
(gdb) p *p->blklist
Cannot access memory at address 0x0.
(gdb) p p->blklist
$4 = (block_t *) 0x0

On 27-Apr-2000 Tom Morano wrote:
> Hi Brian,
>
> From the location of the failure (the SEGV), it sounds as if the block
> allocator might not have been initialized properly. The alloc.c module
> you refer to is actually lib/liballoc/alloc.c. It's a local block allocator
> that allows us to track blocks that might need to be freed in the event of
> a longjmp call (because someone hit Ctrl-C during command output). Make sure
> you are calling init_liballoc() during your initialization in the main()
> function. Also make sure you are passing the correct parameters to the
> function (the parameter list changed a while back). The other possibility
> is that you are walking off the end of a memory block and trashing memory
> behind it. I say this because, at the point where you blow up, you are
> dealing with the liballoc control structure memory (which contains a bad
> pointer).
>
> Let me know what you find,
>
> Tom
>
> Brian Hall wrote:
>>
>> Dump of cmdset[] is attached. Appears to match the cmdset table in cmds.c
>> exactly.
>>
>> I haven't altered the list of commands in cmds.c at all. I suspect the problem
>> is I have missed something in replacing the i386 stuff with Alpha functions. I
>> haven't changed all the function names that start with "i386", etc but that is
>> just cosmetic.
>>
>> On 26-Apr-2000 Matt D. Robinson wrote:
>> > On Wed, 26 Apr 2000, Brian Hall wrote:
>> >|>OK, Alpha lcrash is dying before it gets very far. Any ideas why it can't
>> >|>access the memory in question?
>> >|>
>> >|>I can see where not being able to access memory being asked for causes a
>> >|>segfault, but why the report about "alloc.c: No such file or directory" ?
>> >
>> > This is probably due to 'gdb's understanding of where alloc.c is
>> > found.
>> >
>> > Looking at this stack trace, have you removed any commands as of late
>> > that wouldn't have been used? Looks like something may be wrong with
>> > the commands structure. Can you dump out the table?
>> > >> > --Matt >> > >> >|>[root@dhcp96-180 lcrash]# gdb ./lcrash >> >|> >> >|>(gdb) run map.0 vmdump.0 -d 1 >> >|>Starting program: >> >|>/CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash >> >|>map.0 vmdump.0 -d 1 >> >|>map = map.0, vmdump = vmdump.0, outfile = stdout >> >|> >> >|>Please wait................... >> >|> >> >|>Program received signal SIGSEGV, Segmentation fault. >> >|>0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at alloc.c:57 >> >|>57 alloc.c: No such file or directory. >> >|>(gdb) where full >> >|>#0 0x12001b110 in enqueue (list=0x1202fc1b8, new=0x20300000) at >> >|>#alloc.c:57 >> >|> head = (element_t *) 0x0 >> >|>#1 0x12001c154 in get_page (index=4) at alloc.c:438 >> >|> i = 0 >> >|> b = (block_t *) 0x20300000 >> >|> page = (void *) 0x12001c624 >> >|> p = (page_t *) 0x1202fc1a0 >> >|>#2 0x12001cacc in alloc_block (size=80, flag=2, ra=0x1e) at alloc.c:695 >> >|> i = 4 >> >|> j = 1 >> >|> blk = (void *) 0xfffffffff7f7ffdb >> >|> p = (page_t *) 0x11ffffad0 >> >|> b = (block_t *) 0x0 >> >|>#3 0x1200038c8 in kl_block_alloc_func (size=80, flag=2, ra=0x1e) at >> >|>#util.c:279 >> >|> b = (void *) 0x12002be14 >> >|>#4 0x12002be6c in _kl_alloc_block (size=80, flags=2, ra=0x1e) at >> >|>#kl_alloc.c:22 >> >|> blk = (void *) 0x120003d4c >> >|>#5 0x120003d6c in register_cmds (cmds=0x120144aa8) at command.c:17 >> >|> i = 0 >> >|> ret = 1 >> >|> max_depth = 539896352 >> >|> cmd_rec = (cmd_rec_t *) 0x0 >> >|>#6 0x120002b20 in main (argc=5, argv=0x11ffffbb8) at main.c:200 >> >|> i = 5 >> >|> c = 512 >> >|> errflg = 0 >> >|>(gdb) p *list >> >|>$1 = (element_t *) 0x0 >> >|>(gdb) p head >> >|>$2 = (element_t *) 0x0 >> >|>(gdb) p new >> >|>$3 = (element_t *) 0x20300000 >> >|>(gdb) p (head = *list) >> >|>$4 = (element_t *) 0x0 >> >|>(gdb) p new >> >|>$5 = (element_t *) 0x20300000 >> >|>(gdb) p new->next >> >|>Cannot access memory at address 0x20300000. >> >|>(gdb) p new->prev >> >|>Cannot access memory at address 0x20300008. 
>> >|>(gdb) p *list = new >> >|>$6 = (element_t *) 0x20300000 >> >|>(gdb) p new->next >> >|>Cannot access memory at address 0x20300000. >> >|>(gdb) p new->prev >> >|>Cannot access memory at address 0x20300008. >> --------------------------------------------------------------------------- >> ----- >> Name: cmd.table >> cmd.table Type: unspecified type (application/octet-stream) >> Encoding: quoted-printable >> Description: cmd.table -- http://www.bigfoot.com/~brihall Linux Consultant From owner-lkcd@oss.sgi.com Thu Apr 27 14:35:44 2000 Received: by oss.sgi.com id ; Thu, 27 Apr 2000 14:35:34 -0700 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:16755 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Thu, 27 Apr 2000 14:35:13 -0700 Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id OAA04226 for ; Thu, 27 Apr 2000 14:39:23 -0700 (PDT) mail_from (tjm@sgi.com) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id OAA63229; Thu, 27 Apr 2000 14:33:30 -0700 (PDT) Message-ID: <3908B22A.B11911EB@sgi.com> Date: Thu, 27 Apr 2000 14:33:30 -0700 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: Brian Hall CC: lkcd@oss.sgi.com, "Matt D.Robinson" Subject: Re: Alpha lcrash initialization problem - can't access memory References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Brian Hall wrote: > > I haven't changed anything in main(). After the command options are parsed out, > around main.c:198: (dies in register_cmds() ) > > init_liballoc(0, 0, 0); > kl_init_kern_info(); > register_cmds(cmdset); > arch_init(ofp); > > Are you saying that init_liballoc() needs different arguments now? 
I followed
> the call sequence down for init_liballoc, and it appeared that values other
> than zero were assigned along the way. Changing to init_liballoc(100,100,100)
> had no effect (same traceback on the segfault). Upping that to 1000 didn't help.

The parameters to init_liballoc() are OK. Based on this, I would guess that some memory is getting stomped on in or below the kl_init_kern_info() function call. You might check the block of memory causing the SEGV after returning from the init_liballoc() call and before the kl_init_kern_info() call. See if it looks OK at that point (I would guess the contents of this memory have changed by the time you get to register_cmds()). If that's the case, then walk through the kl_init_kern_info() function and see where the memory contents change.

From looking at the kl_init_kern_info() function, I can't see where the problem might occur (it basically just does symbol lookups and reads the contents of memory into some local variables). Since the Alpha is 64-bit, I assume that the amount of memory being read in for these values is 8 bytes instead of 4 (and that the local variables, NUM_PHYSPAGES and MEM_MAP, have been changed also). Little things like that might be a factor.

Anyway, that's how I would approach narrowing it down.

Tom

From owner-lkcd@oss.sgi.com Fri Apr 28 07:02:24 2000 Received: by oss.sgi.com id ; Fri, 28 Apr 2000 07:02:13 -0700 Received: from smtp1.cern.ch ([137.138.128.38]:2308 "EHLO smtp1.cern.ch") by oss.sgi.com with ESMTP id ; Fri, 28 Apr 2000 07:01:51 -0700 Received: from asis-w2.cern.ch (asis-w2.cern.ch [137.138.33.50]) by smtp1.cern.ch (8.9.3/8.9.3) with ESMTP id QAA25009 for ; Fri, 28 Apr 2000 16:01:43 +0200 (MET DST) Received: (from iven@localhost) by asis-w2.cern.ch (8.9.3/8.9.3) id QAA01904; Fri, 28 Apr 2000 16:01:41 +0200 X-Authentication-Warning: asis-w2.cern.ch: iven set sender to jan.iven@cern.ch using -f To: lkcd@oss.sgi.com Subject: Re: lkcd status?
References: <3907172C.3C7CC593@sgi.com> From: Jan IVEN In-Reply-To: Tom Morano's message of "Wed, 26 Apr 2000 09:19:56 -0700" Date: 28 Apr 2000 16:01:38 +0200 Message-ID: Lines: 16 User-Agent: Gnus/5.0803 (Gnus v5.8.3) Emacs/20.6 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing >>>>> "Tom" == Tom Morano writes: ... Tom> porting LKCD to 2.3.x. I know that from an lcrash point of view, Tom> you can grab the latest source (for 2.2 and 2.3) from Tom> SourceForge (project lkcd). With the 2.3 version of lcrash, you Tom> can run lcrash against a live system and can now generate a Tom> compressed dump from live system memory -- without crashing the Tom> system (I'm going to back port this feature to 2.2). Matt can I'd certainly be interested to give this feature a spin on 2.2. Could you provide some details as to when it will be available? Besides, is there new information on the raw io issue for IDE? Regards Jan From owner-lkcd@oss.sgi.com Fri Apr 28 07:32:34 2000 Received: by oss.sgi.com id ; Fri, 28 Apr 2000 07:32:25 -0700 Received: from mailserv.nbnet.nb.ca ([198.164.200.18]:44165 "EHLO quartz.nbnet.nb.ca") by oss.sgi.com with ESMTP id ; Fri, 28 Apr 2000 07:32:05 -0700 Received: from Lxxxx.nbtel.nb.ca ([142.166.194.109]) by quartz.nbnet.nb.ca (Post.Office MTA v3.5.3 release 223 ID# 0-66826U105000L105000S0V35) with ESMTP id ca for ; Fri, 28 Apr 2000 11:32:02 -0300 From: "Marco Shaw" To: Subject: How to crash system? Date: Fri, 28 Apr 2000 11:30:49 -0300 X-MSMail-Priority: Normal X-Priority: 3 X-Mailer: Microsoft Internet Mail 4.70.1155 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Message-ID: <20000428143202.AAA22395@quartz.nbnet.nb.ca@Lxxxx.nbtel.nb.ca> Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I want to run through some crash scenarios during an upcoming presentation. 
So I need to crash my system and produce a good dump. How could I produce such results with the 2.2.13 SGI-enhanced kernel? Thanks, Marco From owner-lkcd@oss.sgi.com Fri Apr 28 08:11:53 2000 Received: by oss.sgi.com id ; Fri, 28 Apr 2000 08:11:44 -0700 Received: from mail.turbolinux.com ([38.170.88.25]:48138 "EHLO mail.turbolinux.com") by oss.sgi.com with ESMTP id ; Fri, 28 Apr 2000 08:11:29 -0700 Received: from localhost (yakker@localhost) by mail.turbolinux.com (8.9.3/8.9.3) with ESMTP id IAA16391; Fri, 28 Apr 2000 08:09:27 -0700 Date: Fri, 28 Apr 2000 08:09:27 -0700 (PDT) From: "Matt D. Robinson" To: Marco Shaw cc: lkcd@oss.sgi.com Subject: Re: How to crash system? In-Reply-To: <20000428143202.AAA22395@quartz.nbnet.nb.ca@Lxxxx.nbtel.nb.ca> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Hey, Marco. The FAQ points to a couple of ways to generate system crash dumps. You can always put in hooks into the kernel, or you can write some bad code and just have it called in normal kernel execution. :) The FAQ is at: http://oss.sgi.com/projects/lkcd/faq.html Let me know if you need more specifics. --Matt On Fri, 28 Apr 2000, Marco Shaw wrote: |>I want to run through some crash scenarios during an upcoming presentation. |> So I need to crash my system and produce a good dump. |> |>How could I produce such results with the 2.2.13 SGI-enhanced kernel? |> |>Thanks, |>Marco |> From owner-lkcd@oss.sgi.com Fri Apr 28 08:18:34 2000 Received: by oss.sgi.com id ; Fri, 28 Apr 2000 08:18:24 -0700 Received: from mail.turbolinux.com ([38.170.88.25]:57866 "EHLO mail.turbolinux.com") by oss.sgi.com with ESMTP id ; Fri, 28 Apr 2000 08:18:08 -0700 Received: from localhost (yakker@localhost) by mail.turbolinux.com (8.9.3/8.9.3) with ESMTP id IAA16580; Fri, 28 Apr 2000 08:16:08 -0700 Date: Fri, 28 Apr 2000 08:16:08 -0700 (PDT) From: "Matt D. 
Robinson" To: Jan IVEN cc: lkcd@oss.sgi.com Subject: Re: lkcd status? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On 28 Apr 2000, Jan IVEN wrote: |>>>>>> "Tom" == Tom Morano writes: |>... |> Tom> porting LKCD to 2.3.x. I know that from an lcrash point of view, |> Tom> you can grab the latest source (for 2.2 and 2.3) from |> Tom> SourceForge (project lkcd). With the 2.3 version of lcrash, you |> Tom> can run lcrash against a live system and can now generate a |> Tom> compressed dump from live system memory -- without crashing the |> Tom> system (I'm going to back port this feature to 2.2). Matt can |> |>I'd certainly be interested to give this feature a spin on 2.2. Could |>you provide some details as to when it will be available? |> |>Besides, is there new information on the raw io issue for IDE? The 2.3.X version of LKCD will work with IDE disks. The map_kernel_kiobuf() mechanisms are now available to use, which means we can map kernel pages to kiobufs and use the brw_kiovec() function. --Matt |>Regards |>Jan |> From owner-lkcd@oss.sgi.com Fri Apr 28 09:08:14 2000 Received: by oss.sgi.com id ; Fri, 28 Apr 2000 09:08:04 -0700 Received: from smtp1.cern.ch ([137.138.128.38]:62477 "EHLO smtp1.cern.ch") by oss.sgi.com with ESMTP id ; Fri, 28 Apr 2000 09:07:48 -0700 Received: from asis-w2.cern.ch (asis-w2.cern.ch [137.138.33.50]) by smtp1.cern.ch (8.9.3/8.9.3) with ESMTP id SAA04563; Fri, 28 Apr 2000 18:07:41 +0200 (MET DST) Received: (from iven@localhost) by asis-w2.cern.ch (8.9.3/8.9.3) id SAA05752; Fri, 28 Apr 2000 18:07:39 +0200 X-Authentication-Warning: asis-w2.cern.ch: iven set sender to jan.iven@cern.ch using -f To: "Marco Shaw" Cc: Subject: Re: How to crash system? 
References: <20000428143202.AAA22395@quartz.nbnet.nb.ca@Lxxxx.nbtel.nb.ca> From: Jan IVEN In-Reply-To: "Marco Shaw"'s message of "Fri, 28 Apr 2000 11:30:49 -0300" Date: 28 Apr 2000 18:07:39 +0200 Message-ID: Lines: 58 User-Agent: Gnus/5.0803 (Gnus v5.8.3) Emacs/20.6 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

>>>>> "Marco" == Marco Shaw writes:

Marco> I want to run through some crash scenarios during an upcoming
Marco> presentation. So I need to crash my system and produce a good
Marco> dump.

Marco> How could I produce such results with the 2.2.13 SGI-enhanced
Marco> kernel?

No idea about the SGI-enhanced kernel. Besides the things in the LKCD FAQ, I tried these (but not on LKCD, since it doesn't support IDE on 2.2 kernels):

+ a magic SysRq for crash: doesn't work reliably, since it is called inside the keyboard interrupt handler -- this leads to the interrupt handler being killed most of the time, and nothing will be written to disk afterwards. Might work with the "in memory" dump facilities of mclinux; I didn't try these. The mclinux crash dump patch already contains this sysrq.

+ a dummy module that calls panic: works for me. Just compile and insmod.

Please share your experiences.

Regards
Jan

/* crash.c
 * adapted from hello.c ("Hello, world" module) by Ori Pomerantz
 */

/* The necessary header files */

/* Standard in kernel modules */
#include <linux/kernel.h>   /* We're doing kernel work */
#include <linux/module.h>   /* Specifically, a module */

/* Deal with CONFIG_MODVERSIONS */
#if CONFIG_MODVERSIONS==1
#define MODVERSIONS
#include <linux/modversions.h>
#endif

/* Initialize the module */
int init_module()
{
  printk("Goodbye, World: calling panic()\n");
  panic("poisoned module");

  /* shouldn't reach this point */
  return 0;
}

/* Cleanup - undo whatever init_module did */
void cleanup_module()
{
  printk("Short is the life of a kernel module\n");
}
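
On Tom Morano's point earlier in this archive about the Alpha being 64-bit: the following is a small user-space sketch, not lcrash code, of what happens when only 4 bytes of an 8-byte kernel value are read. The helper names read_le_word and store_le64 are hypothetical and exist only for this illustration; the value 0x2000071f000 is the p->addr shown in Brian's gdb session. The thread does not establish that this is what went wrong in the port, so treat it as one possibility to rule out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of lcrash): assemble a little-endian
 * integer of `width` bytes from a raw memory image. Alpha is
 * little-endian, so this mimics reading a kernel variable from a dump. */
uint64_t read_le_word(const unsigned char *mem, size_t width)
{
    uint64_t value = 0;
    size_t i;

    for (i = 0; i < width && i < sizeof(value); i++)
        value |= (uint64_t)mem[i] << (8 * i);
    return value;
}

/* Hypothetical helper: store a 64-bit value into a buffer in
 * little-endian byte order, standing in for the dumped memory. */
void store_le64(unsigned char *mem, uint64_t value)
{
    size_t i;

    for (i = 0; i < 8; i++)
        mem[i] = (unsigned char)(value >> (8 * i));
}
```

With 0x2000071f000 stored in an 8-byte buffer, read_le_word(buf, 8) gives back the full address, while read_le_word(buf, 4) gives 0x0071f000: the upper half is silently lost, and the result no longer refers to the same memory.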