From owner-lkcd@oss.sgi.com Fri Nov 3 13:46:51 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 13:46:30 -0800 Received: from msgbas1x.cos.agilent.com ([192.6.9.33]:36292 "HELO msgbas1.cos.agilent.com") by oss.sgi.com with SMTP id ; Fri, 3 Nov 2000 13:46:20 -0800 Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77]) by msgbas1.cos.agilent.com (Postfix) with ESMTP id 7182E2D5 for ; Fri, 3 Nov 2000 14:46:19 -0700 (MST) Received: from axcsbh4.cos.agilent.com (axcsbh4.cos.agilent.com [130.29.152.145]) by msgrel1.cos.agilent.com (Postfix) with SMTP id C959021 for ; Fri, 3 Nov 2000 14:46:17 -0700 (MST) Received: from 130.29.152.145 by axcsbh4.cos.agilent.com (InterScan E-Mail VirusWall NT); Fri, 03 Nov 2000 14:46:16 -0700 (Mountain Standard Time) Received: by axcsbh4.cos.agilent.com with Internet Mail Service (5.5.2650.21) id ; Fri, 3 Nov 2000 14:46:16 -0700 Message-ID: From: hiren_mehta@agilent.com To: lkcd@oss.sgi.com Subject: how to make kernel do system dump ? Date: Fri, 3 Nov 2000 14:46:15 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="ISO-8859-1" Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

Hi All,

I applied the patches for system crash dump on my Linux system. Now, I want to make sure that the dump is working properly. Does anybody know how to initiate a crash dump on Linux?
Thanks
-hiren

From owner-lkcd@oss.sgi.com Fri Nov 3 14:11:50 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 14:11:30 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:61220 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Fri, 3 Nov 2000 14:11:14 -0800 Received: from nodin.corp.sgi.com (fddi-nodin.corp.sgi.com [198.29.75.193]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id OAA06585 for ; Fri, 3 Nov 2000 14:03:25 -0800 (PST) mail_from (tjm@sgi.com) Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by nodin.corp.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) via ESMTP id OAA24313 for ; Fri, 3 Nov 2000 14:09:28 -0800 (PST) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id OAA10169; Fri, 3 Nov 2000 14:08:12 -0800 (PST) Message-ID: <3A03374B.9CC7ED44@sgi.com> Date: Fri, 03 Nov 2000 14:08:11 -0800 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: hiren_mehta@agilent.com CC: lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

hiren_mehta@agilent.com wrote:
>
> Hi All,
>
> I applied the patches for system crash dump on my linux system.
> Now, I want to make sure that the dump is working properly.
> does anybody know how to initiate the crash dump on LINUX ?
>
> THanks
> -hiren

Information on how to modify your kernel so that you can force a crash from a user application can be found in the following FAQ:

http://oss.sgi.com/projects/lkcd/faq.html

It contains an example of how to modify the sys_setpriority() function so that it causes either a panic or a SEGV trap. There is also a sample program that calls setpriority() to initiate the dump.
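The user-space side of that approach can be sketched roughly as follows. This is a minimal illustration, not the FAQ's sample program: the trigger value LKCD_TEST_WHICH and the function name request_test_dump() are hypothetical, and the sketch assumes the patched sys_setpriority() panics when it sees that otherwise invalid `which` argument. On an unpatched kernel the call just fails with EINVAL and nothing is dumped.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical trigger value: assumed to be the `which` argument that a
 * patched sys_setpriority() treats as "panic now". It is not taken from
 * the FAQ and is deliberately outside the valid PRIO_* range. */
#define LKCD_TEST_WHICH 0x7fff

/* Ask the kernel for a test dump through setpriority().
 * This returns only if the kernel did NOT panic:
 *   0      the call unexpectedly succeeded
 *   errno  the call was rejected (EINVAL on an unpatched kernel) */
int request_test_dump(void)
{
    errno = 0;
    if (setpriority(LKCD_TEST_WHICH, 0, 0) == -1)
        return errno;
    return 0;
}
```

Calling request_test_dump() from a small main() on the patched kernel would then start the dump; getting a return value at all means the kernel was not patched, or expects a different trigger.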
Tom From owner-lkcd@oss.sgi.com Fri Nov 3 14:29:50 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 14:29:40 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:58886 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Fri, 3 Nov 2000 14:29:17 -0800 Received: from [10.1.1.197] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id yzdkaaaa for ; Fri, 3 Nov 2000 14:26:43 -0800 Message-ID: <3A02DB70.AD886058@alacritech.com> Date: Fri, 03 Nov 2000 07:36:16 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: hiren_mehta@agilent.com CC: lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing hiren_mehta@agilent.com wrote: > > Hi All, > > I applied the patches for system crash dump on my linux system. > Now, I want to make sure that the dump is working properly. > does anybody know how to initiate the crash dump on LINUX ? > > THanks > -hiren Tom has given the answer to this, but I wanted to ask: Which kernel are you using? Which patch version are you using (1.1BETA? That's the best one to use for 2.2 stuff). 
--Matt From owner-lkcd@oss.sgi.com Fri Nov 3 14:32:31 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 14:32:20 -0800 Received: from msgbas1x.cos.agilent.com ([192.6.9.33]:57036 "HELO msgbas1.cos.agilent.com") by oss.sgi.com with SMTP id ; Fri, 3 Nov 2000 14:32:15 -0800 Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77]) by msgbas1.cos.agilent.com (Postfix) with ESMTP id 68FCB3E6; Fri, 3 Nov 2000 15:32:14 -0700 (MST) Received: from axcsbh4.cos.agilent.com (axcsbh4.cos.agilent.com [130.29.152.145]) by msgrel1.cos.agilent.com (Postfix) with SMTP id CEAF71F; Fri, 3 Nov 2000 15:32:13 -0700 (MST) Received: from 130.29.152.145 by axcsbh4.cos.agilent.com (InterScan E-Mail VirusWall NT); Fri, 03 Nov 2000 15:32:13 -0700 (Mountain Standard Time) Received: by axcsbh4.cos.agilent.com with Internet Mail Service (5.5.2650.21) id ; Fri, 3 Nov 2000 15:32:13 -0700 Message-ID: From: hiren_mehta@agilent.com To: yakker@alacritech.com Cc: lkcd@oss.sgi.com Subject: RE: how to make kernel do system dump ? Date: Fri, 3 Nov 2000 15:32:12 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="ISO-8859-1" Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I am using 2.2.16 kernel and lkcd_2_2_13-1_0_4_patch. Where can I find 1.1 beta patch ? -hiren > -----Original Message----- > From: Matt D. Robinson [mailto:yakker@alacritech.com] > Sent: Friday, November 03, 2000 7:36 AM > To: hiren_mehta@agilent.com > Cc: lkcd@oss.sgi.com > Subject: Re: how to make kernel do system dump ? > > > hiren_mehta@agilent.com wrote: > > > > Hi All, > > > > I applied the patches for system crash dump on my linux system. > > Now, I want to make sure that the dump is working properly. > > does anybody know how to initiate the crash dump on LINUX ? > > > > THanks > > -hiren > > Tom has given the answer to this, but I wanted to ask: > > Which kernel are you using? 
> Which patch version are you using (1.1BETA? That's the best one > to use for 2.2 stuff). > > --Matt > From owner-lkcd@oss.sgi.com Fri Nov 3 14:54:41 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 14:54:21 -0800 Received: from Cantor.suse.de ([194.112.123.193]:31502 "HELO Cantor.suse.de") by oss.sgi.com with SMTP id ; Fri, 3 Nov 2000 14:53:51 -0800 Received: from Hermes.suse.de (Hermes.suse.de [194.112.123.136]) by Cantor.suse.de (Postfix) with ESMTP id 261E61E2A3; Fri, 3 Nov 2000 23:53:48 +0100 (MET) Received: from gruyere.muc.suse.de (unknown [10.23.1.2]) by Hermes.suse.de (Postfix) with ESMTP id E3A983E47D; Fri, 3 Nov 2000 23:53:47 +0100 (MET) Received: by gruyere.muc.suse.de (Postfix, from userid 14446) id D7E292F300; Fri, 3 Nov 2000 23:53:46 +0100 (MET) Date: Fri, 3 Nov 2000 23:53:46 +0100 From: Andi Kleen To: Tom Morano Cc: hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? Message-ID: <20001103235346.A5125@gruyere.muc.suse.de> References: <3A03374B.9CC7ED44@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3A03374B.9CC7ED44@sgi.com>; from tjm@sgi.com on Fri, Nov 03, 2000 at 02:08:11PM -0800 Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Fri, Nov 03, 2000 at 02:08:11PM -0800, Tom Morano wrote: > It contains an example of how you to modify the sys_setpriority() > function so that, it causes either a panic or a SEGV trap. There is > also a sample program that calls setpriority() to initiate the dump. That sounds rather complicated. Why don't you just load a small kernel module for it ? 
#include <linux/module.h>

/* panic() does not return, so no return statement is needed here. */
int init_module(void)
{
	panic("Dump");
}

gcc -O2 -DMODULE -c module.c
insmod module.o

-Andi

From owner-lkcd@oss.sgi.com Fri Nov 3 15:04:10 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 15:03:51 -0800 Received: from thor.fsc-usa.com ([63.109.18.10]:11014 "EHLO thor.fsc-usa.com") by oss.sgi.com with ESMTP id ; Fri, 3 Nov 2000 15:03:35 -0800 Received: from one.fsc-usa.com (one-internal.eng.pyramid.com [172.25.216.10]) by thor.fsc-usa.com (8.10.1/18SEP00--Fujitsu-Siemens-of-America-Gateway) id eA403X601383 From: To: <>; Fri, 3 Nov 2000 16:03:33 -0800 Received: from tomb.fsc-usa.com by one.fsc-usa.com (8.8.5/FSC_USA_Internal_Configuration) id PAA25312; Fri, 3 Nov 2000 15:03:25 -0800 (PST) Received: by tomb.fsc-usa.com (8.8.5/Pyramid_Internal_Configuration) id PAA07244; Fri, 3 Nov 2000 15:03:00 -0800 (PST) From: hkannan@fsc-usa.com (Hari Kannan) Message-Id: <200011032303.PAA07244@tomb.fsc-usa.com> Subject: Re: how to make kernel do system dump ? To: ak@suse.de (Andi Kleen) Date: Fri, 3 Nov 2000 15:03:00 -0800 (PST) Cc: tjm@sgi.com (Tom Morano), hiren_mehta@agilent.com, lkcd@oss.sgi.com In-Reply-To: <20001103235346.A5125@gruyere.muc.suse.de> from "Andi Kleen" at Nov 03, 2000 11:53:46 PM X-Mailer: ELM [version 2.5 PL2] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

Isn't it true that lkcd cannot show stack backtraces for loadable modules?

Hari

>
> On Fri, Nov 03, 2000 at 02:08:11PM -0800, Tom Morano wrote:
> > It contains an example of how you to modify the sys_setpriority()
> > function so that, it causes either a panic or a SEGV trap. There is
> > also a sample program that calls setpriority() to initiate the dump.
>
> That sounds rather complicated. Why don't you just load a small kernel
> module for it ?
> > #include > int init_module(void) > { > panic("Dump"); > } > > > gcc -O2 -DMODULE -c module.c > insmod module.o > > > > > -Andi > From owner-lkcd@oss.sgi.com Fri Nov 3 15:26:41 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 15:26:31 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:24127 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Fri, 3 Nov 2000 15:26:12 -0800 Received: from nodin.corp.sgi.com (fddi-nodin.corp.sgi.com [198.29.75.193]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id PAA19669 for ; Fri, 3 Nov 2000 15:18:23 -0800 (PST) mail_from (tjm@sgi.com) Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by nodin.corp.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) via ESMTP id PAA48838 for ; Fri, 3 Nov 2000 15:24:26 -0800 (PST) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id PAA10363; Fri, 3 Nov 2000 15:23:10 -0800 (PST) Message-ID: <3A0348DE.2D026ADE@sgi.com> Date: Fri, 03 Nov 2000 15:23:10 -0800 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: Andi Kleen CC: hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: <3A03374B.9CC7ED44@sgi.com> <20001103235346.A5125@gruyere.muc.suse.de> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing You could of course do that. The code included in the FAQ is an example of how you COULD force a dump. I know from my own experience, that a module approach may not always work. I'm currently building ia64 kernels on one system and then booting them on another (without module support configured). Thanks for this alternative suggestion though. 
Tom Andi Kleen wrote: > > On Fri, Nov 03, 2000 at 02:08:11PM -0800, Tom Morano wrote: > > It contains an example of how you to modify the sys_setpriority() > > function so that, it causes either a panic or a SEGV trap. There is > > also a sample program that calls setpriority() to initiate the dump. > > That sounds rather complicated. Why don't you just load a small kernel > module for it ? > > #include > int init_module(void) > { > panic("Dump"); > } > > gcc -O2 -DMODULE -c module.c > insmod module.o > > -Andi From owner-lkcd@oss.sgi.com Fri Nov 3 16:31:14 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 16:30:54 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:45316 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Fri, 3 Nov 2000 16:30:40 -0800 Received: from [10.1.1.197] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id wfekaaaa for ; Fri, 3 Nov 2000 16:28:08 -0800 Message-ID: <3A02F7E4.5BEF806D@alacritech.com> Date: Fri, 03 Nov 2000 09:37:40 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: Hari Kannan CC: Andi Kleen , Tom Morano , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: <200011032303.PAA07244@tomb.fsc-usa.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing For now, this is correct. Hopefully, this will change now that we're moving to a new model of development -- we're removing most of the dependencies on building 'lcrash' with the right header files and moving to something more dynamic from a single object header build. Check out the latest stuff in the 2.4 tree. BTW, Andi, did 2.4 change the scheduler now so that you don't have to get tasklist_lock to avoid having jobs scheduled underneath you (in a panic()/interrupt state)? 
--Matt Hari Kannan wrote: > > Isnt it true that lkcd cannot show stack backtraces for loadable modules? > > Hari > > > > > On Fri, Nov 03, 2000 at 02:08:11PM -0800, Tom Morano wrote: > > > It contains an example of how you to modify the sys_setpriority() > > > function so that, it causes either a panic or a SEGV trap. There is > > > also a sample program that calls setpriority() to initiate the dump. > > > > That sounds rather complicated. Why don't you just load a small kernel > > module for it ? > > > > #include > > int init_module(void) > > { > > panic("Dump"); > > } > > > > > > gcc -O2 -DMODULE -c module.c > > insmod module.o > > > > > > > > > > -Andi > > From owner-lkcd@oss.sgi.com Fri Nov 3 19:53:55 2000 Received: by oss.sgi.com id ; Fri, 3 Nov 2000 19:53:35 -0800 Received: from pallas.veritas.com ([204.177.156.25]:12238 "EHLO pallas.veritas.com") by oss.sgi.com with ESMTP id ; Fri, 3 Nov 2000 19:53:05 -0800 Received: from megami.veritas.com (megami.veritas.com [192.203.46.101]) by pallas.veritas.com (8.9.1a/8.9.1) with SMTP id TAA18530; Fri, 3 Nov 2000 19:53:44 -0800 (PST) Received: from muppetlabs.com([172.22.5.154]) (1701 bytes) by megami.veritas.com via sendmail with P:esmtp/R:smart_host/T:smtp (sender: ) id for ; Fri, 3 Nov 2000 19:52:59 -0800 (PST) (Smail-3.2.0.101 1997-Dec-17 #4 built 1999-Aug-24) Message-ID: <3A0387D4.EB03D359@muppetlabs.com> Date: Fri, 03 Nov 2000 19:51:48 -0800 From: Amit D Chaudhary X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.17 i686) X-Accept-Language: en MIME-Version: 1.0 To: hiren_mehta@agilent.com CC: lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing ftp://oss.sgi.com/projects/lkcd/download/1.1BETA/ And ftp://oss.sgi.com/projects/lkcd/download/ is the main download directory. 
Amit hiren_mehta@agilent.com wrote: > > I am using 2.2.16 kernel and lkcd_2_2_13-1_0_4_patch. > > Where can I find 1.1 beta patch ? > > -hiren > > > -----Original Message----- > > From: Matt D. Robinson [mailto:yakker@alacritech.com] > > Sent: Friday, November 03, 2000 7:36 AM > > To: hiren_mehta@agilent.com > > Cc: lkcd@oss.sgi.com > > Subject: Re: how to make kernel do system dump ? > > > > > > hiren_mehta@agilent.com wrote: > > > > > > Hi All, > > > > > > I applied the patches for system crash dump on my linux system. > > > Now, I want to make sure that the dump is working properly. > > > does anybody know how to initiate the crash dump on LINUX ? > > > > > > THanks > > > -hiren > > > > Tom has given the answer to this, but I wanted to ask: > > > > Which kernel are you using? > > Which patch version are you using (1.1BETA? That's the best one > > to use for 2.2 stuff). > > > > --Matt > > From owner-lkcd@oss.sgi.com Sat Nov 4 00:47:06 2000 Received: by oss.sgi.com id ; Sat, 4 Nov 2000 00:46:46 -0800 Received: from Cantor.suse.de ([194.112.123.193]:31250 "HELO Cantor.suse.de") by oss.sgi.com with SMTP id ; Sat, 4 Nov 2000 00:46:26 -0800 Received: from Hermes.suse.de (Hermes.suse.de [194.112.123.136]) by Cantor.suse.de (Postfix) with ESMTP id 3F6641E094; Sat, 4 Nov 2000 09:46:24 +0100 (MET) Received: from gruyere.muc.suse.de (unknown [10.23.1.2]) by Hermes.suse.de (Postfix) with ESMTP id 82E193E476; Sat, 4 Nov 2000 09:46:23 +0100 (MET) Received: by gruyere.muc.suse.de (Postfix, from userid 14446) id 0E97A2F300; Sat, 4 Nov 2000 09:46:22 +0100 (MET) Date: Sat, 4 Nov 2000 09:46:22 +0100 From: Andi Kleen To: "Matt D. Robinson" Cc: Hari Kannan , Andi Kleen , Tom Morano , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? 
Message-ID: <20001104094622.A10698@gruyere.muc.suse.de> References: <200011032303.PAA07244@tomb.fsc-usa.com> <3A02F7E4.5BEF806D@alacritech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3A02F7E4.5BEF806D@alacritech.com>; from yakker@alacritech.com on Fri, Nov 03, 2000 at 09:37:40AM -0800 Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Fri, Nov 03, 2000 at 09:37:40AM -0800, Matt D. Robinson wrote: > BTW, Andi, did 2.4 change the scheduler now so that you don't have > to get tasklist_lock to avoid having jobs scheduled underneath you > (in a panic()/interrupt state)? The current CPU cannot be rescheduled because panic is hanging in kernel mode. The panic also sends a stop IPI to the other CPUs, but until the IPI is processed there may be some scheduling. The IPI send function smp_call_function was also fixed to never schedule (it previously took a semaphore which sometimes could lead to the panic thread calling schedule) So it should be ok now. -Andi From owner-lkcd@oss.sgi.com Sat Nov 4 22:53:20 2000 Received: by oss.sgi.com id ; Sat, 4 Nov 2000 22:53:10 -0800 Received: from agni.wipinfo.soft.net ([164.164.6.20]:5521 "EHLO agni.wipinfo.soft.net") by oss.sgi.com with ESMTP id ; Sat, 4 Nov 2000 22:52:57 -0800 Received: from vayu.wipinfo.soft.net (vayu [192.168.200.170]) by agni.wipinfo.soft.net (8.9.3/8.9.3) with ESMTP id MAA12735 for ; Sun, 5 Nov 2000 12:17:07 +0500 (GMT) Received: from platinum.mail.wipro.com ([192.168.223.18]) by vayu.wipinfo.soft.net (8.9.3/8.9.3) with ESMTP id MAA04911 for ; Sun, 5 Nov 2000 12:20:29 +0500 (GMT) Received: from wipro.com ([192.168.205.16]) by platinum.mail.wipro.com (Netscape Messaging Server 3.6) with ESMTP id AAA3A04; Sun, 5 Nov 2000 12:22:14 +0530 Message-ID: <3A050401.E8170339@wipro.com> Date: Sun, 05 Nov 2000 12:23:53 +0530 From: Thiruvengada Govindan Organization: Wipro Ltd. 
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-12 i686) X-Accept-Language: en MIME-Version: 1.0 To: Tom Morano CC: Andi Kleen , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: <3A03374B.9CC7ED44@sgi.com> <20001103235346.A5125@gruyere.muc.suse.de> <3A0348DE.2D026ADE@sgi.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

Hi,

A possibly useful feature would be for the Linux KDB to support a command to initiate a system dump (I haven't looked at the latest version, so please excuse me if this is already supported). Often, when a system appears hung, doing an insmod is just not feasible. Rather than tweaking the kernel to panic and thereby forcing the system to dump, it may be better for KDB itself to support, say, a "sysdump" command that would take a dump of the system image before reboot.

Also, how about having a "live dump" feature for Linux? If testing lkcd is all that you want to do, why go to the trouble of panicking and dumping the system? Why not just do a live dump of the system to the dump device and use the saved dump?

Govindan

Tom Morano wrote:
>
> You could of course do that. The code included in the FAQ is an
> example of how you COULD force a dump. I know from my own experience,
> that a module approach may not always work. I'm currently building
> ia64 kernels on one system and then booting them on another (without
> module support configured). Thanks for this alternative suggestion
> though.
>
> Tom
>
> Andi Kleen wrote:
> >
> > On Fri, Nov 03, 2000 at 02:08:11PM -0800, Tom Morano wrote:
> > > It contains an example of how you to modify the sys_setpriority()
> > > function so that, it causes either a panic or a SEGV trap. There is
> > > also a sample program that calls setpriority() to initiate the dump.
> >
> > That sounds rather complicated.
Why don't you just load a small kernel > > module for it ? > > > > #include > > int init_module(void) > > { > > panic("Dump"); > > } > > > > gcc -O2 -DMODULE -c module.c > > insmod module.o > > > > -Andi From owner-lkcd@oss.sgi.com Sat Nov 4 23:18:19 2000 Received: by oss.sgi.com id ; Sat, 4 Nov 2000 23:18:10 -0800 Received: from ppp0.ocs.com.au ([203.34.97.3]:56068 "HELO mail.ocs.com.au") by oss.sgi.com with SMTP id ; Sat, 4 Nov 2000 23:17:51 -0800 Received: (qmail 24271 invoked from network); 5 Nov 2000 07:17:37 -0000 Received: from ocs3.ocs-net (192.168.255.3) by mail.ocs.com.au with SMTP; 5 Nov 2000 07:17:37 -0000 X-Mailer: exmh version 2.1.1 10/15/1999 From: Keith Owens To: Thiruvengada Govindan cc: lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? In-reply-to: Your message of "Sun, 05 Nov 2000 12:23:53 +0530." <3A050401.E8170339@wipro.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Sun, 05 Nov 2000 18:17:37 +1100 Message-ID: <10290.973408657@ocs3.ocs-net> Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Sun, 05 Nov 2000 12:23:53 +0530, Thiruvengada Govindan wrote: > A possibly useful feature would be for the Linux KDB to support a >command to initiate a system dump (i haven't looked at the latest >version though so please excuse if this is already supported). A lot of >times when a system appears hung doing an insmod would just not be >feasible and rather than tweak the kernel to panic and thereby force the >system to dump it may just be better for KDB itself to support say a >"sysdump" command that would take a dump of the system image before >reboot. kdb forces every cpu into kdb state and stops the entire kernel dead in its tracks. This is deliberate, one of kdb's design criteria is to keep the kernel as static as possible while you debug it. That means no interrupts, not even the timer tick runs when kdb is in control. 
Without interrupts, you cannot do I/O to disk unless your I/O subsystem can run without any interrupt support. You cannot even sleep, all work must be busy wait. If lkcd can run without interrupts then kdb could call lkcd directly. Otherwise the only option is to exit kdb via 'go some_function_name'. 'go' will restart all the cpus and allow interrupts, it is then up to the function to invoke lkcd. The function must be defined as void some_function_name(void) and must never return, the stack at that point will not be in a fit state to return to anything. It is up to the lkcd group to define this interface function. From owner-lkcd@oss.sgi.com Sun Nov 5 11:10:03 2000 Received: by oss.sgi.com id ; Sun, 5 Nov 2000 11:09:53 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:37205 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 5 Nov 2000 11:09:34 -0800 Received: from nodin.corp.sgi.com (nodin.corp.sgi.com [192.26.51.193]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id LAA20917 for ; Sun, 5 Nov 2000 11:01:44 -0800 (PST) mail_from (tjm@sgi.com) Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by nodin.corp.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) via ESMTP id LAA72728 for ; Sun, 5 Nov 2000 11:09:03 -0800 (PST) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id LAA13513; Sun, 5 Nov 2000 11:05:46 -0800 (PST) Message-ID: <3A05AF8A.C168DE5@sgi.com> Date: Sun, 05 Nov 2000 11:05:46 -0800 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: Thiruvengada Govindan CC: Andi Kleen , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? 
References: <3A03374B.9CC7ED44@sgi.com> <20001103235346.A5125@gruyere.muc.suse.de> <3A0348DE.2D026ADE@sgi.com> <3A050401.E8170339@wipro.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Thiruvengada Govindan wrote: > > Hi, > A possibly useful feature would be for the Linux KDB to support a > command to initiate a system dump (i haven't looked at the latest > version though so please excuse if this is already supported). A lot of > times when a system appears hung doing an insmod would just not be > feasible and rather than tweak the kernel to panic and thereby force the > system to dump it may just be better for KDB itself to support say a > "sysdump" command that would take a dump of the system image before > reboot. > > Also how about having a "live dump" feature for Linux ?? If testing lkcd > is all that you want to do then why go all over panic'ing and dumping > the system why not just do a live dump of the system to the dump device > and use the saved dump. There is a 'livedump' command in lcrash. It runs through all memory pages and dumps them in the same format that the system uses (compressed). Of course this won't test the kernel mechanisms for dumping core. From a kernel point of view, the real challenge is having things work when there HAS been a panic. That is when you would most expect normal behavior to have problems. Having a live dump facility in the kernel would not address this level of testing. BTW, the only thing you can't do with the livedump from lcrash is analyze the running task (lcrash itself), obviously. 
Thanks,
Tom

From owner-lkcd@oss.sgi.com Mon Nov 6 02:35:48 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 02:35:38 -0800 Received: from d06lmsgate-3.uk.ibm.com ([195.212.29.3]:21466 "EHLO d06lmsgate-3.uk.ibm.com") by oss.sgi.com with ESMTP id ; Mon, 6 Nov 2000 02:35:21 -0800 Received: from d06relay02.portsmouth.uk.ibm.com (d06relay02.portsmouth.uk.ibm.com [9.166.84.148]) by d06lmsgate-3.uk.ibm.com (1.0.0) with ESMTP id KAA180272; Mon, 6 Nov 2000 10:26:50 GMT From: richardj_moore@uk.ibm.com Received: from d06mta06.portsmouth.uk.ibm.com (d06mta06_cs0 [9.180.35.4]) by d06relay02.portsmouth.uk.ibm.com (8.8.8m3/NCO v4.95) with SMTP id KAA194362; Mon, 6 Nov 2000 10:35:02 GMT Received: by d06mta06.portsmouth.uk.ibm.com(Lotus SMTP MTA v4.6.5 (863.2 5-20-1999)) id 8025698F.003A235E ; Mon, 6 Nov 2000 10:35:01 +0000 X-Lotus-FromDomain: IBMGB To: hiren_mehta@agilent.com cc: lkcd@oss.sgi.com Message-ID: <8025698F.003A0E21.00@d06mta06.portsmouth.uk.ibm.com> Date: Mon, 6 Nov 2000 09:34:16 +0000 Subject: Re: how to make kernel do system dump ? Mime-Version: 1.0 Content-type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

You can also invoke a crash dump at will from any code location by installing DProbes and invoking the "exit 4" command.

Richard

Richard Moore - RAS Project Lead - Linux Technology Centre (PISC).
http://oss.software.ibm.com/developerworks/opensource/linux
Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
IBM UK Ltd, MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK

From owner-lkcd@oss.sgi.com Mon Nov 6 06:16:48 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 06:16:38 -0800 Received: from mail.missioncriticallinux.com ([208.51.139.18]:36615 "EHLO missioncriticallinux.com") by oss.sgi.com with ESMTP id ; Mon, 6 Nov 2000 06:16:13 -0800 Received: from mclinux.com (IDENT:anderson@anderson.lowell.mclinux.com [10.1.8.20]) by missioncriticallinux.com (8.9.3/8.9.3) with ESMTP id JAA09337 for ; Mon, 6 Nov 2000 09:16:06 -0500 Message-ID: <3A06C9B5.A265CD37@mclinux.com> Date: Mon, 06 Nov 2000 10:09:41 -0500 From: Dave Anderson Organization: Mission Critical Linux X-Mailer: Mozilla 4.74 [en] (X11; U; Linux 2.2.5-15smp2 i686) X-Accept-Language: en MIME-Version: 1.0 To: lkcd@oss.sgi.com Subject: Re: How to make a kernel do a system dump Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

For anything short of a hang situation, you can always force a panic from user space by:

(1) determining the address of a task's task_struct.pid,
(2) opening /dev/mem for writing,
(3) writing a zero into its pid location, and then having the targeted task exit.

This initiates an "Attempted to kill the idle task!" panic from do_exit(). I've made it a command option in our MCLX crash utility -- which modifies its own pid and then exits; LKCD lcrash could easily be tweaked to do the same.
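Steps (2) and (3) above might look like the sketch below. The helper name zero_pid_at() is hypothetical, and the offset is assumed to have been worked out already in step (1) and translated into whatever addressing the memory device expects (/dev/mem is indexed by physical address). Pointing it at /dev/mem with a live task's pid location is meant to bring the machine down, so try it only on a test system.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite the pid_t stored at `offset` in `mem_path` with zero.
 * `mem_path` would be "/dev/mem" for a real test; any seekable,
 * writable file works for a dry run.
 * Returns 0 on success, -1 on failure with errno left as set by
 * open(), lseek() or write(). */
int zero_pid_at(const char *mem_path, off_t offset)
{
    const pid_t zero = 0;
    int ok = 0;
    int saved_errno;
    int fd = open(mem_path, O_WRONLY);

    if (fd < 0)
        return -1;

    /* Seek to the pid field, then write sizeof(pid_t) zero bytes. */
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1 &&
        write(fd, &zero, sizeof zero) == (ssize_t)sizeof zero)
        ok = 1;

    /* Preserve the errno of the failing call across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ok ? 0 : -1;
}
```

After a successful write, the targeted task still has to exit before do_exit() notices the zero pid and panics.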
Dave Anderson Mission Critical Linux From owner-lkcd@oss.sgi.com Mon Nov 6 11:30:01 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 11:29:52 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:5388 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Mon, 6 Nov 2000 11:29:34 -0800 Received: from [10.1.1.194] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id vmgkaaaa for ; Mon, 6 Nov 2000 11:26:57 -0800 Message-ID: <3A06A5D2.D08F2597@alacritech.com> Date: Mon, 06 Nov 2000 04:36:34 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: Andi Kleen CC: Hari Kannan , Tom Morano , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: <200011032303.PAA07244@tomb.fsc-usa.com> <3A02F7E4.5BEF806D@alacritech.com> <20001104094622.A10698@gruyere.muc.suse.de> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Andi Kleen wrote: > > On Fri, Nov 03, 2000 at 09:37:40AM -0800, Matt D. Robinson wrote: > > BTW, Andi, did 2.4 change the scheduler now so that you don't have > > to get tasklist_lock to avoid having jobs scheduled underneath you > > (in a panic()/interrupt state)? > > The current CPU cannot be rescheduled because panic is hanging in kernel > mode. The panic also sends a stop IPI to the other CPUs, but until the > IPI is processed there may be some scheduling. The IPI send function > smp_call_function was also fixed to never schedule (it previously took > a semaphore which sometimes could lead to the panic thread calling schedule) > So it should be ok now. > > -Andi Cool. We figured it was broken behavior -- we'd get messed up stack pages for some dumps where scheduling took place. 
It was my original understanding that this wouldn't happen, but then, 2.2 has a number of broken issues. --Matt From owner-lkcd@oss.sgi.com Mon Nov 6 11:31:21 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 11:31:02 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:19980 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Mon, 6 Nov 2000 11:30:47 -0800 Received: from [10.1.1.194] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id angkaaaa for ; Mon, 6 Nov 2000 11:28:06 -0800 Message-ID: <3A06A618.9B7963@alacritech.com> Date: Mon, 06 Nov 2000 04:37:44 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: SWTRINITY@aol.com, lkcd@oss.sgi.com Subject: Re: Context switching disabled while dumping? References: <35.b6b4976.2718b1a3@aol.com> <39E7EB72.886ECBF2@alacritech.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing "Matt D. Robinson" wrote: > > SWTRINITY@aol.com wrote: > > > > I am using lkcd with a 2.2.16 kernel and > > I seem to be seeing a problem where context switching > > is occurring while the dump is ongoing. This causes > > processes to switch in and change the state. I saw this > > behavior with the automounter switching in > > and causing another kernel panic. Does > > context switching need to be disabled while > > the dump is ongoing and if so, how is this accomplished. > > > > Thanks > > Les. Just an FYI, Les, this is corrected in 2.2.16. I might be able to add in something to the 2.2.X version of LKCD to prevent scheduling by grabbing the tasklist_lock. Let me know if this is something you need.
--Matt From owner-lkcd@oss.sgi.com Mon Nov 6 11:44:11 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 11:44:01 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:59405 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Mon, 6 Nov 2000 11:43:42 -0800 Received: from [10.1.1.194] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id yngkaaaa for ; Mon, 6 Nov 2000 11:41:01 -0800 Message-ID: <3A06A91E.C40C4B5@alacritech.com> Date: Mon, 06 Nov 2000 04:50:38 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: SWTRINITY@aol.com, lkcd@oss.sgi.com Subject: Re: Context switching disabled while dumping? References: <35.b6b4976.2718b1a3@aol.com> <39E7EB72.886ECBF2@alacritech.com> <3A06A618.9B7963@alacritech.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing "Matt D. Robinson" wrote: > > "Matt D. Robinson" wrote: > > > > SWTRINITY@aol.com wrote: > > > > > > I am using lkcd with a 2.2.16 kernel and > > > I seem to be seeing a problem where context switching > > > is occurring while the dump is ongoing. This causes > > > processes to switch in and change the state. I saw this > > > behavior with the automounter switching in > > > and causing another kernel panic. Does > > > context switching need to be disabled while > > > the dump is ongoing and if so, how is this accomplished. > > > > > > Thanks > > > Les. > > Just an FYI, Les, this is corrected in 2.2.16. I might be Now that I've had my coffee, I meant 2.4.X, not 2.2.16. :) > able to add in something to the 2.2.X version of LKCD to > prevent scheduling by grabbing the tasklist_lock. Let me know > if this is something you need.
> > --Matt --Matt From owner-lkcd@oss.sgi.com Mon Nov 6 11:46:01 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 11:45:52 -0800 Received: from Cantor.suse.de ([194.112.123.193]:54280 "HELO Cantor.suse.de") by oss.sgi.com with SMTP id ; Mon, 6 Nov 2000 11:45:41 -0800 Received: from Hermes.suse.de (Hermes.suse.de [194.112.123.136]) by Cantor.suse.de (Postfix) with ESMTP id 9D8561E177; Mon, 6 Nov 2000 20:45:39 +0100 (MET) Received: from gruyere.muc.suse.de (unknown [10.23.1.2]) by Hermes.suse.de (Postfix) with ESMTP id 5FE643E482; Mon, 6 Nov 2000 20:45:39 +0100 (MET) Received: by gruyere.muc.suse.de (Postfix, from userid 14446) id EC13A2F300; Mon, 6 Nov 2000 20:45:37 +0100 (MET) Date: Mon, 6 Nov 2000 20:45:37 +0100 From: Andi Kleen To: "Matt D. Robinson" Cc: Andi Kleen , Hari Kannan , Tom Morano , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? Message-ID: <20001106204537.A26147@gruyere.muc.suse.de> References: <200011032303.PAA07244@tomb.fsc-usa.com> <3A02F7E4.5BEF806D@alacritech.com> <20001104094622.A10698@gruyere.muc.suse.de> <3A06A5D2.D08F2597@alacritech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3A06A5D2.D08F2597@alacritech.com>; from yakker@alacritech.com on Mon, Nov 06, 2000 at 04:36:34AM -0800 Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Mon, Nov 06, 2000 at 04:36:34AM -0800, Matt D. Robinson wrote: > > Cool. We figured it was broken behavior -- we'd get messed up stack > pages for some dumps where scheduling took place. It was my > original understanding that this wouldn't happen, but then, 2.2 has > a number of broken issues. 
You should probably only call the kernel dumper after the stop IPI sending has finished, otherwise the other CPUs may still schedule in 2.4 -Andi From owner-lkcd@oss.sgi.com Mon Nov 6 12:17:02 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 12:16:52 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:31761 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Mon, 6 Nov 2000 12:16:47 -0800 Received: from [10.1.1.194] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id opgkaaaa for ; Mon, 6 Nov 2000 12:14:11 -0800 Message-ID: <3A06B0E3.A4481394@alacritech.com> Date: Mon, 06 Nov 2000 05:23:47 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: Andi Kleen CC: Hari Kannan , Tom Morano , hiren_mehta@agilent.com, lkcd@oss.sgi.com Subject: Re: how to make kernel do system dump ? References: <200011032303.PAA07244@tomb.fsc-usa.com> <3A02F7E4.5BEF806D@alacritech.com> <20001104094622.A10698@gruyere.muc.suse.de> <3A06A5D2.D08F2597@alacritech.com> <20001106204537.A26147@gruyere.muc.suse.de> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Andi Kleen wrote: > > On Mon, Nov 06, 2000 at 04:36:34AM -0800, Matt D. Robinson wrote: > > > > Cool. We figured it was broken behavior -- we'd get messed up stack > > pages for some dumps where scheduling took place. It was my > > original understanding that this wouldn't happen, but then, 2.2 has > > a number of broken issues. > > You should probably only call the kernel dumper after the stop IPI sending > has finished, otherwise the other CPUs may still schedule in 2.4 > > -Andi Calling smp_send_stop() isn't sufficient? I thought that did a disable_local_APIC() for each CPU (except the one we're running on), then executes the hlt instruction for each of those CPUs. 
There doesn't seem to be a routine to verify the apic_write_around() call has completed or not -- are you referring to something else? Just looking for clarity. --Matt From owner-lkcd@oss.sgi.com Mon Nov 6 12:26:22 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 12:26:12 -0800 Received: from Cantor.suse.de ([194.112.123.193]:49163 "HELO Cantor.suse.de") by oss.sgi.com with SMTP id ; Mon, 6 Nov 2000 12:26:09 -0800 Received: from Hermes.suse.de (Hermes.suse.de [194.112.123.136]) by Cantor.suse.de (Postfix) with ESMTP id BE4621E19A; Mon, 6 Nov 2000 21:26:07 +0100 (MET) Received: from gruyere.muc.suse.de (unknown [10.23.1.2]) by Hermes.suse.de (Postfix) with ESMTP id 7EFB73E483; Mon, 6 Nov 2000 21:26:07 +0100 (MET) Received: by gruyere.muc.suse.de (Postfix, from userid 14446) id CF2332F300; Mon, 6 Nov 2000 21:26:06 +0100 (MET) Date: Mon, 6 Nov 2000 21:26:06 +0100 From: Andi Kleen To: "Matt D. Robinson" Cc: Andi Kleen , Hari Kannan , Tom Morano , hiren_mehta@agilent.com, lkcd@oss.sgi.com, mingo@elte.hu Subject: Re: how to make kernel do system dump ? Message-ID: <20001106212606.A26634@gruyere.muc.suse.de> References: <200011032303.PAA07244@tomb.fsc-usa.com> <3A02F7E4.5BEF806D@alacritech.com> <20001104094622.A10698@gruyere.muc.suse.de> <3A06A5D2.D08F2597@alacritech.com> <20001106204537.A26147@gruyere.muc.suse.de> <3A06B0E3.A4481394@alacritech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3A06B0E3.A4481394@alacritech.com>; from yakker@alacritech.com on Mon, Nov 06, 2000 at 05:23:47AM -0800 Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing [put Ingo Molnar as SMP maintainer into cc] On Mon, Nov 06, 2000 at 05:23:47AM -0800, Matt D. Robinson wrote: > Andi Kleen wrote: > > > > On Mon, Nov 06, 2000 at 04:36:34AM -0800, Matt D. Robinson wrote: > > > > > > Cool. 
We figured it was broken behavior -- we'd get messed up stack > > > pages for some dumps where scheduling took place. It was my > > > original understanding that this wouldn't happen, but then, 2.2 has > > > a number of broken issues. > > > > You should probably only call the kernel dumper after the stop IPI sending > > has finished, otherwise the other CPUs may still schedule in 2.4 > > > > -Andi > > Calling smp_send_stop() isn't sufficient? I thought that did a > disable_local_APIC() for each CPU (except the one we're running on), > then executes the hlt instruction for each of those CPUs. There doesn't > seem to be a routine to verify the apic_write_around() call has > completed or not -- are you referring to something else? That should be sufficient yes. But there seems to be a bug -- smp_send_stop() calls smp_call_function() in async mode which is likely wrong. I think this patch is needed (against 2.4.0test10), otherwise there is no guarantee that the other CPUs are really stopped. Ingo, what do you think ? 
--- arch/i386/kernel/smp.c-o Fri Oct 20 18:46:49 2000
+++ arch/i386/kernel/smp.c Mon Nov 6 21:27:38 2000
@@ -493,7 +493,7 @@
 void smp_send_stop(void)
 {
-	smp_call_function(stop_this_cpu, NULL, 1, 0);
+	smp_call_function(stop_this_cpu, NULL, 0, 1);
 	smp_num_cpus = 1;
 	__cli();
-Andi
From owner-lkcd@oss.sgi.com Mon Nov 6 13:15:52 2000 Received: by oss.sgi.com id ; Mon, 6 Nov 2000 13:15:42 -0800 Received: from msgbas1x.cos.agilent.com ([192.6.9.33]:8180 "HELO msgbas1.cos.agilent.com") by oss.sgi.com with SMTP id ; Mon, 6 Nov 2000 13:15:24 -0800 Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77]) by msgbas1.cos.agilent.com (Postfix) with ESMTP id C36D640C for ; Mon, 6 Nov 2000 14:15:23 -0700 (MST) Received: from axcsbh1.cos.agilent.com (axcsbh1.cos.agilent.com [130.29.152.143]) by msgrel1.cos.agilent.com (Postfix) with SMTP id 50C8423 for ; Mon, 6 Nov 2000 14:15:23 -0700 (MST) Received: from 130.29.152.143 by axcsbh1.cos.agilent.com (InterScan E-Mail VirusWall NT); Mon, 06 Nov 2000 14:15:23 -0700 (Mountain Standard Time) Received: by axcsbh1.cos.agilent.com with Internet Mail Service (5.5.2650.21) id ; Mon, 6 Nov 2000 14:15:23 -0700 Message-ID: From: hiren_mehta@agilent.com To: lkcd@oss.sgi.com Subject: dump problem while debugging scsi hba driver Date: Mon, 6 Nov 2000 14:15:20 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I am trying to debug a scsi hba driver (this driver is not for AIC7XXX) panic using lkcd. The dump device is on AIC7xxx. Also the /root /usr etc are on AIC7xxx. Now if the scsi hba driver panics, then can the linux dump to the dump device on aic7xxx ?
-hiren From owner-lkcd@oss.sgi.com Tue Nov 7 09:48:21 2000 Received: by oss.sgi.com id ; Tue, 7 Nov 2000 09:48:11 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:64016 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Tue, 7 Nov 2000 09:48:01 -0800 Received: from [10.1.10.60] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id diikaaaa for ; Tue, 7 Nov 2000 09:45:21 -0800 Message-ID: <3A0841D3.F2EC411D@alacritech.com> Date: Tue, 07 Nov 2000 09:54:27 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.75 [en] (X11; U; Linux 2.2.16 i686) X-Accept-Language: en MIME-Version: 1.0 To: hiren_mehta@agilent.com CC: lkcd@oss.sgi.com Subject: Re: dump problem while debugging scsi hba driver References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing hiren_mehta@agilent.com wrote: > > I am trying to debug a scsi hba driver (this driver is not > for AIC7XXX) panic using lkcd. The dump device is on AIC7xxx. > Also the /root /usr etc are on AIC7xxx. Now if the scsi hba driver > panics, then can the linux dump to the dump device on aic7xxx ? > > -hiren If the AIC7xxx driver panics, it's going to be hit and miss as to whether you get a dump image or not. The best solution is to go through some other disk driver (such as an IDE driver) to dump. This especially makes sense if you're debugging your stuff. Let me know more specifically what you're doing, and perhaps I can offer some more details as to what you might be seeing. With that said ... Okay, I'm going to use this as an opportunity to open up a discussion on this problem. I'd like to hear people's feedback on what should be the right direction for the future. It's important to hear back something on this ... Right now, as of 2.4, we end up calling brw_kiovec() as a mechanism for getting our pages out to disk. 
While this is great and all, it is hardly what I call "acceptable" for dumping purposes. The problem lies in a couple of areas. First, Linus has said that he doesn't want raw I/O for various reasons in the kernel. While kiobufs are a nice feature, they hardly come close to what I call "raw I/O", because they don't get around problems dealing with buffer head locks and device driver spinlocks. In addition, Linus has also said to me that we shouldn't be going through the standard IDE driver when we dump to disk, as he doesn't trust it (his words, not mine). I've dealt with this problem long enough, and it is excruciatingly annoying. So where does this leave us, in terms of future development? Here's what I propose, and I'd like to hear from those of you out there that have an interest in this area. * I'd like to see us create a separate set of generic disk drivers that specifically have the purpose of writing out raw to disk. Drivers for IDE and SCSI initially, and then any other driver we need after that. * These drivers can be used for the purpose of writing out raw to disk, with the assumption that anyone using them must understand they could be clobbering data if writing to a drive where buffered I/O is taking place (this should only happen due to coder error, where a user tries to use both to the same disk partition). The point is they are supposed to be reliable -- speed isn't a huge consideration up front. * I don't want to take the path of adding "features" to the current set of drivers, because A) they may not be maintained properly, B) they will be burdened down by other opinions as to what raw I/O really is, and C) we can't guarantee some type of locking won't be thrown into the mix. 
The complexities are probably: 1) Inserting a duplicate driver stream into the kernel; 2) Writing small enough yet complete enough drivers to perform basic raw I/O tasks (open, read, write, close) without locking; 3) Getting this accepted as a standard part of the kernel (yes, I know Linus is against a kernel debugger, but this isn't a kernel debugger, and despite how awesome 'lcrash' is, it's a crash dump analyzer, not a kernel debugger) ... LKCD _needs_ to be part of the kernel. To those of us that care about RAS initiatives, it isn't an option. And if not LKCD, then something like it. I'd typically recommend just putting in a 'if (dumping)' mechanism to do lock avoidance down through the driver level, but there isn't a real raw I/O driver to put that in, and the best solution I see is to make one. I've explored this, and I've written some stuff up, but I wanted to get people's thoughts first before I go running down one path and people think we should go down some other path. Andre Hedrick showed me some taskfile_wait() stuff that can do really low level raw I/O, but I'm not sure whether it's something we can use or not. Can I get people's thoughts, please? I don't ask for much. 
:) --Matt From owner-lkcd@oss.sgi.com Tue Nov 7 09:57:11 2000 Received: by oss.sgi.com id ; Tue, 7 Nov 2000 09:56:51 -0800 Received: from msgbas1tx.cos.agilent.com ([192.6.9.34]:59347 "HELO msgbas1t.cos.agilent.com") by oss.sgi.com with SMTP id ; Tue, 7 Nov 2000 09:56:35 -0800 Received: from msgrel1.cos.agilent.com (msgrel1.cos.agilent.com [130.29.152.77]) by msgbas1t.cos.agilent.com (Postfix) with ESMTP id A2A8033A; Tue, 7 Nov 2000 10:56:34 -0700 (MST) Received: from axcsbh4.cos.agilent.com (axcsbh4.cos.agilent.com [130.29.152.145]) by msgrel1.cos.agilent.com (Postfix) with SMTP id 49CD51F; Tue, 7 Nov 2000 10:56:34 -0700 (MST) Received: from 130.29.152.145 by axcsbh4.cos.agilent.com (InterScan E-Mail VirusWall NT); Tue, 07 Nov 2000 10:56:34 -0700 (Mountain Standard Time) Received: by axcsbh4.cos.agilent.com with Internet Mail Service (5.5.2650.21) id ; Tue, 7 Nov 2000 10:56:34 -0700 Message-ID: From: hiren_mehta@agilent.com To: yakker@alacritech.com Cc: lkcd@oss.sgi.com Subject: RE: dump problem while debugging scsi hba driver Date: Tue, 7 Nov 2000 10:56:31 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="ISO-8859-1" Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Well, as I mentioned, I am trying to debug the panic of my own scsi hba driver. The problem is that if my driver panics in the interrupt thread, then I am not sure whether linux will be able to do the dump. The reason is that, at the time of the dump through aic7xxx, I am not sure whether aic7xxx will be able to generate interrupts. Also, if my driver panics after it acquires the io_request_lock, then aic7xxx will not be able to get the io_request_lock to do the I/O. In fact, I have seen many operating systems do the dump with a polling mechanism instead of an interrupt mechanism. And my understanding is that the aic7xxx driver does not work in polling mode.
Thanks and regards, -hiren > -----Original Message----- > From: Matt D. Robinson [mailto:yakker@alacritech.com] > Sent: Tuesday, November 07, 2000 9:54 AM > To: hiren_mehta@agilent.com > Cc: lkcd@oss.sgi.com > Subject: Re: dump problem while debugging scsi hba driver > > > hiren_mehta@agilent.com wrote: > > > > I am trying to debug a scsi hba driver (this driver is not > > for AIC7XXX) panic using lkcd. The dump device is on AIC7xxx. > > Also the /root /usr etc are on AIC7xxx. Now if the scsi hba driver > > panics, then can the linux dump to the dump device on aic7xxx ? > > > > -hiren > > If the AIC7xxx driver panics, it's going to be hit and miss as to > whether you get a dump image or not. The best solution is to go > through some other disk driver (such as an IDE driver) to dump. > This especially makes sense if you're debugging your stuff. Let > me know more specifically what you're doing, and perhaps I can > offer some more details as to what you might be seeing. > > With that said ... > > Okay, I'm going to use this as an opportunity to open up a discussion > on this problem. I'd like to hear people's feedback on what should > be the right direction for the future. It's important to hear back > something on this ... > > Right now, as of 2.4, we end up calling brw_kiovec() as a mechanism > for getting our pages out to disk. While this is great and all, it > is hardly what I call "acceptable" for dumping purposes. > > The problem lies in a couple of areas. First, Linus has said that > he doesn't want raw I/O for various reasons in the kernel. While > kiobufs are a nice feature, they hardly come close to what I call > "raw I/O", because they don't get around problems dealing with > buffer head locks and device driver spinlocks. In addition, Linus > has also said to me that we shouldn't be going through the standard > IDE driver when we dump to disk, as he doesn't trust it (his words, > not mine). 
> > I've dealt with this problem long enough, and it is excruciatingly > annoying. So where does this leave us, in terms of future > development? > > Here's what I propose, and I'd like to hear from those of you > out there > that have an interest in this area. > > * I'd like to see us create a separate set of generic disk drivers > that specifically have the purpose of writing out raw to disk. > Drivers for IDE and SCSI initially, and then any other driver we > need after that. > > * These drivers can be used for the purpose of writing out raw to > disk, with the assumption that anyone using them must understand > they could be clobbering data if writing to a drive where buffered > I/O is taking place (this should only happen due to coder error, > where a user tries to use both to the same disk partition). The > point is they are supposed to be reliable -- speed isn't a huge > consideration up front. > > * I don't want to take the path of adding "features" to the current > set of drivers, because A) they may not be maintained properly, > B) they will be burdened down by other opinions as to what raw I/O > really is, and C) we can't guarantee some type of locking won't be > thrown into the mix. > > The complexities are probably: > > 1) Inserting a duplicate driver stream into the kernel; > 2) Writing small enough yet complete enough drivers to perform basic > raw I/O tasks (open, read, write, close) without locking; > 3) Getting this accepted as a standard part of the kernel > (yes, I know > Linus is against a kernel debugger, but this isn't a > kernel debugger, > and despite how awesome 'lcrash' is, it's a crash dump > analyzer, not > a kernel debugger) ... LKCD _needs_ to be part of the kernel. To > those of us that care about RAS initiatives, it isn't an option. > And if not LKCD, then something like it. 
> > I'd typically recommend just putting in a 'if (dumping)' mechanism to > do lock avoidance down through the driver level, but there > isn't a real > raw I/O driver to put that in, and the best solution I see is to make > one. I've explored this, and I've written some stuff up, but I wanted > to get people's thoughts first before I go running down one path and > people think we should go down some other path. Andre > Hedrick showed me > some taskfile_wait() stuff that can do really low level raw I/O, but > I'm not sure whether it's something we can use or not. > > Can I get people's thoughts, please? I don't ask for much. :) > > --Matt > From owner-lkcd@oss.sgi.com Wed Nov 8 14:31:51 2000 Received: by oss.sgi.com id ; Wed, 8 Nov 2000 14:31:41 -0800 Received: from web9708.mail.yahoo.com ([216.136.128.166]:32521 "HELO web9708.mail.yahoo.com") by oss.sgi.com with SMTP id ; Wed, 8 Nov 2000 14:31:33 -0800 Message-ID: <20001108223128.95778.qmail@web9708.mail.yahoo.com> Received: from [209.21.248.22] by web9708.mail.yahoo.com; Wed, 08 Nov 2000 14:31:28 PST Date: Wed, 8 Nov 2000 14:31:28 -0800 (PST) From: sanjay kumar Subject: Kernel Memory To: lkcd@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I am making some changes in the Linux Kernel 2.2.14, for my personal work. Here I have defined a data type (struct), whose size is 50 byte. Now I am taking an array (with my defined data type) size 500000. For this 500000 * 50 (around 25 MB memory) is required. I have made the kernel image, but it is not being loaded by LILO. If I am changing the array size to 500, all thing is working properly. Help me. __________________________________________________ Do You Yahoo!? Thousands of Stores. Millions of Products. All in one Place. 
http://shopping.yahoo.com/ From owner-lkcd@oss.sgi.com Wed Nov 8 16:11:31 2000 Received: by oss.sgi.com id ; Wed, 8 Nov 2000 16:11:22 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:346 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Wed, 8 Nov 2000 16:11:06 -0800 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id QAA22591 for ; Wed, 8 Nov 2000 16:03:14 -0800 (PST) mail_from (kaos@ocs.com.au) Received: from kao2.melbourne.sgi.com (kao2.melbourne.sgi.com [134.14.55.180]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA14110; Thu, 9 Nov 2000 11:08:25 +1100 X-Mailer: exmh version 2.1.1 10/15/1999 From: Keith Owens To: sanjay kumar cc: lkcd@oss.sgi.com Subject: Re: Kernel Memory In-reply-to: Your message of "Wed, 08 Nov 2000 14:31:28 -0800." <20001108223128.95778.qmail@web9708.mail.yahoo.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 09 Nov 2000 11:08:25 +1100 Message-ID: <1858.973728505@kao2.melbourne.sgi.com> Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Wed, 8 Nov 2000 14:31:28 -0800 (PST), sanjay kumar wrote: > I am making some changes in the Linux Kernel 2.2.14, >for my personal work. Here I have defined a data type >(struct), whose size is 50 byte. Now I am taking an >array (with my defined data type) size 500000. For >this 500000 * 50 (around 25 MB memory) is required. I >have made the kernel image, but it is not being loaded >by LILO. If I am changing the array size to 500, all >thing is working properly. Help me. http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg06613.html and follow the thread. 
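To put numbers on the two array sizes mentioned in the question, here is a small C sketch of the arithmetic. The struct record type and the array_bytes() helper are hypothetical stand-ins for the poster's 50-byte struct, which is not shown in the thread. The sketch only computes sizes; it does not claim to diagnose why LILO fails, for which the linked thread remains the reference.

```c
#include <stddef.h>

/* Hypothetical stand-in for the poster's 50-byte data type. */
struct record {
    unsigned char bytes[50];
};

/* Bytes occupied by an array of `count` records. */
size_t array_bytes(size_t count)
{
    return count * sizeof(struct record);
}
```

With these definitions, 500000 elements come to 25,000,000 bytes (about 23.8 MiB), against 25,000 bytes for 500 elements. If the array is declared statically, that memory is part of the kernel's own data, which is one plausible reason for the difference in behavior between the two sizes; allocating at run time (for example with vmalloc()) is a common alternative, though whether it suits this case is an assumption.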
From owner-lkcd@oss.sgi.com Mon Nov 13 18:02:44 2000 Received: by oss.sgi.com id ; Mon, 13 Nov 2000 18:02:34 -0800 Received: from ewey-rwcmta.excite.com ([198.3.99.191]:17085 "EHLO ewey.excite.com") by oss.sgi.com with ESMTP id ; Mon, 13 Nov 2000 18:02:10 -0800 Received: from slippery ([199.172.153.106]) by ewey.excite.com (InterMail vM.4.01.02.39 201-229-119-122) with ESMTP id <20001114020200.EJWJ11416.ewey.excite.com@slippery> for ; Mon, 13 Nov 2000 18:02:00 -0800 Message-ID: <817169.974167320020.JavaMail.imail@slippery> Date: Mon, 13 Nov 2000 18:02:00 -0800 (PST) From: BijanM@excite.com To: lkcd@oss.sgi.com Subject: lkcd for 2.4 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Excite Inbox X-Sender-Ip: 64.14.25.32 Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Hi, Is there a version of the lkcd available for 2.4? Thanks. --bijan _______________________________________________________ Tired of slow Internet? Get @Home Broadband Internet http://www.home.com/xinbox/signup.html From owner-lkcd@oss.sgi.com Tue Nov 14 15:06:01 2000 Received: by oss.sgi.com id ; Tue, 14 Nov 2000 15:05:51 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:28166 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Tue, 14 Nov 2000 15:05:44 -0800 Received: from [10.1.1.27] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id nxskaaaa for ; Tue, 14 Nov 2000 15:02:54 -0800 Message-ID: <3A11C6C3.35114049@alacritech.com> Date: Tue, 14 Nov 2000 15:12:03 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. 
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.17 i686) X-Accept-Language: en MIME-Version: 1.0 To: BijanM@excite.com, lkcd@oss.sgi.com Subject: Re: lkcd for 2.4 References: <817169.974167320020.JavaMail.imail@slippery> <3A10A554.EBF9AB4F@alacritech.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing "Matt D. Robinson" wrote: > > BijanM@excite.com wrote: > > > > Hi, > > > > Is there a version of the lkcd available for 2.4? > > > > Thanks. > > > > --bijan > > > > _______________________________________________________ > > Tired of slow Internet? Get @Home Broadband Internet > > http://www.home.com/xinbox/signup.html Yes, however, you have to pull it from the latest tree on sourceforge in order to use it. The location of the SourceForge web page is: https://sourceforge.net/projects/lkcd/ From there you can use:
    cvs -d:pserver:anonymous@cvs.lkcd.sourceforge.net:/cvsroot/lkcd login
    cvs -z3 -d:pserver:anonymous@cvs.lkcd.sourceforge.net:/cvsroot/lkcd co .
That should check out the entire tree. You'll want to build a patch based on the 2.4 tree, and if you want 'lcrash' and all those tools, you can 'cd lkcdutils ; ./configure ; make' to configure and make 'lcrash'. Let me know if you have problems with this. Just an FYI to the list, we're going to check in a spec file today and we should have a lkcdutils RPM that you can install on your system in conjunction with the LKCD patch to get crash dumps. That will ease the installation process.
--Matt From owner-lkcd@oss.sgi.com Wed Nov 15 07:56:48 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 07:56:38 -0800 Received: from lazy.accessus.net ([209.145.148.14]:62990 "EHLO lazy.accessus.net") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 07:56:24 -0800 Received: from localhost (jay@localhost) by lazy.accessus.net (8.9.3/8.9.3) with ESMTP id JAA25463 for ; Wed, 15 Nov 2000 09:56:21 -0600 (CST) Date: Wed, 15 Nov 2000 09:56:20 -0600 (CST) From: Jay Weber To: lkcd@oss.sgi.com Subject: Whats the story with lkcdutils? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I've noticed the lkcdutils repository on sourceforge and am wondering if somebody could update me as to what the status of it is. It appears to separate lcrash into the userland realm so that I don't have to build that as part of my kernel package and that sounds great for packaging purposes. Does it work with a 2.2-based kernel (or at all at this point)? When compiling I'm getting:
make[2]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash/cmds'
gcc -o lcrash -static -rdynamic -L. -L/usr/src/linux -L./../libklib -L./../liballoc -L./../librl main.o util.o eval.o report.o stabs.o struct.o vmdump.o -lcmds -larch -lalloc -lrl -lklib -lncurses -lopcodes -lbfd -liberty -ldl
./libcmds.a(cmd_strace.o): In function `strace_cmd':
cmd_strace.c:63: undefined reference to `ia64_find_trace'
./libarch.a(idis.o): In function `do_dis':
idis.c:104: undefined reference to `print_insn_i386'
collect2: ld returned 1 exit status
make[1]: *** [lcrash] Error 1
make[1]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash'
Looking at the 2.2 cvs tree, these print_insn_* funcs seem to exist. In lkcdutils they don't. Thanks.
From owner-lkcd@oss.sgi.com Wed Nov 15 08:45:48 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 08:45:29 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:13128 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 08:45:20 -0800 Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id IAA16154 for ; Wed, 15 Nov 2000 08:37:28 -0800 (PST) mail_from (tjm@sgi.com) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id IAA32217; Wed, 15 Nov 2000 08:43:47 -0800 (PST) Message-ID: <3A12BD42.C4EDE841@sgi.com> Date: Wed, 15 Nov 2000 08:43:46 -0800 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: Jay Weber CC: lkcd@oss.sgi.com, Tom Morano Subject: Re: Whats the story with lkcdutils? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Hi Jay, The lkcdutils repository does exactly as you suspect, namely it separates the lcrash utility from the kernel source tree. If you look in the 2.4 tree, you will notice that, although the cmd directory tree still exists, it contains no files (the directories will be going away soon). The lcrash built from lkcdutils will work for both 2.2 and 2.4 kernels (it's still architecture specific, however). It will even work with different sub releases (e.g. 2.2.16 versus 2.2.18). The way we accomplish this is to use stabs type information generated for a particular kernel build. Since there currently is no type information generated by default, we have added a mechanism for creating a new target as part of our LKCD kernel patch. The Kerntypes target is a dummy .o built with -gstabs. It gets installed into the /boot directory, along with vmlinux and System.map. 
By accessing the stabs type information contained in this module, we avoided the necessity of directly including kernel header files, and were able to remove lcrash from the kernel source tree. We plan to release an rpm that installs the lcrash binary, system scripts, and kernel patches necessary for implementing LKCD. All of this is work in progress. Matt and I are working to get this done ASAP.

Regarding the lkcdutils build errors you encountered, one of those is newly introduced (by me) and is the result of my getting lcrash working for the ia64 architecture. The other one is a side effect of having a more up-to-date binutils package on your system. I will be checking in a fix for both of these problems today.

Thank you for your interest. Please let us know if there is any functionality you would like to see added to this tool.

Tom

Jay Weber wrote:
>
> I've noticed the lkcdutils repository on sourceforge and am wondering if
> somebody could update me as to what the status of it is. It appears to
> separate lcrash into the userland realm so that I don't have to build that
> as part of my kernel package and that sounds great for packaging purposes.
>
> Does it work with 2.2 based kernel (or at all at this point)? When
> compiling I'm getting:
>
> make[2]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash/cmds'
> gcc -o lcrash -static -rdynamic -L.
-L/usr/src/linux -L./../libklib > -L./../liballoc -L./../librl main.o util.o eval.o report.o stabs.o > struct.o vmdump.o -lcmds -larch -lalloc -lrl -lklib -lncurses -lopcodes > -lbfd -liberty -ldl > ./libcmds.a(cmd_strace.o): In function `strace_cmd': > cmd_strace.c:63: undefined reference to `ia64_find_trace' > ./libarch.a(idis.o): In function `do_dis': > idis.c:104: undefined reference to `print_insn_i386' > collect2: ld returned 1 exit status > make[1]: *** [lcrash] Error 1 > make[1]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash' > > Looking at the 2.2 cvs tree, these print_insn_* funcs seem to exist. In > lkcdutils they don't. > > Thanks. From owner-lkcd@oss.sgi.com Wed Nov 15 09:36:09 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 09:35:49 -0800 Received: from lazy.accessus.net ([209.145.148.14]:60932 "EHLO lazy.accessus.net") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 09:35:28 -0800 Received: from localhost (jay@localhost) by lazy.accessus.net (8.9.3/8.9.3) with ESMTP id LAA16077; Wed, 15 Nov 2000 11:35:18 -0600 (CST) Date: Wed, 15 Nov 2000 11:35:17 -0600 (CST) From: Jay Weber To: Tom Morano cc: lkcd@oss.sgi.com Subject: Re: Whats the story with lkcdutils? In-Reply-To: <3A12BD42.C4EDE841@sgi.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Cool. Thanks for all the information. I'll just proceed with using the cmd dir in 2.2 for now then. I'm banging away at lkcd on ide disk here using the 1.1 beta. On Wed, 15 Nov 2000, Tom Morano wrote: > Hi Jay, > > The lkcdutils repository does exactly as you suspect, namely it > separates the lcrash utility from the kernel source tree. If you look > in the 2.4 tree, you will notice that, although the cmd directory tree > still exists, it contains no files (the directories will be going away > soon). 
> > The lcrash built from lkcdutils will work for both 2.2 and 2.4 kernels > (it's still architecture specific, however). It will even work with > different sub releases (e.g. 2.2.16 versus 2.2.18). The way we > accomplish this is to use stabs type information generated for a > particular kernel build. Since there currently is no type information > generated by default, we have added a mechanism for creating a new > target as part of our LKCD kernel patch. The Kerntypes target is a > dummy .o built with -gstabs. It gets installed into the /boot > directory, along with vmlinux and System.map. By accessing the stabs > type informaiton contained in this module, we avoided the necessity of > directly including kernel header files, and were able to remove lcrash > from the kernel source tree. We plan to release an rpm that installs > the lcrash binary, system scripts, and kernel patches necessary for > implementing LKCD. All of this is work in progress. Matt and I are > working to get this done ASAP. > > Regarding the lkcdutils build errors you encountered, one of those is > newly introduced (by me) and is the result of my getting lcrash > working for the ia64 architecture. The other one is a side effect of > having a more up-to-date binutils package on your system. I will be > checking in a fix for both of these problems today. > > Thank you for your interest. Please let us know if there is any > functionality you would like to see added to this tool. > > Tom > > > Jay Weber wrote: > > > > I've noticed the lkcdutils repository on sourceforge and am wondering if > > somebody could update me as to what the status of it is. It appears to > > seperate lcrash into the userland realm so that I don't have to build that > > as part of my kernel package and that sounds great for packaging purposes. > > > > Does it work with 2.2 based kernel (or at all at this point)? 
When > > compiling I'm getting: > > > > make[2]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash/cmds' > > gcc -o lcrash -static -rdynamic -L. -L/usr/src/linux -L./../libklib > > -L./../liballoc -L./../librl main.o util.o eval.o report.o stabs.o > > struct.o vmdump.o -lcmds -larch -lalloc -lrl -lklib -lncurses -lopcodes > > -lbfd -liberty -ldl > > ./libcmds.a(cmd_strace.o): In function `strace_cmd': > > cmd_strace.c:63: undefined reference to `ia64_find_trace' > > ./libarch.a(idis.o): In function `do_dis': > > idis.c:104: undefined reference to `print_insn_i386' > > collect2: ld returned 1 exit status > > make[1]: *** [lcrash] Error 1 > > make[1]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash' > > > > Looking at the 2.2 cvs tree, these print_insn_* funcs seem to exist. In > > lkcdutils they don't. > > > > Thanks. > From owner-lkcd@oss.sgi.com Wed Nov 15 10:19:39 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 10:19:29 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:10758 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 10:19:04 -0800 Received: from [10.1.1.27] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id jbukaaaa for ; Wed, 15 Nov 2000 10:16:05 -0800 Message-ID: <3A12D50B.8E738703@alacritech.com> Date: Wed, 15 Nov 2000 10:25:15 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.17 i686) X-Accept-Language: en MIME-Version: 1.0 To: Jay Weber CC: lkcd@oss.sgi.com Subject: Re: Whats the story with lkcdutils? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Jay Weber wrote: > > I've noticed the lkcdutils repository on sourceforge and am wondering if > somebody could update me as to what the status of it is. 
It appears to > seperate lcrash into the userland realm so that I don't have to build that > as part of my kernel package and that sounds great for packaging purposes. > > Does it work with 2.2 based kernel (or at all at this point)? When > compiling I'm getting: > > make[2]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash/cmds' > gcc -o lcrash -static -rdynamic -L. -L/usr/src/linux -L./../libklib > -L./../liballoc -L./../librl main.o util.o eval.o report.o stabs.o > struct.o vmdump.o -lcmds -larch -lalloc -lrl -lklib -lncurses -lopcodes > -lbfd -liberty -ldl > ./libcmds.a(cmd_strace.o): In function `strace_cmd': > cmd_strace.c:63: undefined reference to `ia64_find_trace' This is due to building IA64 for the last few weeks. I noticed this yesterday, and Tom's fixing it (or has already fixed it). > ./libarch.a(idis.o): In function `do_dis': > idis.c:104: undefined reference to `print_insn_i386' This is due to legacy functionality. The problem is the binutils you are using removed print_insn_i386 and replaced it with print_insn_i386_att and print_insn_i386_intel, and of course, removing direct access to the original function (dumb!). Tom is removing this function from lcrash, as it is historical and no longer needed (IMHO). BTW, anyone out there have a copy of (non-TurboLinux) Linux for Alpha? I've tried two different copies on one system, and none of them seem to work. I've got an Alpha system to do LKCD stuff on now ... --Matt > collect2: ld returned 1 exit status > make[1]: *** [lcrash] Error 1 > make[1]: Leaving directory `/usr/opt/build/BUILD/lkcdutils/lcrash' > > Looking at the 2.2 cvs tree, these print_insn_* funcs seem to exist. In > lkcdutils they don't. > > Thanks. 
From owner-lkcd@oss.sgi.com Wed Nov 15 11:48:48 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 11:48:38 -0800 Received: from lazy.accessus.net ([209.145.148.14]:53262 "EHLO lazy.accessus.net") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 11:48:33 -0800 Received: from localhost (jay@localhost) by lazy.accessus.net (8.9.3/8.9.3) with ESMTP id NAA31985; Wed, 15 Nov 2000 13:48:30 -0600 (CST) Date: Wed, 15 Nov 2000 13:48:29 -0600 (CST) From: Jay Weber To: "Matt D. Robinson" cc: lkcd@oss.sgi.com Subject: Re: Whats the story with lkcdutils? In-Reply-To: <3A12D50B.8E738703@alacritech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

> BTW, anyone out there have a copy of (non-TurboLinux) Linux for Alpha?
> I've tried two different copies on one system, and none of them seem
> to work. I've got an Alpha system to do LKCD stuff on now ...

I can make you an alpha RedHat CD if you like and drop it off at TT's. :) I believe Compaq also has a bootable CD available which doesn't require you to do all the SRM/AlphaBIOS fiddling to get things working.

Let me know. I can download and burn either for you here at werk before I take off for the day.

From owner-lkcd@oss.sgi.com Wed Nov 15 16:06:40 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 16:06:30 -0800 Received: from smtp.alacritech.com ([209.10.208.82]:41996 "EHLO smtp.alacritech.com") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 16:06:12 -0800 Received: from [10.1.1.27] by smtp.alacritech.com (NTMail 4.30.0012/NY3553.00.2884f51f) with ESMTP id fsukaaaa for ; Wed, 15 Nov 2000 16:03:21 -0800 Message-ID: <3A13266E.82EC869F@alacritech.com> Date: Wed, 15 Nov 2000 16:12:30 -0800 From: "Matt D. Robinson" Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.17 i686) X-Accept-Language: en MIME-Version: 1.0 To: Jay Weber CC: lkcd@oss.sgi.com Subject: Re: Whats the story with lkcdutils? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Jay Weber wrote: > > > BTW, anyone out there have a copy of (non-TurboLinux) Linux for Alpha? > > I've tried two different copies on one system, and none of them seem > > to work. I've got an Alpha system to do LKCD stuff on now ... > > I can make you an alpha RedHat CD if you like and drop off at TT's. :) > I believe Compaq has a CD available which is bootable also and don't > require you todo all the SRPM/AlphaBIOS fiddling to get things working. > > Let me know.. I can download and burn either for you here at werk before I > take off for the day. If you have both, that'd be awesome -- I'd like to try both for the system to see which one works best. Thanks, Jay. --Matt From owner-lkcd@oss.sgi.com Wed Nov 15 16:54:00 2000 Received: by oss.sgi.com id ; Wed, 15 Nov 2000 16:53:50 -0800 Received: from lazy.accessus.net ([209.145.148.14]:11277 "EHLO lazy.accessus.net") by oss.sgi.com with ESMTP id ; Wed, 15 Nov 2000 16:53:27 -0800 Received: from localhost (jay@localhost) by lazy.accessus.net (8.9.3/8.9.3) with ESMTP id SAA00811; Wed, 15 Nov 2000 18:53:24 -0600 (CST) Date: Wed, 15 Nov 2000 18:53:23 -0600 (CST) From: Jay Weber To: "Matt D. Robinson" cc: lkcd@oss.sgi.com Subject: Re: Whats the story with lkcdutils? 
In-Reply-To: <3A13266E.82EC869F@alacritech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Well, I think the Jumpstart just supports booting via CD and doing all the SRM setup stuff automagically for you and then installs your distribution of choice, redhat, debian or suse I believe were the listed supported ones. So, I'm about done burning both and will have them for you. On Wed, 15 Nov 2000, Matt D. Robinson wrote: > Jay Weber wrote: > > > > > BTW, anyone out there have a copy of (non-TurboLinux) Linux for Alpha? > > > I've tried two different copies on one system, and none of them seem > > > to work. I've got an Alpha system to do LKCD stuff on now ... > > > > I can make you an alpha RedHat CD if you like and drop off at TT's. :) > > I believe Compaq has a CD available which is bootable also and don't > > require you todo all the SRPM/AlphaBIOS fiddling to get things working. > > > > Let me know.. I can download and burn either for you here at werk before I > > take off for the day. > > If you have both, that'd be awesome -- I'd like to try both for > the system to see which one works best. > > Thanks, Jay. 
>
> --Matt
>

From owner-lkcd@oss.sgi.com Thu Nov 30 22:35:57 2000 Received: by oss.sgi.com id ; Thu, 30 Nov 2000 22:35:48 -0800 Received: from [202.54.26.202] ([202.54.26.202]:34955 "EHLO hindon.hss.co.in") by oss.sgi.com with ESMTP id ; Thu, 30 Nov 2000 22:35:34 -0800 Received: from sandesh.hss.hns.com (localhost [127.0.0.1]) by hindon.hss.co.in (8.10.0/8.10.0) with SMTP id eB16aet23704; Fri, 1 Dec 2000 12:06:40 +0530 (IST) Received: by sandesh.hss.hns.com(Lotus SMTP MTA v4.6.3 (733.2 10-16-1998)) id 652569A8.002320D9 ; Fri, 1 Dec 2000 11:53:41 +0530 X-Lotus-FromDomain: HSS From: kabsingh@hss.hns.com To: yakker@sgi.com, tjm@sgi.com cc: lkcd@oss.sgi.com Message-ID: <652569A8.00231AB7.00@sandesh.hss.hns.com> Date: Fri, 1 Dec 2000 12:04:35 +0530 Subject: lcrash : General Layout of Stack Backtrace Functionality Mime-Version: 1.0 Content-type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

Sir/Madam,

I have Red Hat Linux release 6.2 (Zoot), kernel 2.2.13-12, on an i686. Is it possible for me to use LKCD and lcrash in a situation where, in case of a crash, I want to analyse the stack frames of the functions leading to the crash? I read a document titled "Linux Kernel Crash Dumps" authored by you with the following information:

"3.5.4.1 General Layout of Stack Backtrace Functionality

The LCRASH stack trace facility has been organized much like the other architecture specific features. An architecture independent layer provides a set of functions for finding and displaying kernel stack backtraces. These functions make calls to platform specific functions, which perform the actual work. At this time, only support for i386 stack traces has been provided."

Please reply as soon as possible.
Thanking You in anticipation, Regards, Kabir Singh (Software Engineer) Hughes Software Systems Electronic City, Plot 31, Sector 18 Gurgaon - 122 015 Haryana (INDIA) Tel 91-124-6346666,6343703 extn 2348 From owner-lkcd@oss.sgi.com Thu Nov 30 23:26:57 2000 Received: by oss.sgi.com id ; Thu, 30 Nov 2000 23:26:47 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:4645 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Thu, 30 Nov 2000 23:26:38 -0800 Received: from nodin.corp.sgi.com (fddi-nodin.corp.sgi.com [198.29.75.193]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id XAA21922 for ; Thu, 30 Nov 2000 23:26:32 -0800 (PST) mail_from (tjm@sgi.com) Received: from loco.csd.sgi.com (loco.csd.sgi.com [150.166.1.62]) by nodin.corp.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) via ESMTP id XAA87613 for ; Thu, 30 Nov 2000 23:24:47 -0800 (PST) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id XAA07327; Thu, 30 Nov 2000 23:22:12 -0800 (PST) Message-ID: <3A2751A3.7C7ACFC8@sgi.com> Date: Thu, 30 Nov 2000 23:22:11 -0800 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: kabsingh@hss.hns.com CC: yakker@sgi.com, lkcd@oss.sgi.com, Tom Morano Subject: Re: lcrash : General Layout of Stack Backtrace Functionality References: <652569A8.00231AB7.00@sandesh.hss.hns.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing kabsingh@hss.hns.com wrote: > > Sir/Mam, > > I have Red Hat Linux release 6.2 (Zoot) Kernel 2.2.13-12 on an i686. > Is it possible for me ti use LKCD and lcrash in a situation where incase of a > crash i want > to analyse the stackframe of the functions leading to the crash. 
> I read a document titled "Linux Kernel Crash Dumps" authored by you > with the following information : > > "3.5.4.1 General Layout of Stack Backtrace Functionality > > The LCRASH stack trace facility has been organized much like the other > architecture specific features. > An architecture independent layer provides a set of functions for finding and > displaying kernel > stack backtraces. These functions make calls to platform specific functions, > which perform > the actual work. At this time, only support for i386 stack traces has been > provided. " > > Please reply as soon as possible. > Thanking You in anticipation, > Regards, Hi Kabir, There are two components to LKCD. One is a kernel component (in the form of a patch) which generates a system crash dump in the event of a system panic. The other is a utility called lcrash, which is able to read the contents of the dump and generate information about the state of the kernel at the time of the crash. Specific to your question, lcrash can generate backtrace information for all tasks running on the system at the time of the crash. 
Below is an example of the stack command output (without and with a stack frame dump):

>> t c35bc000
================================================================
STACK TRACE FOR TASK: 0xc35bc000(mingetty)

 0 schedule+748 [0xc01142bc]
 1 schedule_timeout+18 [0xc0113f3e]
 2 read_chan+903 [0xc01956d7]
 3 tty_read+178 [0xc0184d16]
 4 sys_read+140 [0xc012c534]
 5 system_call+44 [0xc0108dac]
================================================================

>> t -f c35bc000
================================================================
STACK TRACE FOR TASK: 0xc35bc000(mingetty)

 0 schedule+748 [0xc01142bc]

   RA=0xc0113f43, SP=0xc35bdea8, FP=0xc35bdefc, SIZE=88

   c35bdea8: c35bdef8 c35bc000 03f10000 7fffffff
   c35bdeb8: c11d1840 c3cdc000 c3cdc000 c35bc000
   c35bdec8: c11c71c0 c33b6000 c11c7d00 0000001a
   c35bded8: c02a0000 c11c7b20 c11c71c0 c35bc000
   c35bdee8: 0000001d 00000000 c35bc000 c02d5540
   c35bdef8: c35bdf1c c0113f43

 1 schedule_timeout+18 [0xc0113f3e]

   RA=0xc01956dc, SP=0xc35bdf00, FP=0xc35bdf20, SIZE=36

   c35bdf00: 00000008 c11d1840 00000000 00014000
   c35bdf10: 00000246 c3cdc000 c11d1840 bffffe0c
   c35bdf20: c01956dc

 2 read_chan+903 [0xc01956d7]

   RA=0xc0184d18, SP=0xc35bdf24, FP=0xc35bdf80, SIZE=96

   c35bdf24: c3cdc000 c11d1840 c3213820 bffffe0c
   c35bdf34: 00000008 c35bc000 c3cdcbf4 c3cdc97c
   c35bdf44: c35bdf70 7fffffff 00000000 00000000
   c35bdf54: 00000000 c35bc000 bffffe0b 01234567
   c35bdf64: c35bc000 00000000 00000000 01234567
   c35bdf74: c35bc000 c3cdc980 c3cdc980 c0184d18

 3 tty_read+178 [0xc0184d16]

   RA=0xc012c536, SP=0xc35bdf84, FP=0xc35bdfa0, SIZE=32

   c35bdf84: c3cdc000 c11d1840 bffffe0b 00000001
   c35bdf94: ffffffea c11d1840 00000001 c012c536

 4 sys_read+140 [0xc012c534]

   RA=0xc0108db3, SP=0xc35bdfa4, FP=0xc35bdfc0, SIZE=32

   c35bdfa4: c11d1840 bffffe0b 00000001 c11d1860
   c35bdfb4: c35bc000 0804ad80 bffffe0b c0108db3

 5 system_call+44 [0xc0108dac]

   RA=0x400ba534, SP=0xc35bdfc4, FP=0xc35bdfec, SIZE=44

   c35bdfc4: 00000000 bffffe0b 00000001 0804ad80
   c35bdfd4: bffffe0b bffffe0c 00000003 0000002b
   c35bdfe4: 0000002b 00000003 400ba534
================================================================

We currently support two architectures, i386 (which includes all x86 architectures) and ia64 (linux-2.4.x only). We are in the process of releasing a kernel patch for the 2.2.17 and 2.4.0-test9 kernels. The 2.2.17 patch, with minor conflict resolutions, also works with the 2.2.16 kernel. We haven't tested it with earlier 2.2.x kernels.

The approach we are taking now provides the most flexibility with regard to crash analysis tools. Our kernel patch creates a new target every time the kernel gets built. This new target, Kerntypes, gets copied to the /boot directory along with vmlinux and System.map. The lcrash utility uses the type information (which is in stabs form) in the Kerntypes object to reference specific pieces of kernel data, in conjunction with addresses from the symbol table.

It should be fairly easy to get lcrash working with the earlier kernel. It might take more work for the kernel patch component. You might also want to look at some of our earlier LKCD releases, which have the lcrash utility more closely tied to the kernel source and build mechanism. In fact, the FAQ on our web page (http://oss.sgi.com/projects/lkcd/faq.html) more closely matches the kernel you are using than the one we currently support (we are in the process of updating the docs to go along with the new release). Check out the LKCD 1.0.4 release at ftp://oss.sgi.com/projects/lkcd/download/OLD/1.x/1.0.4. Note that if you use this older release, your system must have SCSI disks. Our newest version supports IDE and SCSI. You may have to do some porting work (with the kernel patch and the utilities) if you want our current stuff to work on your older kernel.

Hopefully this gives you enough information so that you can make your decision on the best way to proceed.
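
To illustrate how "addresses from the symbol table" turn into trace entries such as schedule+748 [0xc01142bc], here is a minimal Python sketch. It is not lcrash code. It assumes the usual System.map line layout ("address type name"), and the three map entries are hypothetical: their start addresses were derived by subtracting each offset from the addresses printed in the trace, not taken from a real System.map. The helper names load_symbols and resolve are made up for this example.

```python
import bisect

# Hypothetical System.map excerpt ("address type name"). Start addresses are
# derived from the trace (address minus offset), not copied from a real map.
SYSTEM_MAP = """\
c0113f2c T schedule_timeout
c0113fd0 T schedule
c012c4a8 T sys_read
"""


def load_symbols(text):
    """Parse System.map-style lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+offset' for the nearest symbol at or below addr."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    start, name = symbols[i]
    return f"{name}+{addr - start}"


symbols = load_symbols(SYSTEM_MAP)
print(resolve(symbols, 0xc01142bc))  # schedule+748
```

A real tool would also need the end of each symbol (or the next symbol's start) to reject addresses that fall outside any function; the sketch simply picks the closest symbol below the address.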
Thanks for your interest, Tom > > Kabir Singh > (Software Engineer) > > Hughes Software Systems > Electronic City, > Plot 31, Sector 18 > Gurgaon - 122 015 > Haryana (INDIA) > > Tel 91-124-6346666,6343703 extn 2348
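
A note on reading the "t -f" frame dump in the message above. Judging only from the numbers shown (this is an inference from the output, not something taken from lcrash documentation), FP is the address of the stack slot that holds the return address, SIZE equals FP - SP + 4, and each frame's SP is the previous frame's FP + 4. The short Python sketch below checks those relationships against the values copied from the dump; check_frames is a name invented for the example.

```python
# (function, RA, SP, FP, SIZE) copied from the "t -f c35bc000" output.
FRAMES = [
    ("schedule",         0xc0113f43, 0xc35bdea8, 0xc35bdefc, 88),
    ("schedule_timeout", 0xc01956dc, 0xc35bdf00, 0xc35bdf20, 36),
    ("read_chan",        0xc0184d18, 0xc35bdf24, 0xc35bdf80, 96),
    ("tty_read",         0xc012c536, 0xc35bdf84, 0xc35bdfa0, 32),
    ("sys_read",         0xc0108db3, 0xc35bdfa4, 0xc35bdfc0, 32),
    ("system_call",      0x400ba534, 0xc35bdfc4, 0xc35bdfec, 44),
]

WORD = 4  # bytes per stack slot on i386


def check_frames(frames):
    """Return a list of inconsistencies; an empty list means all checks pass."""
    problems = []
    for i, (name, _ra, sp, fp, size) in enumerate(frames):
        # Assumed relation: the frame spans SP..FP inclusive of the RA slot.
        if fp - sp + WORD != size:
            problems.append(f"{name}: SIZE {size} != FP - SP + {WORD}")
        # Assumed relation: the caller's frame starts right above the RA slot.
        if i + 1 < len(frames) and frames[i + 1][2] != fp + WORD:
            problems.append(f"{name}: next SP is not FP + {WORD}")
    return problems


print(check_frames(FRAMES))  # []
```

The same dump also shows the return address stored at FP in every frame, for example c0113f43 in the word at c35bdefc for frame 0.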