From owner-lkcd@oss.sgi.com Mon Sep 3 02:22:10 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f839MAq21266 for lkcd-outgoing; Mon, 3 Sep 2001 02:22:10 -0700
Received: from ausmtp02.au.ibm.com (ausmtp02.au.ibm.COM [202.135.136.105]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f839M4d21262 for ; Mon, 3 Sep 2001 02:22:04 -0700
Received: from f02n16e.au.ibm.com by ausmtp02.au.ibm.com (IBM AP 2.0) with ESMTP id f839JSo153952; Mon, 3 Sep 2001 19:19:29 +1000
Received: from d73mta01.au.ibm.com (f06n01s [9.185.166.65]) by f02n16e.au.ibm.com (8.11.1m3/NCO v4.97.1) with SMTP id f839LrB34114; Mon, 3 Sep 2001 19:21:53 +1000
Received: by d73mta01.au.ibm.com(Lotus SMTP MTA v4.6.5 (863.2 5-20-1999)) id CA256ABC.00337123 ; Mon, 3 Sep 2001 19:21:52 +1000
X-Lotus-FromDomain: IBMIN@IBMAU
From: r1vamsi@in.ibm.com
To: "Matt D. Robinson"
cc: richard.schaal@intel.com, lkcd@oss.sgi.com
Message-ID:
Date: Mon, 3 Sep 2001 15:25:04 +0530
Subject: Re: LKCD + KDB ?
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

When both KDB and LKCD patches are applied, we drop into KDB on an oops.
dump_execute will be called after we exit the debugger.

If all you want is to disable dump taking after exiting the debugger, that is
easy enough by editing the dump_okay flag from within the debugger (or
by adding a kdb command to do this), as Matt points out. Assuming there is a good
reason for wanting to take the dump from within the debugger, one should
add a simple dump command to kdb, which will just call dump_execute with
proper regs. What you could do today is to set eip to dump_execute from
within the kernel, editing the stack to push correct params :-) (not as
hard as it sounds, really)

However, the cleaner approach obviously is to add the kdb dump command,
once we understand a little better why exactly one would want to dump from
within the debugger (on an oops).

Regards.. Vamsi.

Vamsi Krishna S.
Linux Technology Center, IBM Software Lab, Bangalore. Ph: +91 80 5262355 Extn: 3959 Internet: r1vamsi@in.ibm.com Please respond to "Matt D. Robinson" To: richard.schaal@intel.com cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) Subject: Re: LKCD + KDB ? Richard Schaal wrote: > > > My question is this - I have been a fan of the kernel debugger for some > time, and have had a bit of difficulty > resolving how to configure both capabilities into my kernel. I guess > what I'd like to have happen is to > have the system enter the debugger on an oops, then have the option of > dumping the system from the debugger, or > to dump the system automatically after the debugger is exited. There's no great way to do this right now. If in kdb you can set the field of 'dump_okay' field to FALSE, then reset it after dropping back from the debugger state, that'd be fine. I guess we could also add in something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, and when dump_execute() gets called, dump_kdb is checked, and if set to TRUE, resets it to FALSE. Then add a kdb command that sets the field for you ... Would that work? --Matt > What is your thinking on this? Did I goof something up in applying the > patches for the two features? > > Thanks, > Richard > > -- > Richard.Schaal@intel.com Intel Corporation > Ph: (408)765-1579 Richard Schaal > Mail Stop SC12-308 > 3600 Juliette Lane > "I can type faster than I think!" 
Santa Clara, CA 95052

From owner-lkcd@oss.sgi.com Mon Sep 3 07:07:38 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f83E7c008761 for lkcd-outgoing; Mon, 3 Sep 2001 07:07:38 -0700
Received: from exg.allot.com (mail.allot.com [199.203.223.202]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f83E7Zd08742 for ; Mon, 3 Sep 2001 07:07:36 -0700
Received: from allot.com (FELIX [172.16.1.37]) by exg.allot.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id RBFMTW0N; Mon, 3 Sep 2001 17:13:42 +0200
Message-ID: <3B938EAD.9C8D4E92@allot.com>
Date: Mon, 03 Sep 2001 17:07:41 +0300
From: Felix Radensky
Organization: Allot Communications Ltd.
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.2.19c i686)
X-Accept-Language: en
MIME-Version: 1.0
To: lkcd@oss.sgi.com
Subject: Using latest CVS sources
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Hi,

Can someone please explain how I can use the latest CVS sources
with kernel 2.2.19?

Thanks in advance.

Felix.

From owner-lkcd@oss.sgi.com Tue Sep 4 00:28:23 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f847SNT12709 for lkcd-outgoing; Tue, 4 Sep 2001 00:28:23 -0700
Received: from fgwmail7.fujitsu.co.jp (fgwmail7.fujitsu.co.jp [192.51.44.37]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f847SKd12706 for ; Tue, 4 Sep 2001 00:28:20 -0700
Received: from m5.gw.fujitsu.co.jp by fgwmail7.fujitsu.co.jp (8.9.3/3.7W-MX0108-Fujitsu Gateway) id QAA23899 for ; Tue, 4 Sep 2001 16:28:14 +0900 (JST) (envelope-from naomi@pst.fujitsu.com)
From: naomi@pst.fujitsu.com
Received: from naomi.aoi.pst.fujitsu.com by m5.gw.fujitsu.co.jp (8.9.3/3.7W-0108-Fujitsu Domain Master) id QAA31473 for ; Tue, 4 Sep 2001 16:28:13 +0900 (envelope-from naomi@pst.fujitsu.com)
Received: from localhost (IDENT:naomi@localhost [127.0.0.1]) by naomi.aoi.pst.fujitsu.com (8.9.3/8.9.3) with ESMTP id QAA16409 for ; Tue, 4 Sep 2001 16:27:53 +0900
To: lkcd@oss.sgi.com
Subject: lcrash sub-commands line completion
X-Mailer: Mew version 1.92.4 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <20010904162753R.naomi@pst.fujitsu.com>
Date: Tue, 04 Sep 2001 16:27:53 +0900
X-Dispatcher: imput version 980905(IM100)
Lines: 34
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Hello.
Recently, I have been thinking that lcrash should have "sub-command line completion".

Lcrash has many sub-commands, and almost all of them take parameters, such as
a file name or symbol name, that must be specified.
The present lcrash cannot complete a sub-command line.
For this reason, we have to memorize sub-command names and parameters exactly,
which is very inconvenient.
So I'll add a completion capability to librl.

I'm considering the following.
While editing a sub-command line, if the TAB key is pressed, lcrash completes the
line (or does something as bash does).
Lcrash will complete sub-command names with behavior almost equivalent to
bash.
And because the parameters of sub-commands have different characteristics from
each other, I'll add a mechanism that lets you write your own completion
function. Using this mechanism, you can call a function that behaves as you
want when the TAB key is pressed.

As the first phase, I will show completion of sub-command names by the
middle of September.
As the next phase, I will show the mechanism for sub-command parameter
completion, with some sample source using it.

Is anybody else considering sub-command line completion?
Any comments and suggestions are welcome.

Naomi Haseo

From owner-lkcd@oss.sgi.com Tue Sep 4 00:44:51 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f847ipV12960 for lkcd-outgoing; Tue, 4 Sep 2001 00:44:51 -0700
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f847ikd12957 for ; Tue, 4 Sep 2001 00:44:46 -0700
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23]) by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id JAA41950; Tue, 4 Sep 2001 09:44:32 +0200
Received: from d12ml004.de.ibm.com (d12ml004_cs0 [9.165.223.50]) by d12relay02.de.ibm.com (8.11.1m3/NCO v4.97.1) with ESMTP id f847hC4217940; Tue, 4 Sep 2001 09:43:12 +0200
Importance: Normal
Subject: Re: lcrash sub-commands line completion
To: naomi@pst.fujitsu.com
Cc: lkcd@oss.sgi.com
X-Mailer: Lotus Notes Release 5.0.3 March 21, 2000
Message-ID:
From: "Michael Holzheu"
Date: Tue, 4 Sep 2001 09:40:53 +0200
X-MIMETrack: Serialize by Router on D12ML004/12/M/IBM(Release 5.0.8 |June 18, 2001) at 04/09/2001 09:39:03
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Naomi,

Great! Command completion is really a feature that makes lcrash
much more user-friendly!

Michael ------------------------------------------------------------------------ Linux/390 Development Phone: +49-7031-16-2360, Bld 71032-06-109 Email: holzheu@de.ibm.com naomi@pst.fujitsu.com@oss.sgi.com on 09/04/2001 09:27:53 AM Please respond to naomi@pst.fujitsu.com Sent by: owner-lkcd@oss.sgi.com To: lkcd@oss.sgi.com cc: Subject: lcrash sub-commands line completion Hello. Recently, I think that lcrash should have "sub-commands line completion". Lcrash has many sub-commands. And almost sub-commands have parameters such as filename or symbol name which should be specified. The present lcrash cannot complete on sub-commands line. For this reason, we have to memorize sub-commands names and parameters exactly. It is very inconvenient. So I'll add completion capability to librl. I'm considering as follows. While editing sub-commands line, if TAB key is pressed, lcrash completes the line (or do something as bash does). Lcrash will complete on sub-commands names with behavior almost equivalent to bash. And I consider that parameters of sub-commands have different characteristic each other, I'll add the mechanism let you be able to make your own completion function. Using this mechanism, you can call the function that behaves as you want when TAB key is pressed. As the first phase, I will show the completion on sub-commands names by the middle of the month in September. And as the next phase, I will show the mechanism of sub-commands parameters completion with some sample source using it. Is anybody considering sub-commands line completion? Any comments and suggestions are welcomed. 
Naomi Haseo

From owner-lkcd@oss.sgi.com Tue Sep 4 01:06:10 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f8486Ag13342 for lkcd-outgoing; Tue, 4 Sep 2001 01:06:10 -0700
Received: from fgwmail7.fujitsu.co.jp (fgwmail7.fujitsu.co.jp [192.51.44.37]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84867d13338 for ; Tue, 4 Sep 2001 01:06:07 -0700
Received: from m5.gw.fujitsu.co.jp by fgwmail7.fujitsu.co.jp (8.9.3/3.7W-MX0108-Fujitsu Gateway) id RAA03736 for ; Tue, 4 Sep 2001 17:06:01 +0900 (JST) (envelope-from m-kotani@pst.fujitsu.com)
Received: from classic.aoi.pst.fujitsu.com by m5.gw.fujitsu.co.jp (8.9.3/3.7W-0108-Fujitsu Domain Master) id RAA14789 for ; Tue, 4 Sep 2001 17:06:00 +0900 (envelope-from m-kotani@pst.fujitsu.com)
Received: from doll (doll.aoi.pst.fujitsu.com [172.23.72.214]) by classic.aoi.pst.fujitsu.com (8.9.3/8.9.3) with SMTP id RAA06417 for ; Tue, 4 Sep 2001 17:06:00 +0900
Message-ID: <006201c13518$869ce140$d64817ac@aoi.pst.fujitsu.com>
From: "Masashige Kotani"
To:
Subject: multiple dump devices
Date: Tue, 4 Sep 2001 17:06:39 +0900
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2615.200
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Hello.

I think that the reliability of memory dumping, that is, capturing as much
memory as possible, needs to be improved.

LKCD uses only one dump device in the process of a memory dump. When that
device does not have enough capacity for the dump, or cannot be used for some
reason, the memory dump fails.

- When additional memory devices are attached, the capacity of the dump
device must be increased.
- To avoid losing a memory dump to a disk failure, we want to add alternative
dump devices.

In these cases, I think LKCD has to be able to handle multiple dump devices
to be useful in different environments.

It can improve the following problems: - When it runs short of capacity in one dump device Dump data can be divided and written in two or more dump devices. - When the dump device is broken The LKCD can dump, If it can use at least one among dump devices. I think that such expansion is indispensable for enterprises use, what do you think? LKCD dumps to multiple dump devices with parallel I/O if possible and time of dumping can be decreased. but it is still under examination. --Masashige From owner-lkcd@oss.sgi.com Tue Sep 4 01:16:26 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f848GQj13505 for lkcd-outgoing; Tue, 4 Sep 2001 01:16:26 -0700 Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f848GMd13500 for ; Tue, 4 Sep 2001 01:16:22 -0700 Received: from alacritech.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id f5G641J02782; Fri, 15 Jun 2001 23:04:01 -0700 Message-ID: <3B948D4F.9D7257B1@alacritech.com> Date: Tue, 04 Sep 2001 01:14:07 -0700 From: "Matt D. Robinson" X-Mailer: Mozilla 4.75 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: naomi@pst.fujitsu.com CC: lkcd@oss.sgi.com Subject: Re: lcrash sub-commands line completion References: <20010904162753R.naomi@pst.fujitsu.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk This sounds like a great thing to add. I have no problems with it. Note that we used to have a readline capability, but we removed it due to some of the GPL/LGPL licensing conflicts. Please let me know if you complete this in the future. I'm still planning to roll a 4.0 release as soon as I talk to the IBM folks about the last code drop I gave them. 
For those who are working directly in the tree, you'll note we're now moving from 'vmdump' to 'dump' conventions, and hopefully all the future scripts will use this as well. Also, I spoke to someone at MCL, and we'll see how we can roll in mcore into the LKCD project in some capacity. Have at it, Naomi-san. :) --Matt naomi@pst.fujitsu.com wrote: > > Hello. > Recently, I think that lcrash should have "sub-commands line completion". > > Lcrash has many sub-commands. And almost sub-commands have parameters such as > filename or symbol name which should be specified. > The present lcrash cannot complete on sub-commands line. > For this reason, we have to memorize sub-commands names and parameters exactly. > It is very inconvenient. > So I'll add completion capability to librl. > > I'm considering as follows. > While editing sub-commands line, if TAB key is pressed, lcrash completes the > line (or do something as bash does). > Lcrash will complete on sub-commands names with behavior almost equivalent to > bash. > And I consider that parameters of sub-commands have different characteristic > each other, I'll add the mechanism let you be able to make your own completion > function. Using this mechanism, you can call the function that behaves as you > want when TAB key is pressed. > > As the first phase, I will show the completion on sub-commands names by the > middle of the month in September. > And as the next phase, I will show the mechanism of sub-commands parameters > completion with some sample source using it. > > Is anybody considering sub-commands line completion? > Any comments and suggestions are welcomed. 
> > Naomi Haseo From owner-lkcd@oss.sgi.com Tue Sep 4 06:53:51 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84Drpu21154 for lkcd-outgoing; Tue, 4 Sep 2001 06:53:51 -0700 Received: from baucis.sc.intel.com (ns3.intel.com [143.183.152.22]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84Drjd21140 for ; Tue, 4 Sep 2001 06:53:45 -0700 Received: from SMTP (fmsmsxvs03-1.fm.intel.com [132.233.42.203]) by baucis.sc.intel.com (8.9.1a+p1/8.9.1/d: relay.m4,v 1.41 2001/07/09 21:06:22 root Exp $) with SMTP id NAA07723; Tue, 4 Sep 2001 13:53:38 GMT Received: from fmsmsx26.fm.intel.com ([132.233.48.26]) by 132.233.48.203 (Norton AntiVirus for Internet Email Gateways 1.0) ; Tue, 04 Sep 2001 13:53:37 0000 (GMT) Received: by fmsmsx26.fm.intel.com with Internet Mail Service (5.5.2653.19) id ; Tue, 4 Sep 2001 06:53:37 -0700 Message-ID: <10C8636AE359D4119118009027AE99870CE2F95B@FMSMSX34> From: "Howell, David P" To: "'Masashige Kotani'" , lkcd@oss.sgi.com Subject: RE: multiple dump devices Date: Tue, 4 Sep 2001 06:53:33 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-2022-jp" Sender: owner-lkcd@oss.sgi.com Precedence: bulk We are working on a proposal for redundant dump device support that I plan to share in the next few weeks; I've got a prototype mostly working that can be contributed. Let me know how you are approaching this, I'll send details of what we are doing here later this week. Sounds like a good opportunity for collaboration on this. Regards, Dave Howell -----Original Message----- From: Masashige Kotani [mailto:m-kotani@pst.fujitsu.com] Sent: Tuesday, September 04, 2001 4:07 AM To: lkcd@oss.sgi.com Subject: multiple dump devices Hello. Nowadays, I think that the reliability of memory dumping: extracting memory as much as possible will be improved. LKCD uses "only one dump device" in process of memory dump. 
When it does not have enough capacity for memory dump it is not able to be used by some reasons, memory dumping is failure. - When additional memory devices are attached, The capacity of the dump device must be increased. - To avoid failing memory dump by disk failure, want to add alternative dump devices. In these cases, I consider that LKCD have to be handle multiple dump devices to be useful in different environments. It can improve the following problems: - When it runs short of capacity in one dump device Dump data can be divided and written in two or more dump devices. - When the dump device is broken The LKCD can dump, If it can use at least one among dump devices. I think that such expansion is indispensable for enterprises use, what do you think? LKCD dumps to multiple dump devices with parallel I/O if possible and time of dumping can be decreased. but it is still under examination. --Masashige From owner-lkcd@oss.sgi.com Tue Sep 4 08:21:52 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84FLqo24064 for lkcd-outgoing; Tue, 4 Sep 2001 08:21:52 -0700 Received: from socal.sandiegoca.ncr.com (tan7.ncr.com [192.127.94.7]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84FLjd24060 for ; Tue, 4 Sep 2001 08:21:45 -0700 Received: from eswssol002.elsegundoca.ncr.com (eswssol002 [141.206.1.4]) by socal.sandiegoca.ncr.com (8.9.3+Sun/8.9.2) with ESMTP id IAA11433; Tue, 4 Sep 2001 08:21:37 -0700 (PDT) Received: (from kim@localhost) by eswssol002.elsegundoca.ncr.com (8.9.3+Sun/8.9.2) id IAA18430; Tue, 4 Sep 2001 08:21:35 -0700 (PDT) Date: Tue, 4 Sep 2001 08:21:35 -0700 From: Moo Kim To: r1vamsi@in.ibm.com Cc: "Matt D. Robinson" , richard.schaal@intel.com, lkcd@oss.sgi.com Subject: Re: LKCD + KDB ? 
Message-ID: <20010904082135.A17366@mailbox.ElSegundoCA.NCR.COM>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: ; from r1vamsi@in.ibm.com on Mon, Sep 03, 2001 at 03:25:04PM +0530
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

I agree that adding a dump (or sysdump) command to KDB would be very useful.
When the node drops into KDB from an oops, one may not have time to examine
the oops online (or the person at the console may not be the developer, but a
test engineer), so one may choose (or be asked) to take a memory dump instead
and analyze the problem later.

Thanks,
Moo Kim
Moo.Kim@NCR.COM
NCR Corporation

On Mon, Sep 03, 2001 at 03:25:04PM +0530, r1vamsi@in.ibm.com wrote:
> When both KDB and LKCD patches are applied, we drop into KDB on an oops.
> dump_execute will be called after we exit the debugger.
>
> If all you want is to disable dump taking after exiting debugger, that is
> easy enough with editing the dump_okay flag from within the debugger (or
> add a kdb command to do this) as Matt points out. Assuming there is a good
> reason for wanting to take the dump from within the debugger, one should
> add a simple dump command to kdb, which will just call dump_execute with
> proper regs. What you could do today is to set eip to dump_execute from
> with in the kernel, editing the stack to push correct params :-) (not as
> hard as it sounds, really)
>
> However, the cleaner approach obviously is to add the kdb dump command,
> once we understand a little better why exactly would one want to dump from
> within the debugger (on an oops).
>
> Regards.. Vamsi.
>
> Vamsi Krishna S.
> Linux Technology Center,
> IBM Software Lab, Bangalore.
> Ph: +91 80 5262355 Extn: 3959
> Internet: r1vamsi@in.ibm.com
>
>
> Please respond to "Matt D. Robinson"
>
> To: richard.schaal@intel.com
> cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM)
> Subject: Re: LKCD + KDB ?
> > > > Richard Schaal wrote: > > > > > > My question is this - I have been a fan of the kernel debugger for some > > time, and have had a bit of difficulty > > resolving how to configure both capabilities into my kernel. I guess > > what I'd like to have happen is to > > have the system enter the debugger on an oops, then have the option of > > dumping the system from the debugger, or > > to dump the system automatically after the debugger is exited. > > There's no great way to do this right now. If in kdb you can set the > field of 'dump_okay' field to FALSE, then reset it after dropping back > from the debugger state, that'd be fine. I guess we could also add in > something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, > and when dump_execute() gets called, dump_kdb is checked, and if set > to TRUE, resets it to FALSE. Then add a kdb command that sets the > field for you ... > > Would that work? > > --Matt > > > What is your thinking on this? Did I goof something up in applying the > > patches for the two features? > > > > Thanks, > > Richard > > > > -- > > Richard.Schaal@intel.com Intel Corporation > > Ph: (408)765-1579 Richard Schaal > > Mail Stop SC12-308 > > 3600 Juliette Lane > > "I can type faster than I think!" 
Santa Clara, CA 95052 > From owner-lkcd@oss.sgi.com Tue Sep 4 09:13:27 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84GDR826197 for lkcd-outgoing; Tue, 4 Sep 2001 09:13:27 -0700 Received: from baucis.sc.intel.com (ns3.intel.com [143.183.152.22]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84GDLd26192 for ; Tue, 4 Sep 2001 09:13:21 -0700 Received: from SMTP (fmsmsxvs03-1.fm.intel.com [132.233.42.203]) by baucis.sc.intel.com (8.9.1a+p1/8.9.1/d: relay.m4,v 1.41 2001/07/09 21:06:22 root Exp $) with SMTP id QAA25092; Tue, 4 Sep 2001 16:13:13 GMT Received: from fmsmsx26.fm.intel.com ([132.233.48.26]) by 132.233.48.203 (Norton AntiVirus for Internet Email Gateways 1.0) ; Tue, 04 Sep 2001 16:13:11 0000 (GMT) Received: by fmsmsx26.fm.intel.com with Internet Mail Service (5.5.2653.19) id ; Tue, 4 Sep 2001 09:13:10 -0700 Message-ID: <68843F808BE5D311AC6100A0C9C5786648485A@fmsmsx50.fm.intel.com> From: "Schaal, Richard" To: "'r1vamsi@in.ibm.com'" , "Matt D. Robinson" Cc: "Schaal, Richard" , lkcd@oss.sgi.com Subject: RE: LKCD + KDB ? Date: Tue, 4 Sep 2001 09:13:05 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-lkcd@oss.sgi.com Precedence: bulk I think it would be relatively simple to have the dump_init code register a dump system function with the kernel debugger so that you could dump the system on demand. Note that not all problems are Oops related, and that a hung system, or one that is grossly under performing would be useful to get a snapshot of the activity at that time. Manual entry to the debugger and manual dump would seem to be a useful thing. - System survivability after such a dump would be nice, but not a show stopper at this point. 
So far as the dumping or not after an oops and entering kdb, there is a differentiation as to the reason for entering the debugger - you might derive a dump/no dump directive from whether you enter the debugger by reason of breakpoint or oops? I used to work for Stratus Computer - at that time, a panic or oops would put us into the debugger, and if we were successful in patching up the problem, the system could resume execution. In Linux, after an oops, maybe a "nodump" command would be useful as well to disable the dumping that might normally occur. Regards, Richard -----Original Message----- From: r1vamsi@in.ibm.com [mailto:r1vamsi@in.ibm.com] Sent: Monday, September 03, 2001 2:55 AM To: Matt D. Robinson Cc: richard.schaal@intel.com; lkcd@oss.sgi.com Subject: Re: LKCD + KDB ? When both KDB and LKCD patches are applied, we drop into KDB on an oops. dump_execute will be called after we exit the debugger. If all you want is to disable dump taking after exiting debugger, that is easy enough with editing the dump_okay flag from within the debugger (or add a kdb command to do this) as Matt points out. Assuming there is a good reason for wanting to take the dump from within the debugger, one should add a simple dump command to kdb, which will just call dump_execute with proper regs. What you could do today is to set eip to dump_execute from with in the kernel, editing the stack to push correct params :-) (not as hard as it sounds, really) However, the cleaner approach obviously is to add the kdb dump command, once we understand a little better why exactly would one want to dump from within the debugger (on an oops). Regards.. Vamsi. Vamsi Krishna S. Linux Technology Center, IBM Software Lab, Bangalore. Ph: +91 80 5262355 Extn: 3959 Internet: r1vamsi@in.ibm.com Please respond to "Matt D. Robinson" To: richard.schaal@intel.com cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) Subject: Re: LKCD + KDB ? 
Richard Schaal wrote: > > > My question is this - I have been a fan of the kernel debugger for some > time, and have had a bit of difficulty > resolving how to configure both capabilities into my kernel. I guess > what I'd like to have happen is to > have the system enter the debugger on an oops, then have the option of > dumping the system from the debugger, or > to dump the system automatically after the debugger is exited. There's no great way to do this right now. If in kdb you can set the field of 'dump_okay' field to FALSE, then reset it after dropping back from the debugger state, that'd be fine. I guess we could also add in something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, and when dump_execute() gets called, dump_kdb is checked, and if set to TRUE, resets it to FALSE. Then add a kdb command that sets the field for you ... Would that work? --Matt > What is your thinking on this? Did I goof something up in applying the > patches for the two features? > > Thanks, > Richard > > -- > Richard.Schaal@intel.com Intel Corporation > Ph: (408)765-1579 Richard Schaal > Mail Stop SC12-308 > 3600 Juliette Lane > "I can type faster than I think!" Santa Clara, CA 95052 From owner-lkcd@oss.sgi.com Tue Sep 4 16:27:18 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84NRIc03514 for lkcd-outgoing; Tue, 4 Sep 2001 16:27:18 -0700 Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84NR8d03511 for ; Tue, 4 Sep 2001 16:27:08 -0700 Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f84NMDO00430; Tue, 4 Sep 2001 16:22:13 -0700 Message-ID: <3B9563E8.9A432B7B@alacritech.com> Date: Tue, 04 Sep 2001 16:29:44 -0700 From: "Matt D. Robinson" Organization: Alacritech, Inc. 
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686) X-Accept-Language: en MIME-Version: 1.0 To: "Howell, David P" CC: "'Masashige Kotani'" , lkcd@oss.sgi.com Subject: Re: multiple dump devices References: <10C8636AE359D4119118009027AE99870CE2F95B@FMSMSX34> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk "Howell, David P" wrote: > > We are working on a proposal for redundant dump device support that I plan > to > share in the next few weeks; I've got a prototype mostly working that can be > > contributed. Let me know how you are approaching this, I'll send details of > what we are doing here later this week. Sounds like a good opportunity for > collaboration on this. > > Regards, > Dave Howell I'm really curious as to the proposal. Sounds like a good idea, the real question becomes, do you want to chain multiple dump devices with multiple dump mechanisms? Here's where I'm going with this. I just finished the code to allow people to install their own dump compression mechanisms (right now, it'll be RLE, I have to check in the GZIP compression module, and people can put in whatever one they want). Do you want to take the next step and let people have chains of dump mechanisms based on the dump condition? I realize multiple dump devices is good, but what if you could plug in your own dump method with it? Then that dump method could query the available dump devices configured. So you'd have: dump methods (one standard, but plug-and-play) dump devices (requires at least one, multiples allowed, maybe access lists for methods?) dump compressions (configurable, usable by some methods) Would this be the eventual goal? That way, everything is tunable to their own liking. I figured I'd ask, since if you're going to add in multiple dump devices, and we've gone to multiple compression types, you might as well go all the way and add dump methods as well. 
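
The three layers listed above (methods, devices, compressions) can be pictured with a small user-space sketch in C. This is only an illustration under stated assumptions, not LKCD code: every name in it (dump_compress_ops, dump_device, rle_compress, dump_page, DUMP_PAGE_SIZE) is hypothetical, the RLE format is a simple (count, value) byte-pair scheme chosen for brevity, and the "devices" only keep a byte count rather than doing real I/O.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is LKCD's real interface. */
#define DUMP_PAGE_SIZE 4096

/* A compression plug-in: returns bytes written to dst, or 0 if dst is too small. */
struct dump_compress_ops {
    const char *name;
    size_t (*compress)(const unsigned char *src, size_t len,
                       unsigned char *dst, size_t cap);
};

/* A dump device, modelled as a byte budget; 'failed' stands in for a broken disk. */
struct dump_device {
    const char *name;
    size_t capacity;
    size_t used;
    int failed;
};

/* Byte-oriented RLE: emits (count, value) pairs with count in 1..255. */
static size_t rle_compress(const unsigned char *src, size_t len,
                           unsigned char *dst, size_t cap)
{
    size_t in = 0, out = 0;

    while (in < len) {
        unsigned char value = src[in];
        size_t run = 1;

        while (in + run < len && src[in + run] == value && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = (unsigned char)run;
        dst[out++] = value;
        in += run;
    }
    return out;
}

static const struct dump_compress_ops rle_ops = { "rle", rle_compress };

/*
 * A dump method: compress one page with the chosen plug-in, then charge it
 * to the first device that is healthy and has room. Returns the index of
 * the device used, or -1 if no configured device can take the page.
 */
static int dump_page(const struct dump_compress_ops *comp,
                     struct dump_device *devs, size_t ndevs,
                     const unsigned char *page, size_t len)
{
    unsigned char buf[2 * DUMP_PAGE_SIZE];   /* worst case for this RLE is 2x */
    size_t size, i;

    assert(len <= DUMP_PAGE_SIZE);
    size = comp->compress(page, len, buf, sizeof(buf));
    if (size == 0 || size > len)
        size = len;                          /* no gain: account for the raw page */

    for (i = 0; i < ndevs; i++) {
        if (devs[i].failed || devs[i].capacity - devs[i].used < size)
            continue;                        /* broken or full: try the next one */
        devs[i].used += size;
        return (int)i;
    }
    return -1;
}
```

In this model a different compression (say, a gzip wrapper) is just another dump_compress_ops instance, and a different dump method is just another function with dump_page's shape; the device list is the only state they share.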
I don't know what the rest of the group thinks, but this could be very useful. I'd definitely like to get some feedback ... this is all doable, as long as the dump compression code is in 'lcrash', and the pages are dumped in a way that we can find the location in memory, this can work pretty sweet for everyone here. --Matt > -----Original Message----- > From: Masashige Kotani [mailto:m-kotani@pst.fujitsu.com] > Sent: Tuesday, September 04, 2001 4:07 AM > To: lkcd@oss.sgi.com > Subject: multiple dump devices > > Hello. > > Nowadays, I think that the reliability of memory dumping: extracting memory > as much as possible will be improved. > > LKCD uses "only one dump device" in process of memory dump. When it does not > have enough capacity for memory dump it is not able to be used by some > reasons, memory dumping is failure. > > - When additional memory devices are attached, The capacity of the dump > device must be increased. > - To avoid failing memory dump by disk failure, want to add alternative dump > devices. > In these cases, I consider that LKCD have to be handle multiple dump devices > to be useful in different environments. > > It can improve the following problems: > - When it runs short of capacity in one dump device > Dump data can be divided and written in two or more dump devices. > - When the dump device is broken > The LKCD can dump, If it can use at least one among dump devices. > > I think that such expansion is indispensable for enterprises use, what do > you think? > > LKCD dumps to multiple dump devices with parallel I/O if possible and time > of dumping can be decreased. but it is still under examination. 
> > --Masashige From owner-lkcd@oss.sgi.com Tue Sep 4 16:29:03 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84NT3h03554 for lkcd-outgoing; Tue, 4 Sep 2001 16:29:03 -0700 Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84NT1d03551 for ; Tue, 4 Sep 2001 16:29:01 -0700 Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f84NO3O00476; Tue, 4 Sep 2001 16:24:03 -0700 Message-ID: <3B956457.DDBBE9CF@alacritech.com> Date: Tue, 04 Sep 2001 16:31:35 -0700 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686) X-Accept-Language: en MIME-Version: 1.0 To: Felix Radensky CC: lkcd@oss.sgi.com Subject: Re: Using latest CVS sources References: <3B938EAD.9C8D4E92@allot.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Felix Radensky wrote: > > Hi, > > Can someone please explain how can I use the latest CVS sources > with kernel 2.2.19. > > Thanks in advance. > > Felix. Hi, Felix. The latest 2.2 tree is a bit behind what we're currently doing, and I haven't tried applying some of this stuff to 2.2 as of yet. The last state I left the 2.2 tree in was to at least allow you to dump to IDE disks as well, and has the Kerntypes mechanism in place. Is there some feature you're looking for in particular? 
--Matt From owner-lkcd@oss.sgi.com Tue Sep 4 16:34:46 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84NYku03702 for lkcd-outgoing; Tue, 4 Sep 2001 16:34:46 -0700 Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84NYdd03699 for ; Tue, 4 Sep 2001 16:34:39 -0700 Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f84NThO00620; Tue, 4 Sep 2001 16:29:43 -0700 Message-ID: <3B9565AA.FA9C1B2D@alacritech.com> Date: Tue, 04 Sep 2001 16:37:14 -0700 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686) X-Accept-Language: en MIME-Version: 1.0 To: "Schaal, Richard" CC: "'r1vamsi@in.ibm.com'" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au Subject: Re: LKCD + KDB ? References: <68843F808BE5D311AC6100A0C9C5786648485A@fmsmsx50.fm.intel.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk "Schaal, Richard" wrote: > > I think it would be relatively simple to have the dump_init code register a > dump system > function with the kernel debugger so that you could dump the system on > demand. Note that > not all problems are Oops related, and that a hung system, or one that is > grossly under performing > would be useful to get a snapshot of the activity at that time. Manual > entry to the debugger > and manual dump would seem to be a useful thing. - System survivability > after such a dump would be > nice, but not a show stopper at this point. You should already be able to do this with dump_function_ptr in the latest code. This should be assigned to dump_execute (at least in the last check-in I made). So if you call through that pointer, you'll invoke the dump function.
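A minimal user-space sketch of calling through such a hook, for illustration only. The function signature, the cut-down struct pt_regs, and the helper names dump_init_sketch() and request_dump() are assumptions made for this example and are not taken from the LKCD sources; only the names dump_function_ptr and dump_execute come from the message above.

```c
#include <stddef.h>

/* Cut-down stand-in for the kernel's register snapshot (hypothetical;
 * the real struct pt_regs is architecture specific). */
struct pt_regs {
    unsigned long eip;
    unsigned long esp;
};

/* Counts dump requests so the sketch has an observable effect. */
int dumps_taken = 0;

/* Stub with an ASSUMED signature; the real dump_execute() would write
 * memory to the dump device instead of bumping a counter. */
void dump_execute(const char *panic_str, const struct pt_regs *regs)
{
    (void)panic_str;
    (void)regs;
    dumps_taken++;
}

/* The hook: NULL until the dump driver installs its entry point. */
void (*dump_function_ptr)(const char *, const struct pt_regs *) = NULL;

/* Hypothetical init step: point the hook at dump_execute. */
void dump_init_sketch(void)
{
    dump_function_ptr = dump_execute;
}

/* What a debugger command might do: call through the hook if it is set.
 * Returns 0 when a dump was requested, -1 when no hook is installed. */
int request_dump(const char *reason, const struct pt_regs *regs)
{
    if (dump_function_ptr == NULL)
        return -1;
    dump_function_ptr(reason, regs);
    return 0;
}
```

Under the same assumptions, a kdb "dump" command could be a thin wrapper around request_dump(); checking the pointer first keeps the call safe when the dump driver has not been initialized.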
> So far as the dumping or not after an oops and entering kdb, there is a > differentiation as to the reason > for entering the debugger - you might derive a dump/no dump directive from > whether you enter the debugger > by reason of breakpoint or oops? I'm curious, how many people drop into kdb, and then want to take a dump? I'd think that this is very useful for developers, but not as useful for customers who want to crash and reboot. > I used to work for Stratus Computer - at that time, a panic or oops would > put us into the debugger, and if we > were successful in patching up the problem, the system could resume > execution. In Linux, after an oops, maybe > a "nodump" command would be useful as well to disable the dumping that might > normally occur. This is fine -- I think these are all reasonable extensions to KDB, and I can work with that developer if need be to make that happen. There's an easy solution, one way or another. --Matt > Regards, > Richard > > -----Original Message----- > From: r1vamsi@in.ibm.com [mailto:r1vamsi@in.ibm.com] > Sent: Monday, September 03, 2001 2:55 AM > To: Matt D. Robinson > Cc: richard.schaal@intel.com; lkcd@oss.sgi.com > Subject: Re: LKCD + KDB ? > > When both KDB and LKCD patches are applied, we drop into KDB on an oops. > dump_execute will be called after we exit the debugger. > > If all you want is to disable dump taking after exiting debugger, that is > easy enough with editing the dump_okay flag from within the debugger (or > add a kdb command to do this) as Matt points out. Assuming there is a good > reason for wanting to take the dump from within the debugger, one should > add a simple dump command to kdb, which will just call dump_execute with > proper regs. 
What you could do today is to set eip to dump_execute from > with in the kernel, editing the stack to push correct params :-) (not as > hard as it sounds, really) > > However, the cleaner approach obviously is to add the kdb dump command, > once we understand a little better why exactly would one want to dump from > within the debugger (on an oops). > > Regards.. Vamsi. > > Vamsi Krishna S. > Linux Technology Center, > IBM Software Lab, Bangalore. > Ph: +91 80 5262355 Extn: 3959 > Internet: r1vamsi@in.ibm.com > > Please respond to "Matt D. Robinson" > > To: richard.schaal@intel.com > cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) > Subject: Re: LKCD + KDB ? > > Richard Schaal wrote: > > > > > > My question is this - I have been a fan of the kernel debugger for some > > time, and have had a bit of difficulty > > resolving how to configure both capabilities into my kernel. I guess > > what I'd like to have happen is to > > have the system enter the debugger on an oops, then have the option of > > dumping the system from the debugger, or > > to dump the system automatically after the debugger is exited. > > There's no great way to do this right now. If in kdb you can set the > field of 'dump_okay' field to FALSE, then reset it after dropping back > from the debugger state, that'd be fine. I guess we could also add in > something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, > and when dump_execute() gets called, dump_kdb is checked, and if set > to TRUE, resets it to FALSE. Then add a kdb command that sets the > field for you ... > > Would that work? > > --Matt > > > What is your thinking on this? Did I goof something up in applying the > > patches for the two features? > > > > Thanks, > > Richard > > > > -- > > Richard.Schaal@intel.com Intel Corporation > > Ph: (408)765-1579 Richard Schaal > > Mail Stop SC12-308 > > 3600 Juliette Lane > > "I can type faster than I think!" 
Santa Clara, CA 95052 From owner-lkcd@oss.sgi.com Tue Sep 4 16:48:23 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f84NmNE03878 for lkcd-outgoing; Tue, 4 Sep 2001 16:48:23 -0700 Received: from mail.ocs.com.au (ppp0.ocs.com.au [203.34.97.3]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f84NmKd03875 for ; Tue, 4 Sep 2001 16:48:20 -0700 Received: (qmail 620 invoked from network); 4 Sep 2001 23:48:17 -0000 Received: from ocs3.intra.ocs.com.au (192.168.255.3) by mail.ocs.com.au with SMTP; 4 Sep 2001 23:48:17 -0000 Received: by ocs3.intra.ocs.com.au (Postfix, from userid 16331) id C1DF630008C; Wed, 5 Sep 2001 09:47:36 +1000 (EST) Received: from ocs3.intra.ocs.com.au (localhost [127.0.0.1]) by ocs3.intra.ocs.com.au (Postfix) with ESMTP id B4F4C9E; Wed, 5 Sep 2001 09:47:36 +1000 (EST) X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4 From: Keith Owens To: "Matt D. Robinson" Cc: "Schaal, Richard" , lkcd@oss.sgi.com Subject: Re: LKCD + KDB ? In-reply-to: Your message of "Tue, 04 Sep 2001 16:37:14 MST." <3B9565AA.FA9C1B2D@alacritech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 05 Sep 2001 09:47:31 +1000 Message-ID: <14389.999647251@ocs3.intra.ocs.com.au> Sender: owner-lkcd@oss.sgi.com Precedence: bulk On Tue, 04 Sep 2001 16:37:14 -0700, "Matt D. Robinson" wrote: >"Schaal, Richard" wrote: >> I used to work for Stratus Computer - at that time, a panic or oops would >> put us into the debugger, and if we >> were successful in patching up the problem, the system could resume >> execution. In Linux, after an oops, maybe >> a "nodump" command would be useful as well to disable the dumping that might >> normally occur. > >This is fine -- I think these are all reasonable extensions to KDB, and >I can work with that developer if need be to make that happen. There's >an easy solution, one way or another. No need to involve me. Any code can register its own kdb commands as long as it runs after kdb init. 
IOW, the nodump command can be part of lkcd, no changes to kdb required. Just wrap it in #ifdef CONFIG_KDB.
From owner-lkcd@oss.sgi.com Tue Sep 4 17:04:16 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f8504GP04324 for lkcd-outgoing; Tue, 4 Sep 2001 17:04:16 -0700 Received: from thalia.fm.intel.com (fmfdns02.fm.intel.com [132.233.247.11]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f85044d04320 for ; Tue, 4 Sep 2001 17:04:04 -0700 Received: from fmsmsxvs040.fm.intel.com (fmsmsxv040-1.fm.intel.com [132.233.48.108]) by thalia.fm.intel.com (8.9.1a+p1/8.9.1/d: relay.m4,v 1.42 2001/09/04 16:24:19 root Exp $) with SMTP id AAA04098 for ; Wed, 5 Sep 2001 00:04:02 GMT Received: from fmsmsx17.intel.com ([132.233.48.17]) by fmsmsxvs040.fm.intel.com (NAVGW 2.5.1.6) with SMTP id M2001090417030616419 ; Tue, 04 Sep 2001 17:03:06 -0700 Received: by fmsmsx17.fm.intel.com with Internet Mail Service (5.5.2653.19) id ; Tue, 4 Sep 2001 17:03:05 -0700 Message-ID: <68843F808BE5D311AC6100A0C9C5786648485D@fmsmsx50.fm.intel.com> From: "Schaal, Richard" To: "'Matt D. Robinson'" , "Schaal, Richard" Cc: "'r1vamsi@in.ibm.com'" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au Subject: RE: LKCD + KDB ? Date: Tue, 4 Sep 2001 17:01:26 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-lkcd@oss.sgi.com Precedence: bulk Hi Matt, When you refer to the "latest code", what is that? I don't see anything on SourceForge as released code, and the latest from the SGI site has patches up to linux-2.4.4. Is that what you were referring to? Thanks, Richard
From owner-lkcd@oss.sgi.com Tue Sep 4 17:09:39 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f8509d004558 for lkcd-outgoing; Tue, 4 Sep 2001 17:09:39 -0700 Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f8509Ud04554 for ; Tue, 4 Sep 2001 17:09:30 -0700 Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f8504KO01594; Tue, 4 Sep 2001 17:04:20 -0700 Message-ID: <3B956DC7.F5F3559F@alacritech.com> Date: Tue, 04 Sep 2001 17:11:51 -0700 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686) X-Accept-Language: en MIME-Version: 1.0 To: "Schaal, Richard" CC: "'r1vamsi@in.ibm.com'" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au Subject: Re: LKCD + KDB ? References: <68843F808BE5D311AC6100A0C9C5786648485D@fmsmsx50.fm.intel.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk "Schaal, Richard" wrote: > > Hi Matt, > When you refer to the "latest code", what is that? I don't see anything on > source forge as released code, and the > latest from the SGI site has patches up to linux-2.4.4 is that what you were > referring to? > > Thanks, > Richard The latest code is in the SourceForge tree ... look in 2.4/drivers/block/dump.c, and you'll see the restructuring changes. 'lcrash' has also changed a bit. I copied the LKCD group on my last check-in. If you didn't get a copy of it, let me know. It touched a bunch of files. I have to check in new scripts and a new dumpconfig utility next (and fix this bloody SMP problem now that I actually have an SMP system again to test against). --Matt
From owner-lkcd@oss.sgi.com Tue Sep 4 22:28:10 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f855SAo13551 for lkcd-outgoing; Tue, 4 Sep 2001 22:28:10 -0700 Received: from smtp02.vsnl.net (smtp02.vsnl.net [203.197.12.8]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f855Rwd13548 for ; Tue, 4 Sep 2001 22:27:58 -0700 Received: from vsnl.net ([203.199.156.60]) by smtp02.vsnl.net (Netscape Messaging Server 4.15) with ESMTP id GJ69RN01.JCY; Wed, 5 Sep 2001 09:58:35 +0530 Message-ID: <3B959ED0.E33BA3BC@vsnl.net> Date: Wed, 05 Sep 2001 09:11:04 +0530 From: "Amit S. Kale" X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.4.6 i686) X-Accept-Language: en MIME-Version: 1.0 To: "Matt D. Robinson" CC: "Schaal, Richard" , "'r1vamsi@in.ibm.com'" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au Subject: Re: LKCD + KDB ? References: <68843F808BE5D311AC6100A0C9C5786648485D@fmsmsx50.fm.intel.com> <3B956DC7.F5F3559F@alacritech.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Hi Matt, I have several times faced the problem of crash dumps not being available in kgdb. Often I don't have time to debug a panic immediately, so I keep the machine inside the debugger. A crash dump facility would let me save the state and continue testing. I can get back to the core dump later. Usually it's a good idea to save cores for all non-trivial problems once a product goes alpha. If a problem which is supposedly fixed resurfaces, it's very difficult to say whether it's the same problem in the absence of a core dump. In an ideal world, all problems would be fixed immediately and completely using a debugger and we wouldn't need crash dumps. I guess it's time to think about making kgdb understand the lkcd interface. "Matt D. Robinson" wrote: > > "Schaal, Richard" wrote: > > > > Hi Matt, > > When you refer to the "latest code", what is that?
I don't see anything on > > source forge as released code, and the > > latest from the SGI site has patches up to linux-2.4.4 is that what you were > > referring to? > > > > Thanks, > > Richard > > The latest code is in the SourceForge tree ... look in > 2.4/drivers/block/dump.c, > and you'll see the restructuring changes. 'lcrash' has also changed a bit. > I copied the LKCD group on my last check-in. If you didn't get a copy of it, > let me know. It touched a bunch of files. > > I have to check in new scripts and a new dumpconfig utility next (and fix > this bloody SMP problem now that I actually have an SMP system again to test > against). > > --Matt -- Amit S. Kale Linux Consultant, Pune, India. (kgdb@vsnl.net) Linux kernel source level debugger http://kgdb.sourceforge.net/ Translation filesystem http://trfs.sourceforge.net/
From owner-lkcd@oss.sgi.com Tue Sep 4 22:49:29 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f855nT913939 for lkcd-outgoing; Tue, 4 Sep 2001 22:49:29 -0700 Received: from ausmtp01.au.ibm.com (ausmtp01.au.ibm.COM [202.135.136.97]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f855nLd13924 for ; Tue, 4 Sep 2001 22:49:21 -0700 Received: from f02n16e.au.ibm.com by ausmtp01.au.ibm.com (IBM AP 2.0) with ESMTP id f855jPT212220; Wed, 5 Sep 2001 15:45:25 +1000 Received: from d73mta01.au.ibm.com (f06n01s [9.185.166.65]) by f02n16e.au.ibm.com (8.11.1m3/NCO v4.97.1) with SMTP id f855mq571344; Wed, 5 Sep 2001 15:48:52 +1000 Received: by d73mta01.au.ibm.com(Lotus SMTP MTA v4.6.5 (863.2 5-20-1999)) id CA256ABE.001FEE76 ; Wed, 5 Sep 2001 15:48:46 +1000 X-Lotus-FromDomain: IBMIN@IBMAU From: r1vamsi@in.ibm.com To: Keith Owens cc: "Matt D. Robinson" , "Schaal, Richard" , lkcd@oss.sgi.com Message-ID: Date: Wed, 5 Sep 2001 11:12:59 +0530 Subject: Re: LKCD + KDB ? (link/init order) Mime-Version: 1.0 Content-type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-lkcd@oss.sgi.com Precedence: bulk Keith, Lkcd could very well register kdb command and do what ever. However, when lkcd is linked into the kernel (which is the case most of the time), how can it be sure that kdb is initialized before lkcd's init (where in it could call kdb_register()) ? Is there any other way to ensure correct ordering of init calls, besides linking the objects in the desired sequence in the Makefiles? Regards.. Vamsi.
Vamsi Krishna S. Linux Technology Center, IBM Software Lab, Bangalore. Ph: +91 80 5262355 Extn: 3959 Internet: r1vamsi@in.ibm.com Keith Owens on 09/05/2001 05:17:31 AM Please respond to Keith Owens To: "Matt D. Robinson" cc: "Schaal, Richard" , lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) Subject: Re: LKCD + KDB ? On Tue, 04 Sep 2001 16:37:14 -0700, "Matt D. Robinson" wrote: >"Schaal, Richard" wrote: >> I used to work for Stratus Computer - at that time, a panic or oops would >> put us into the debugger, and if we >> were successful in patching up the problem, the system could resume >> execution. In Linux, after an oops, maybe >> a "nodump" command would be useful as well to disable the dumping that might >> normally occur. > >This is fine -- I think these are all reasonable extensions to KDB, and >I can work with that developer if need be to make that happen. There's >an easy solution, one way or another. No need to involve me. Any code can register its own kdb commands as long as it runs after kdb init. IOW, the nodump command can be part of lkcd, no changes to kdb required. Just wrap it in #ifdef CONFIG_KDB. From owner-lkcd@oss.sgi.com Tue Sep 4 23:20:43 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f856Kh314406 for lkcd-outgoing; Tue, 4 Sep 2001 23:20:43 -0700 Received: from ausmtp01.au.ibm.com (ausmtp01.au.ibm.COM [202.135.136.97]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f856Eid14341 for ; Tue, 4 Sep 2001 23:20:21 -0700 Received: from f02n16e.au.ibm.com by ausmtp01.au.ibm.com (IBM AP 2.0) with ESMTP id f8565KT127370; Wed, 5 Sep 2001 16:05:21 +1000 Received: from d73mta01.au.ibm.com (f06n01s [9.185.166.65]) by f02n16e.au.ibm.com (8.11.1m3/NCO v4.97.1) with SMTP id f8568k535402; Wed, 5 Sep 2001 16:08:46 +1000 Received: by d73mta01.au.ibm.com(Lotus SMTP MTA v4.6.5 (863.2 5-20-1999)) id CA256ABE.0021C384 ; Wed, 5 Sep 2001 16:08:47 +1000 X-Lotus-FromDomain: IBMIN@IBMAU From: r1vamsi@in.ibm.com To: "Matt D. 
Robinson"
cc: "Schaal, Richard" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au
Message-ID:
Date: Wed, 5 Sep 2001 11:24:31 +0530
Subject: Re: LKCD + KDB ?
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Richard,

I agree with you completely on the rationale for wanting to dump from kdb.
In fact, one could choose to trigger a dump (after which the system will
likely continue to run), not just from KDB, which requires manual
intervention, but from other debugging tools such as the IBM Dynamic
Probes, where this could be done automatically.

We are building "non-disruptive" dumps capability into lkcd, which will let
the system continue normal execution after the dump is taken.

These features will probably find more use when dumps are used for
debugging other problem situations like performance related problems besides
oops/panics.

Regards.. Vamsi.

Vamsi Krishna S.
Linux Technology Center,
IBM Software Lab, Bangalore.
Ph: +91 80 5262355 Extn: 3959
Internet: r1vamsi@in.ibm.com

"Matt D. Robinson" on 09/05/2001 05:07:14 AM

Please respond to "Matt D. Robinson"

To: "Schaal, Richard"
cc: S Vamsikrishna/India/IBM@IBMIN, lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au
Subject: Re: LKCD + KDB ?

"Schaal, Richard" wrote:
>
> I think it would be relatively simple to have the dump_init code register a
> dump system
> function with the kernel debugger so that you could dump the system on
> demand. Note that
> not all problems are Oops related, and that a hung system, or one that is
> grossly under performing
> would be useful to get a snapshot of the activity at that time. Manual
> entry to the debugger
> and manual dump would seem to be a useful thing. - System survivability
> after such a dump would be
> nice, but not a show stopper at this point.

You should already be able to do this with dump_function_ptr in the latest code.
This should be assigned to dump_execute (at least in the last check-in I made). So if you call that address, you'll get the dump function pointer. > So far as the dumping or not after an oops and entering kdb, there is a > differentiation as to the reason > for entering the debugger - you might derive a dump/no dump directive from > whether you enter the debugger > by reason of breakpoint or oops? I'm curious, how many people drop into kdb, and then want to take a dump? I'd think that this is very useful for developers, but not as useful for customers who want to crash and reboot. > I used to work for Stratus Computer - at that time, a panic or oops would > put us into the debugger, and if we > were successful in patching up the problem, the system could resume > execution. In Linux, after an oops, maybe > a "nodump" command would be useful as well to disable the dumping that might > normally occur. This is fine -- I think these are all reasonable extensions to KDB, and I can work with that developer if need be to make that happen. There's an easy solution, one way or another. --Matt > Regards, > Richard > > -----Original Message----- > From: r1vamsi@in.ibm.com [mailto:r1vamsi@in.ibm.com] > Sent: Monday, September 03, 2001 2:55 AM > To: Matt D. Robinson > Cc: richard.schaal@intel.com; lkcd@oss.sgi.com > Subject: Re: LKCD + KDB ? > > When both KDB and LKCD patches are applied, we drop into KDB on an oops. > dump_execute will be called after we exit the debugger. > > If all you want is to disable dump taking after exiting debugger, that is > easy enough with editing the dump_okay flag from within the debugger (or > add a kdb command to do this) as Matt points out. Assuming there is a good > reason for wanting to take the dump from within the debugger, one should > add a simple dump command to kdb, which will just call dump_execute with > proper regs. 
What you could do today is to set eip to dump_execute from > with in the kernel, editing the stack to push correct params :-) (not as > hard as it sounds, really) > > However, the cleaner approach obviously is to add the kdb dump command, > once we understand a little better why exactly would one want to dump from > within the debugger (on an oops). > > Regards.. Vamsi. > > Vamsi Krishna S. > Linux Technology Center, > IBM Software Lab, Bangalore. > Ph: +91 80 5262355 Extn: 3959 > Internet: r1vamsi@in.ibm.com > > Please respond to "Matt D. Robinson" > > To: richard.schaal@intel.com > cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) > Subject: Re: LKCD + KDB ? > > Richard Schaal wrote: > > > > > > My question is this - I have been a fan of the kernel debugger for some > > time, and have had a bit of difficulty > > resolving how to configure both capabilities into my kernel. I guess > > what I'd like to have happen is to > > have the system enter the debugger on an oops, then have the option of > > dumping the system from the debugger, or > > to dump the system automatically after the debugger is exited. > > There's no great way to do this right now. If in kdb you can set the > field of 'dump_okay' field to FALSE, then reset it after dropping back > from the debugger state, that'd be fine. I guess we could also add in > something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, > and when dump_execute() gets called, dump_kdb is checked, and if set > to TRUE, resets it to FALSE. Then add a kdb command that sets the > field for you ... > > Would that work? > > --Matt > > > What is your thinking on this? Did I goof something up in applying the > > patches for the two features? > > > > Thanks, > > Richard > > > > -- > > Richard.Schaal@intel.com Intel Corporation > > Ph: (408)765-1579 Richard Schaal > > Mail Stop SC12-308 > > 3600 Juliette Lane > > "I can type faster than I think!" 
Santa Clara, CA 95052 From owner-lkcd@oss.sgi.com Tue Sep 4 23:52:11 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f856qBV14802 for lkcd-outgoing; Tue, 4 Sep 2001 23:52:11 -0700 Received: from pneumatic-tube.sgi.com (pneumatic-tube.sgi.com [204.94.214.22]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f856q9d14799 for ; Tue, 4 Sep 2001 23:52:09 -0700 Received: from nodin.corp.sgi.com (nodin.corp.sgi.com [192.26.51.193]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id XAA09709 for ; Tue, 4 Sep 2001 23:50:33 -0700 (PDT) mail_from (kaos@ocs.com.au) Received: from kao2.melbourne.sgi.com (kao2.melbourne.sgi.com [134.14.55.180]) by nodin.corp.sgi.com (8.11.4/8.11.2/nodin-1.0) with ESMTP id f856p7F39410330; Tue, 4 Sep 2001 23:51:07 -0700 (PDT) Received: by kao2.melbourne.sgi.com (Postfix, from userid 16331) id 81393300095; Wed, 5 Sep 2001 16:50:20 +1000 (EST) Received: from kao2.melbourne.sgi.com (localhost [127.0.0.1]) by kao2.melbourne.sgi.com (Postfix) with ESMTP id E904AA6; Wed, 5 Sep 2001 16:50:20 +1000 (EST) X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4 From: Keith Owens To: r1vamsi@in.ibm.com Cc: lkcd@oss.sgi.com Subject: Re: LKCD + KDB ? (link/init order) In-reply-to: Your message of "Wed, 05 Sep 2001 11:12:59 +0530." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 05 Sep 2001 16:50:14 +1000 Message-ID: <28201.999672614@kao2.melbourne.sgi.com> Sender: owner-lkcd@oss.sgi.com Precedence: bulk On Wed, 5 Sep 2001 11:12:59 +0530, r1vamsi@in.ibm.com wrote: >Lkcd could very well register kdb command and do what ever. However, when >lkcd is linked into the kernel (which is the case most of the time), how >can it be sure that kdb is initialized before lkcd's init (where in it >could call kdb_register()) ? kdb is initialized just after mem_init(), in init/main.c::start_kernel(). If lkcd is called from start_kernel() then call it after kdb. 
If lkcd uses __initcall then it is initialized long after kdb has started.

>Is there any other way to ensure correct ordering of init calls, besides
>linking the objects in the desired sequence in the Makefiles?

Either hand-code the call sequence in start_kernel() or use __initcall and
control the init order using the link order in the makefiles. Those are
the only two choices.

From owner-lkcd@oss.sgi.com Wed Sep 5 06:21:41 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f85DLfp21614 for lkcd-outgoing; Wed, 5 Sep 2001 06:21:41 -0700
Received: from ausmtp02.au.ibm.com (ausmtp02.au.ibm.COM [202.135.136.105]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f85DLZd21611 for ; Wed, 5 Sep 2001 06:21:36 -0700
Received: from f02n16e.au.ibm.com by ausmtp02.au.ibm.com (IBM AP 2.0) with ESMTP id f85DJ0o73274 for ; Wed, 5 Sep 2001 23:19:00 +1000
Received: from d73mta01.au.ibm.com (f06n01s [9.185.166.65]) by f02n16e.au.ibm.com (8.11.1m3/NCO v4.97.1) with SMTP id f85DLN930066 for ; Wed, 5 Sep 2001 23:21:23 +1000
Received: by d73mta01.au.ibm.com(Lotus SMTP MTA v4.6.5 (863.2 5-20-1999)) id CA256ABE.00495DC8 ; Wed, 5 Sep 2001 23:21:21 +1000
X-Lotus-FromDomain: IBMIN@IBMAU
From: bsuparna@in.ibm.com
To: "Matt D. Robinson"
cc: lkcd@oss.sgi.com
Message-ID:
Date: Wed, 5 Sep 2001 18:39:09 +0530
Subject: Latest lkcd code and planned changes
Mime-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Matt,

Responding to two of your notes (about the last lkcd code drop) in one
shot.

>I'm still planning to roll a 4.0 release as soon as I talk to
>the IBM folks about the last code drop I gave them.

We've tried this code out only on a UP system, and have just started trying
it on an SMP system. We had a few initial hiccups with dump configuration
with the original scripts, but using test.c and modifying the device number
as you suggested did the trick.
As you've mentioned below new scripts and a new dumpconfig utility would be required for the 4.0 release. Before moving to SMP, we decided to first merge in our changes to enable system continuation after a dump, by making the other CPUs spin for the duration of the dump and then release them, rather than making them stop. (We are now using dprobes to trigger the dump from a probe point to test our changes.) I'm hoping that we can include this in the 4.0 release together with SMP problem fixes that you are working on. When are you planning on the release ? The next thing that we are trying to implement is to get non-disruptive dumps to work from any context, including interrupt context, based on some of the ideas we'd discussed earlier. We are attempting to get this to work with the current basic dump i/o model for the non-disruptive dumps case. (We may need to relook at it later once the dump driver interface is in place, though only for devices that implement/register such an interface) Will discuss this in more detail after we've tried out a few things ... >For those who are working directly in the tree, you'll note we're >now moving from 'vmdump' to 'dump' conventions, and hopefully all >the future scripts will use this as well. BTW, I did try directly accessing the CVS tree, which works. >Also, I spoke to someone at MCL, and we'll see how we can roll in >mcore into the LKCD project in some capacity. That's good news! We wanted to check with you on this. Do we now have a contact at MCL whom we can work with to do this, so that we have a fallback standalone dump feature ? >The latest code is in the SourceForge tree ... look in >2.4/drivers/block/dump.c, >and you'll see the restructuring changes. 'lcrash' has also >changed a bit.I copied the LKCD group on my last check-in. >If you didn't get a copy of it,let me know. It touched a bunch >of files. 
>I have to check in new scripts and a new dumpconfig utility next
>(and fix this bloody SMP problem now that I actually have an SMP
>system again to test against).

Do let us know how this goes. We had to give some thought to a few of the
SMP issues for the non-disruptive case (not that we're sure if we've got it
right or thought of all subtle race possibilities!), so it would be
interesting to discuss this more (I remember you mentioned fixing the CPU 0
special cases when we talked last).

Regards
Suparna

Suparna Bhattacharya
IBM Software Lab, India
E-mail : bsuparna@in.ibm.com
Phone : 91-80-5267117, Extn : 3961

From owner-lkcd@oss.sgi.com Wed Sep 5 10:38:53 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f85Hcrk27489 for lkcd-outgoing; Wed, 5 Sep 2001 10:38:53 -0700
Received: from exg.allot.com (mail.allot.com [199.203.223.202]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f85Hcld27486 for ; Wed, 5 Sep 2001 10:38:48 -0700
Received: from allot.com (FELIX [172.16.1.37]) by exg.allot.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id SKQVL6G1; Wed, 5 Sep 2001 20:44:58 +0200
Message-ID: <3B966319.56C10FE1@allot.com>
Date: Wed, 05 Sep 2001 20:38:33 +0300
From: Felix Radensky
Organization: Allot Communications Ltd.
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.2.19c i686)
X-Accept-Language: en
MIME-Version: 1.0
To: "Matt D. Robinson"
CC: lkcd@oss.sgi.com
Subject: Re: Using latest CVS sources
References: <3B938EAD.9C8D4E92@allot.com> <3B956457.DDBBE9CF@alacritech.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

Hi, Matt

I'm mostly looking for more reliable dumps. I used version 3.1.3 with
kernel 2.2.18 and noticed that dumps are not always created. E.g. if a crash
occurred in net_bh, the system just hung and no dump was created. On the
other hand, a crash that occurred at the module loading stage was dumped
successfully.
I was hoping that latest CVS code will allow more reliable dump creation in all contexts. Thanks. Felix. "Matt D. Robinson" wrote: > Felix Radensky wrote: > > > > Hi, > > > > Can someone please explain how can I use the latest CVS sources > > with kernel 2.2.19. > > > > Thanks in advance. > > > > Felix. > > Hi, Felix. The latest 2.2 tree is a bit behind what we're currently > doing, and I haven't tried applying some of this stuff to 2.2 as of > yet. The last state I left the 2.2 tree in was to at least allow you > to dump to IDE disks as well, and has the Kerntypes mechanism in place. > > Is there some feature you're looking for in particular? > > --Matt From owner-lkcd@oss.sgi.com Wed Sep 5 12:32:40 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f85JWew30014 for lkcd-outgoing; Wed, 5 Sep 2001 12:32:40 -0700 Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f85JW5d30007 for ; Wed, 5 Sep 2001 12:32:06 -0700 Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f85JQvO28172; Wed, 5 Sep 2001 12:26:57 -0700 Message-ID: <3B967E47.32BAE3D9@alacritech.com> Date: Wed, 05 Sep 2001 12:34:31 -0700 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686) X-Accept-Language: en MIME-Version: 1.0 To: bsuparna@in.ibm.com CC: lkcd@oss.sgi.com Subject: Re: Latest lkcd code and planned changes References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk bsuparna@in.ibm.com wrote: > > Matt, > > Responding to two of your notes (about the last lkcd code drop) in one > shot. > > >I'm still planning to roll a 4.0 release as soon as I talk to > >the IBM folks about the last code drop I gave them. > > We've tried this code out only a UP, and have just started trying it on a > SMP system. 
We had a few initial hiccups with dump configuration with the > original scripts, but using test.c and modifying the device number as you > suggested did the trick. As you've mentioned below new scripts and a new > dumpconfig utility would be required for the 4.0 release. The dump configuration utility is checked in. It's in lkcdutils/lkcd_config. All the appropriate scripts/spec files have changed to use it. There's even a manual page, if you can believe that. > Before moving to SMP, we decided to first merge in our changes to enable > system continuation after a dump, by making the other CPUs spin for the > duration of the dump and then release them, rather than making them stop. > (We are now using dprobes to trigger the dump from a probe point to test > our changes.) How's this working? I'd like to get this into the tree if at all possible so we can get rid of the current "stop" method and get rid of the SMP bugs. > I'm hoping that we can include this in the 4.0 release together with SMP > problem fixes that you are working on. When are you planning on the release > ? I'm ready to release it now, believe it or not. I don't have to release the dump_gzip.c code just yet, as I'm still improving it, but at least everything will work with the new methodology. > The next thing that we are trying to implement is to get non-disruptive > dumps to work from any context, including interrupt context, based on some > of the ideas we'd discussed earlier. We are attempting to get this to work > with the current basic dump i/o model for the non-disruptive dumps case. > (We may need to relook at it later once the dump driver interface is in > place, though only for devices that implement/register such an interface) > Will discuss this in more detail after we've tried out a few things ... Okay. Hey, I was thinking. Right now, we open up /dev/dump (227,0) to do our ioctl()s against. 
If we make that our major number by default, we could have multiple dump instantiations in the kernel by working against the minor number. Would that work for you, David, or how were you planning to do this? > >For those who are working directly in the tree, you'll note we're > >now moving from 'vmdump' to 'dump' conventions, and hopefully all > >the future scripts will use this as well. > > BTW, I did try directly accessing the CVS tree, which works. Great. > >Also, I spoke to someone at MCL, and we'll see how we can roll in > >mcore into the LKCD project in some capacity. > > That's good news! We wanted to check with you on this. Do we now have a > contact at MCL whom we can work with to do this, so that we have a fallback > standalone dump feature ? I believe so. I've just started communicating with Mike Keefe. He's sent me a patch (among other things), and I'm in the process of review and seeing how we can integrate it, and then mcore. > >The latest code is in the SourceForge tree ... look in > >2.4/drivers/block/dump.c, > >and you'll see the restructuring changes. 'lcrash' has also > >changed a bit.I copied the LKCD group on my last check-in. > >If you didn't get a copy of it,let me know. It touched a bunch > >of files. > >I have to check in new scripts and a new dumpconfig utility next > >(and fix this bloody SMP problem now that I actually have an SMP > >system again to test against). > > Do let us know how this goes. We had to give some thought to a few of the > SMP issues for the non-disruptive case (not that we're sure if we've got it > right or thought of all subtle race possibilities ! ), so it would be > interesting to discuss this more (I remember you mentioned fixing the CPU 0 > special cases when we talked last). 
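
The spin-and-release approach quoted earlier in this message (other CPUs spin for the duration of the dump and are then released, rather than stopped) can be sketched in portable C11. This is only an illustration of the idea, not the IBM or LKCD code: every name below (dump_begin, dump_end, park_cpu, parked_count) is hypothetical, and the max_spins bound exists solely so the example terminates when run on a single thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Zero-initialised statics: no dump running, no CPU parked. */
static atomic_bool dump_in_progress;
static atomic_int cpus_parked;

/* The dumping CPU raises the flag before it starts writing the dump. */
void dump_begin(void)
{
    atomic_store(&dump_in_progress, true);
}

/* Lowering the flag releases every CPU that is spinning in park_cpu(). */
void dump_end(void)
{
    atomic_store(&dump_in_progress, false);
}

/* Number of CPUs currently held in the spin loop. */
int parked_count(void)
{
    return atomic_load(&cpus_parked);
}

/*
 * Called on each non-dumping CPU. It spins while a dump is in progress
 * and then simply returns, so the caller resumes normal execution.
 * Returns the number of loop iterations spent waiting.
 */
long park_cpu(long max_spins)
{
    long spins = 0;

    atomic_fetch_add(&cpus_parked, 1);
    while (atomic_load(&dump_in_progress) && spins < max_spins)
        spins++;
    atomic_fetch_sub(&cpus_parked, 1);

    return spins;
}
```

A CPU that calls park_cpu() when no dump is running falls straight through; one that calls it between dump_begin() and dump_end() stays in the loop. The race possibilities mentioned in the quoted text (for example, a CPU that never reaches the loop) are not addressed by this sketch.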
I've checked in almost everything you can imagine now:

- lkcd_config
- new /sbin/lkcd (instead of /sbin/vmdump)
- modifications to rc.sysinit scripts
- manual page modifications for lcrash/lkcd_config
- updated spec file to build new lkcdutils-4.0
- all 2.4 code is checked in, all header mods done

The _only_ things left to fix on my plate are:

    SMP issue (not always dumping)
    gzip dump compression changes (kernel/lcrash)

After those two are done, then we talk about multiple dump devices,
multiple dump methods, integrating all your non-disruptive dumping
code, new kdb/kgdb/dprobes hooks, and adding in the dump() functionality
to the block_device_operations structure, and then finishing up an
IDE dump function. Should be fun!

> Regards
> Suparna

Thanks, Suparna. :)

BTW, I'll be out on #lkcd late tonight to discuss some of this. For
those that are curious, we're currently connecting to irc.kernel.org/#lkcd
with IRC to talk about this stuff pretty late in the evening.

--Matt

From owner-lkcd@oss.sgi.com Wed Sep 5 15:44:18 2001
Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f85MiIn01349 for lkcd-outgoing; Wed, 5 Sep 2001 15:44:18 -0700
Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f85Mi2d01340 for ; Wed, 5 Sep 2001 15:44:02 -0700
Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f85Md8O01891; Wed, 5 Sep 2001 15:39:08 -0700
Message-ID: <3B96AB52.7D527466@alacritech.com>
Date: Wed, 05 Sep 2001 15:46:42 -0700
From: "Matt D. Robinson"
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: r1vamsi@in.ibm.com
CC: "Schaal, Richard" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au
Subject: Re: LKCD + KDB ?
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk

As far as 'kdb' and 'lkcd' are concerned (excluding 'kgdb' for the
moment), anyone hankering to work on this? Otherwise, it goes on
the list of things-to-do. I've got a few things on my plate at
the moment so I can't go off and do this right now. Later, yes,
but if someone wants this in 4.0, please speak up now so it's on
the list of included items. :)

--Matt

r1vamsi@in.ibm.com wrote:
>
> Richard,
>
> I agree with you completely on the rationale for wanting to dump from kdb.
> In fact, one could choose to trigger a dump (after which the system will
> likely continue to run), not just from KDB, which requires manual
> intervention, but from other debugging tools such as the IBM Dynamic
> Probes, where this could be done automatically.
>
> We are building "non-disruptive" dumps capability into lkcd, which will let
> the system continue normal execution after the dump is taken.
>
> These features will probably find more use when dumps are used for
> debugging other problem situations like performance related problems besides
> oops/panics.
>
> Regards.. Vamsi.
>
> Vamsi Krishna S.
> Linux Technology Center,
> IBM Software Lab, Bangalore.
> Ph: +91 80 5262355 Extn: 3959
> Internet: r1vamsi@in.ibm.com
>
> "Matt D. Robinson" on 09/05/2001 05:07:14 AM
>
> Please respond to "Matt D. Robinson"
>
> To: "Schaal, Richard"
> cc: S Vamsikrishna/India/IBM@IBMIN, lkcd@oss.sgi.com,
> akale@users.sourceforge.net, kaos@ocs.com.au
> Subject: Re: LKCD + KDB ?
>
> "Schaal, Richard" wrote:
> >
> > I think it would be relatively simple to have the dump_init code register
> a
> > dump system
> > function with the kernel debugger so that you could dump the system on
> > demand. Note that
> > not all problems are Oops related, and that a hung system, or one that is
> > grossly under performing
> > would be useful to get a snapshot of the activity at that time.
Manual > > entry to the debugger > > and manual dump would seem to be a useful thing. - System survivability > > after such a dump would be > > nice, but not a show stopper at this point. > > You should already be able to do this with dump_function_ptr in the > latest code. This should be assigned to dump_execute (at least in > the last check-in I made). So if you call that address, you'll get > the dump function pointer. > > > So far as the dumping or not after an oops and entering kdb, there is a > > differentiation as to the reason > > for entering the debugger - you might derive a dump/no dump directive > from > > whether you enter the debugger > > by reason of breakpoint or oops? > > I'm curious, how many people drop into kdb, and then want to take a dump? > I'd think that this is very useful for developers, but not as useful for > customers who want to crash and reboot. > > > I used to work for Stratus Computer - at that time, a panic or oops would > > put us into the debugger, and if we > > were successful in patching up the problem, the system could resume > > execution. In Linux, after an oops, maybe > > a "nodump" command would be useful as well to disable the dumping that > might > > normally occur. > > This is fine -- I think these are all reasonable extensions to KDB, and > I can work with that developer if need be to make that happen. There's > an easy solution, one way or another. > > --Matt > > > Regards, > > Richard > > > > -----Original Message----- > > From: r1vamsi@in.ibm.com [mailto:r1vamsi@in.ibm.com] > > Sent: Monday, September 03, 2001 2:55 AM > > To: Matt D. Robinson > > Cc: richard.schaal@intel.com; lkcd@oss.sgi.com > > Subject: Re: LKCD + KDB ? > > > > When both KDB and LKCD patches are applied, we drop into KDB on an oops. > > dump_execute will be called after we exit the debugger. 
> > > > If all you want is to disable dump taking after exiting debugger, that is > > easy enough with editing the dump_okay flag from within the debugger (or > > add a kdb command to do this) as Matt points out. Assuming there is a > good > > reason for wanting to take the dump from within the debugger, one should > > add a simple dump command to kdb, which will just call dump_execute with > > proper regs. What you could do today is to set eip to dump_execute from > > with in the kernel, editing the stack to push correct params :-) (not as > > hard as it sounds, really) > > > > However, the cleaner approach obviously is to add the kdb dump command, > > once we understand a little better why exactly would one want to dump > from > > within the debugger (on an oops). > > > > Regards.. Vamsi. > > > > Vamsi Krishna S. > > Linux Technology Center, > > IBM Software Lab, Bangalore. > > Ph: +91 80 5262355 Extn: 3959 > > Internet: r1vamsi@in.ibm.com > > > > Please respond to "Matt D. Robinson" > > > > To: richard.schaal@intel.com > > cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) > > Subject: Re: LKCD + KDB ? > > > > Richard Schaal wrote: > > > > > > > > > My question is this - I have been a fan of the kernel debugger for some > > > time, and have had a bit of difficulty > > > resolving how to configure both capabilities into my kernel. I guess > > > what I'd like to have happen is to > > > have the system enter the debugger on an oops, then have the option of > > > dumping the system from the debugger, or > > > to dump the system automatically after the debugger is exited. > > > > There's no great way to do this right now. If in kdb you can set the > > field of 'dump_okay' field to FALSE, then reset it after dropping back > > from the debugger state, that'd be fine. 
I guess we could also add in > > something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, > > and when dump_execute() gets called, dump_kdb is checked, and if set > > to TRUE, resets it to FALSE. Then add a kdb command that sets the > > field for you ... > > > > Would that work? > > > > --Matt > > > > > What is your thinking on this? Did I goof something up in applying the > > > patches for the two features? > > > > > > Thanks, > > > Richard > > > > > > -- > > > Richard.Schaal@intel.com Intel Corporation > > > Ph: (408)765-1579 Richard Schaal > > > Mail Stop SC12-308 > > > 3600 Juliette Lane > > > "I can type faster than I think!" Santa Clara, CA 95052 From owner-lkcd@oss.sgi.com Wed Sep 5 15:49:54 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f85Mnsk01455 for lkcd-outgoing; Wed, 5 Sep 2001 15:49:54 -0700 Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f85Mnhd01451 for ; Wed, 5 Sep 2001 15:49:43 -0700 Received: from alacritech.com (lambda.alacritech.com [10.1.1.32]) by smtp.alacritech.com (8.11.0/8.11.0) with ESMTP id f85MilO02101; Wed, 5 Sep 2001 15:44:47 -0700 Message-ID: <3B96ACA5.9399AB57@alacritech.com> Date: Wed, 05 Sep 2001 15:52:21 -0700 From: "Matt D. Robinson" Organization: Alacritech, Inc. X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2 i686) X-Accept-Language: en MIME-Version: 1.0 To: "Amit S. Kale" CC: "Schaal, Richard" , "'r1vamsi@in.ibm.com'" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au Subject: Re: LKCD + KDB ? References: <68843F808BE5D311AC6100A0C9C5786648485D@fmsmsx50.fm.intel.com> <3B956DC7.F5F3559F@alacritech.com> <3B959ED0.E33BA3BC@vsnl.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk "Amit S. Kale" wrote: > > Hi Matt, > > I have faced several times the problem of crash dumps not being > available > in kgdb. 
Many a times I don't have time to debug a panic immediately, so
> I keep the machine inside the debugger. A crash dump will enable me to
> save
> a crash dump and continue testing. I can get back to the core dump
> later.
>
> Usually it's a good idea to save cores for all non-trivial problems once
> a product goes alpha. If a problem which is supposedly fixed resurfaces,
> it's very difficult to say whether it's the same problem in absence of a
> core
> dump.
>
> In ideal world, all problems should be fixed immediately and completely
> using a debugger and we wouldn't need crash dumps.
>
> I guess it's time to think about making kgdb understand lkcd interface.

This should be a pretty straightforward thing to add. If you don't
want to do it, let me know, so I can put it on my list. I think it's
a high enough priority to get done for the next release, if humanly
possible.

Your latest stuff is in the source tarballs on kgdb.sourceforge.net?

--Matt

> "Matt D. Robinson" wrote:
> >
> > "Schaal, Richard" wrote:
> > >
> > > Hi Matt,
> > > When you refer to the "latest code", what is that? I don't see anything on
> > > source forge as released code, and the
> > > latest from the SGI site has patches up to linux-2.4.4 is that what you were
> > > referring to?
> > >
> > > Thanks,
> > > Richard
> >
> > The latest code is in the SourceForge tree ... look in
> > 2.4/drivers/block/dump.c,
> > and you'll see the restructuring changes. 'lcrash' has also changed a bit.
> > I copied the LKCD group on my last check-in. If you didn't get a copy of it,
> > let me know. It touched a bunch of files.
> >
> > I have to check in new scripts and a new dumpconfig utility next (and fix
> > this bloody SMP problem now that I actually have an SMP system again to test
> > against).
> >
> > --Matt
> >
> > >
> > > -----Original Message-----
> > > From: Matt D.
Robinson [mailto:yakker@alacritech.com] > > > Sent: Tuesday, September 04, 2001 4:37 PM > > > To: Schaal, Richard > > > Cc: 'r1vamsi@in.ibm.com'; lkcd@oss.sgi.com; akale@users.sourceforge.net; > > > kaos@ocs.com.au > > > Subject: Re: LKCD + KDB ? > > > > > > "Schaal, Richard" wrote: > > > > > > > > I think it would be relatively simple to have the dump_init code register > > > a > > > > dump system > > > > function with the kernel debugger so that you could dump the system on > > > > demand. Note that > > > > not all problems are Oops related, and that a hung system, or one that is > > > > grossly under performing > > > > would be useful to get a snapshot of the activity at that time. Manual > > > > entry to the debugger > > > > and manual dump would seem to be a useful thing. - System survivability > > > > after such a dump would be > > > > nice, but not a show stopper at this point. > > > > > > You should already be able to do this with dump_function_ptr in the > > > latest code. This should be assigned to dump_execute (at least in > > > the last check-in I made). So if you call that address, you'll get > > > the dump function pointer. > > > > > > > So far as the dumping or not after an oops and entering kdb, there is a > > > > differentiation as to the reason > > > > for entering the debugger - you might derive a dump/no dump directive from > > > > whether you enter the debugger > > > > by reason of breakpoint or oops? > > > > > > I'm curious, how many people drop into kdb, and then want to take a dump? > > > I'd think that this is very useful for developers, but not as useful for > > > customers who want to crash and reboot. > > > > > > > I used to work for Stratus Computer - at that time, a panic or oops would > > > > put us into the debugger, and if we > > > > were successful in patching up the problem, the system could resume > > > > execution. 
In Linux, after an oops, maybe > > > > a "nodump" command would be useful as well to disable the dumping that > > > might > > > > normally occur. > > > > > > This is fine -- I think these are all reasonable extensions to KDB, and > > > I can work with that developer if need be to make that happen. There's > > > an easy solution, one way or another. > > > > > > --Matt > > > > > > > Regards, > > > > Richard > > > > > > > > -----Original Message----- > > > > From: r1vamsi@in.ibm.com [mailto:r1vamsi@in.ibm.com] > > > > Sent: Monday, September 03, 2001 2:55 AM > > > > To: Matt D. Robinson > > > > Cc: richard.schaal@intel.com; lkcd@oss.sgi.com > > > > Subject: Re: LKCD + KDB ? > > > > > > > > When both KDB and LKCD patches are applied, we drop into KDB on an oops. > > > > dump_execute will be called after we exit the debugger. > > > > > > > > If all you want is to disable dump taking after exiting debugger, that is > > > > easy enough with editing the dump_okay flag from within the debugger (or > > > > add a kdb command to do this) as Matt points out. Assuming there is a good > > > > reason for wanting to take the dump from within the debugger, one should > > > > add a simple dump command to kdb, which will just call dump_execute with > > > > proper regs. What you could do today is to set eip to dump_execute from > > > > with in the kernel, editing the stack to push correct params :-) (not as > > > > hard as it sounds, really) > > > > > > > > However, the cleaner approach obviously is to add the kdb dump command, > > > > once we understand a little better why exactly would one want to dump from > > > > within the debugger (on an oops). > > > > > > > > Regards.. Vamsi. > > > > > > > > Vamsi Krishna S. > > > > Linux Technology Center, > > > > IBM Software Lab, Bangalore. > > > > Ph: +91 80 5262355 Extn: 3959 > > > > Internet: r1vamsi@in.ibm.com > > > > > > > > Please respond to "Matt D. 
Robinson" > > > > > > > > To: richard.schaal@intel.com > > > > cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) > > > > Subject: Re: LKCD + KDB ? > > > > > > > > Richard Schaal wrote: > > > > > > > > > > > > > > > My question is this - I have been a fan of the kernel debugger for some > > > > > time, and have had a bit of difficulty > > > > > resolving how to configure both capabilities into my kernel. I guess > > > > > what I'd like to have happen is to > > > > > have the system enter the debugger on an oops, then have the option of > > > > > dumping the system from the debugger, or > > > > > to dump the system automatically after the debugger is exited. > > > > > > > > There's no great way to do this right now. If in kdb you can set the > > > > field of 'dump_okay' field to FALSE, then reset it after dropping back > > > > from the debugger state, that'd be fine. I guess we could also add in > > > > something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, > > > > and when dump_execute() gets called, dump_kdb is checked, and if set > > > > to TRUE, resets it to FALSE. Then add a kdb command that sets the > > > > field for you ... > > > > > > > > Would that work? > > > > > > > > --Matt > > > > > > > > > What is your thinking on this? Did I goof something up in applying the > > > > > patches for the two features? > > > > > > > > > > Thanks, > > > > > Richard > > > > > > > > > > -- > > > > > Richard.Schaal@intel.com Intel Corporation > > > > > Ph: (408)765-1579 Richard Schaal > > > > > Mail Stop SC12-308 > > > > > 3600 Juliette Lane > > > > > "I can type faster than I think!" Santa Clara, CA 95052 > > -- > Amit S. Kale > Linux Consultant, Pune, India. 
(kgdb@vsnl.net) > Linux kernel source level debugger http://kgdb.sourceforge.net/ > Translation filesystem http://trfs.sourceforge.net/ From owner-lkcd@oss.sgi.com Wed Sep 5 22:30:25 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f865UP708194 for lkcd-outgoing; Wed, 5 Sep 2001 22:30:25 -0700 Received: from ausmtp01.au.ibm.com (ausmtp01.au.ibm.COM [202.135.136.97]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f865U2d08188 for ; Wed, 5 Sep 2001 22:30:03 -0700 Received: from f02n15e.au.ibm.com by ausmtp01.au.ibm.com (IBM AP 2.0) with ESMTP id f865KMT350160; Thu, 6 Sep 2001 15:20:22 +1000 Received: from d73mta01.au.ibm.com (f06n01s [9.185.166.65]) by f02n15e.au.ibm.com (8.11.1m3/NCO v4.97.1) with SMTP id f865NlI125110; Thu, 6 Sep 2001 15:23:47 +1000 Received: by d73mta01.au.ibm.com(Lotus SMTP MTA v4.6.5 (863.2 5-20-1999)) id CA256ABF.001DA6A3 ; Thu, 6 Sep 2001 15:23:52 +1000 X-Lotus-FromDomain: IBMIN@IBMAU From: vamsi_krishna@in.ibm.com To: "Matt D. Robinson" cc: "Schaal, Richard" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au, hanrahat@us.ibm.com, richardj_moore@uk.ibm.com Message-ID: Date: Thu, 6 Sep 2001 11:04:32 +0530 Subject: Re: LKCD + KDB ? Mime-Version: 1.0 Content-type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-lkcd@oss.sgi.com Precedence: bulk I will have a go at it. What is the time frame for 4.0? Regards.. Vamsi. Vamsi Krishna S. Linux Technology Center, IBM Software Lab, Bangalore. Ph: +91 80 5262355 Extn: 3959 Internet: r1vamsi@in.ibm.com "Matt D. Robinson" on 09/06/2001 04:16:42 AM Please respond to "Matt D. Robinson" To: S Vamsikrishna/India/IBM@IBMIN cc: "Schaal, Richard" , lkcd@oss.sgi.com, akale@users.sourceforge.net, kaos@ocs.com.au Subject: Re: LKCD + KDB ? As far as 'kdb' and 'lkcd' is concerned (excluding 'kgdb' for the moment), anyone hankering to work on this? Otherwise, it goes on the list of things-to-do. 
I've got a few things on my plate at the moment so I can't go off and do this right now. Later, yes, but if someone wants this in 4.0, please speak up now so it's on the list of included items. :) --Matt r1vamsi@in.ibm.com wrote: > > Richard, > > I agree with you completely on the rationale for wanting to dump from kdb. > In fact, one could choose to trigger a dump (after which the system will > likely continue to run), not just from KDB, which requires manual > intervention, but from other debugging tools such as the IBM Dynamic > Probes, where this could be done automatically. > > We are building "non-disruptive" dumps capability into lkcd, which will let > the system continue normal execution after the dump is taken. > > These features will probably find more use when dumps are used for > debugging other problem situations like performance related problems besides > oops/panics. > > Regards.. Vamsi. > > Vamsi Krishna S. > Linux Technology Center, > IBM Software Lab, Bangalore. > Ph: +91 80 5262355 Extn: 3959 > Internet: r1vamsi@in.ibm.com > > "Matt D. Robinson" on 09/05/2001 05:07:14 AM > > Please respond to "Matt D. Robinson" > > To: "Schaal, Richard" > cc: S Vamsikrishna/India/IBM@IBMIN, lkcd@oss.sgi.com, > akale@users.sourceforge.net, kaos@ocs.com.au > Subject: Re: LKCD + KDB ? > > "Schaal, Richard" wrote: > > > > I think it would be relatively simple to have the dump_init code register > a > > dump system > > function with the kernel debugger so that you could dump the system on > > demand. Note that > > not all problems are Oops related, and that a hung system, or one that is > > grossly under performing > > would be useful to get a snapshot of the activity at that time. Manual > > entry to the debugger > > and manual dump would seem to be a useful thing. - System survivability > > after such a dump would be > > nice, but not a show stopper at this point. > > You should already be able to do this with dump_function_ptr in the > latest code. 
This should be assigned to dump_execute (at least in > the last check-in I made). So if you call that address, you'll get > the dump function pointer. > > > So far as the dumping or not after an oops and entering kdb, there is a > > differentiation as to the reason > > for entering the debugger - you might derive a dump/no dump directive > from > > whether you enter the debugger > > by reason of breakpoint or oops? > > I'm curious, how many people drop into kdb, and then want to take a dump? > I'd think that this is very useful for developers, but not as useful for > customers who want to crash and reboot. > > > I used to work for Stratus Computer - at that time, a panic or oops would > > put us into the debugger, and if we > > were successful in patching up the problem, the system could resume > > execution. In Linux, after an oops, maybe > > a "nodump" command would be useful as well to disable the dumping that > might > > normally occur. > > This is fine -- I think these are all reasonable extensions to KDB, and > I can work with that developer if need be to make that happen. There's > an easy solution, one way or another. > > --Matt > > > Regards, > > Richard > > > > -----Original Message----- > > From: r1vamsi@in.ibm.com [mailto:r1vamsi@in.ibm.com] > > Sent: Monday, September 03, 2001 2:55 AM > > To: Matt D. Robinson > > Cc: richard.schaal@intel.com; lkcd@oss.sgi.com > > Subject: Re: LKCD + KDB ? > > > > When both KDB and LKCD patches are applied, we drop into KDB on an oops. > > dump_execute will be called after we exit the debugger. > > > > If all you want is to disable dump taking after exiting debugger, that is > > easy enough with editing the dump_okay flag from within the debugger (or > > add a kdb command to do this) as Matt points out. Assuming there is a > good > > reason for wanting to take the dump from within the debugger, one should > > add a simple dump command to kdb, which will just call dump_execute with > > proper regs. 
What you could do today is to set eip to dump_execute from > > with in the kernel, editing the stack to push correct params :-) (not as > > hard as it sounds, really) > > > > However, the cleaner approach obviously is to add the kdb dump command, > > once we understand a little better why exactly would one want to dump > from > > within the debugger (on an oops). > > > > Regards.. Vamsi. > > > > Vamsi Krishna S. > > Linux Technology Center, > > IBM Software Lab, Bangalore. > > Ph: +91 80 5262355 Extn: 3959 > > Internet: r1vamsi@in.ibm.com > > > > Please respond to "Matt D. Robinson" > > > > To: richard.schaal@intel.com > > cc: lkcd@oss.sgi.com (bcc: S Vamsikrishna/India/IBM) > > Subject: Re: LKCD + KDB ? > > > > Richard Schaal wrote: > > > > > > > > > My question is this - I have been a fan of the kernel debugger for some > > > time, and have had a bit of difficulty > > > resolving how to configure both capabilities into my kernel. I guess > > > what I'd like to have happen is to > > > have the system enter the debugger on an oops, then have the option of > > > dumping the system from the debugger, or > > > to dump the system automatically after the debugger is exited. > > > > There's no great way to do this right now. If in kdb you can set the > > field of 'dump_okay' field to FALSE, then reset it after dropping back > > from the debugger state, that'd be fine. I guess we could also add in > > something for kdb, a one-time thing, so kdb can set dump_kdb to TRUE, > > and when dump_execute() gets called, dump_kdb is checked, and if set > > to TRUE, resets it to FALSE. Then add a kdb command that sets the > > field for you ... > > > > Would that work? > > > > --Matt > > > > > What is your thinking on this? Did I goof something up in applying the > > > patches for the two features? 
> > > > > > Thanks, > > > Richard > > > > > > -- > > > Richard.Schaal@intel.com Intel Corporation > > > Ph: (408)765-1579 Richard Schaal > > > Mail Stop SC12-308 > > > 3600 Juliette Lane > > > "I can type faster than I think!" Santa Clara, CA 95052 From owner-lkcd@oss.sgi.com Wed Sep 5 23:10:38 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f866AcB08671 for lkcd-outgoing; Wed, 5 Sep 2001 23:10:38 -0700 Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f866Aad08668 for ; Wed, 5 Sep 2001 23:10:36 -0700 Received: from localhost (yakker@localhost) by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id f866Bc716570; Wed, 5 Sep 2001 23:11:38 -0700 Date: Wed, 5 Sep 2001 23:11:38 -0700 (PDT) From: "Matt D. Robinson" To: cc: "Matt D. Robinson" , "Schaal, Richard" , , , , , Subject: Re: LKCD + KDB ? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk On Thu, 6 Sep 2001 vamsi_krishna@in.ibm.com wrote: |>I will have a go at it. What is the time frame for 4.0? |> |>Regards.. Vamsi. First off, thanks, Vamsi ... if you can get it done, great. It's as soon as I've got something from Suparna for LKCD, and 'lcrash' can go right now as-is. I'd like to get non-disruptive dumping in there, and if at all possible, the MCL changes will go in along with the gzip compression code. I'm looking at about a week. I'd like to not stretch it out too much further if at all possible. Let's say 9/14. 
--Matt From owner-lkcd@oss.sgi.com Thu Sep 6 00:09:20 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f8679Kr09917 for lkcd-outgoing; Thu, 6 Sep 2001 00:09:20 -0700 Received: from fgwmail6.fujitsu.co.jp (fgwmail6.fujitsu.co.jp [192.51.44.36]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f8679Gd09914 for ; Thu, 6 Sep 2001 00:09:17 -0700 Received: from m3.gw.fujitsu.co.jp by fgwmail6.fujitsu.co.jp (8.9.3/3.7W-MX0108-Fujitsu Gateway) id QAA20789; Thu, 6 Sep 2001 16:08:59 +0900 (JST) (envelope-from naomi@pst.fujitsu.com) From: naomi@pst.fujitsu.com Received: from naomi.aoi.pst.fujitsu.com by m3.gw.fujitsu.co.jp (8.9.3/3.7W-0108-Fujitsu Domain Master) id QAA16055; Thu, 6 Sep 2001 16:08:55 +0900 (JST) (envelope-from naomi@pst.fujitsu.com) Received: from localhost (IDENT:naomi@localhost [127.0.0.1]) by naomi.aoi.pst.fujitsu.com (8.9.3/8.9.3) with ESMTP id QAA18666; Thu, 6 Sep 2001 16:08:31 +0900 To: yakker@alacritech.com Cc: lkcd@oss.sgi.com Subject: Re: lcrash sub-commands line completion In-Reply-To: Your message of "Tue, 04 Sep 2001 01:14:07 -0700" <3B948D4F.9D7257B1@alacritech.com> References: <3B948D4F.9D7257B1@alacritech.com> X-Mailer: Mew version 1.92.4 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20010906160831D.naomi@pst.fujitsu.com> Date: Thu, 06 Sep 2001 16:08:31 +0900 X-Dispatcher: imput version 980905(IM100) Lines: 31 Sender: owner-lkcd@oss.sgi.com Precedence: bulk Hi, Matt-san. Since I have not finished testing this yet, I think that it is difficult to roll it into 4.0. I'd appreciate it if you would roll it into the next release (4.1 or more?). Naomi.Haseo From: "Matt D. Robinson" Subject: Re: lcrash sub-commands line completion Date: Tue, 04 Sep 2001 01:14:07 -0700 > This sounds like a great thing to add. I have no problems with it. 
> Note that we used to have a readline capability, but we removed it > due to some of the GPL/LGPL licensing conflicts. > > Please let me know if you complete this in the future. I'm still > planning to roll a 4.0 release as soon as I talk to the IBM folks > about the last code drop I gave them. > > For those who are working directly in the tree, you'll note we're > now moving from 'vmdump' to 'dump' conventions, and hopefully all > the future scripts will use this as well. > > Also, I spoke to someone at MCL, and we'll see how we can roll in > mcore into the LKCD project in some capacity. > > Have at it, Naomi-san. :) > > --Matt From owner-lkcd@oss.sgi.com Thu Sep 6 00:23:55 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f867Ntq10276 for lkcd-outgoing; Thu, 6 Sep 2001 00:23:55 -0700 Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f867Npd10273 for ; Thu, 6 Sep 2001 00:23:51 -0700 Received: from alacritech.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id f867SQJ16651; Thu, 6 Sep 2001 00:28:26 -0700 Message-ID: <3B9723FA.C8F4ADB0@alacritech.com> Date: Thu, 06 Sep 2001 00:21:30 -0700 From: "Matt D. Robinson" X-Mailer: Mozilla 4.75 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: naomi@pst.fujitsu.com CC: lkcd@oss.sgi.com Subject: Re: lcrash sub-commands line completion References: <3B948D4F.9D7257B1@alacritech.com> <20010906160831D.naomi@pst.fujitsu.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk 4.1 (or 4.0.1) is fine -- I don't suspect it'll be that long between releases, as there's a number of people working on LKCD right now, and everyone wants to get their stuff rolled in and turned over more quickly (which I agree with). --Matt naomi@pst.fujitsu.com wrote: > > Hi, Matt-san. 
> > Since I have not finished testing this yet, I think that it is difficult > to roll it into 4.0. > I'd appreciate it if you would roll it into the next release (4.1 or more?). > > Naomi.Haseo > > From: "Matt D. Robinson" > Subject: Re: lcrash sub-commands line completion > Date: Tue, 04 Sep 2001 01:14:07 -0700 > > > This sounds like a great thing to add. I have no problems with it. > > Note that we used to have a readline capability, but we removed it > > due to some of the GPL/LGPL licensing conflicts. > > > > Please let me know if you complete this in the future. I'm still > > planning to roll a 4.0 release as soon as I talk to the IBM folks > > about the last code drop I gave them. > > > > For those who are working directly in the tree, you'll note we're > > now moving from 'vmdump' to 'dump' conventions, and hopefully all > > the future scripts will use this as well. > > > > Also, I spoke to someone at MCL, and we'll see how we can roll in > > mcore into the LKCD project in some capacity. > > > > Have at it, Naomi-san. :) > > > > --Matt From owner-lkcd@oss.sgi.com Thu Sep 6 00:25:33 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f867PXo10331 for lkcd-outgoing; Thu, 6 Sep 2001 00:25:33 -0700 Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f867PWd10328 for ; Thu, 6 Sep 2001 00:25:32 -0700 Received: from alacritech.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id f867UCJ16661 for ; Thu, 6 Sep 2001 00:30:12 -0700 Message-ID: <3B972464.609F31D8@alacritech.com> Date: Thu, 06 Sep 2001 00:23:16 -0700 From: "Matt D. Robinson" X-Mailer: Mozilla 4.75 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: lkcd@oss.sgi.com Subject: Query regarding LKCD kernel patch and RPM/tar.gz ... 
Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk I'd like to start releasing the LKCD kernel patch as part of the base RPM and/or include it alongside the release. The whole point behind moving to number synchronization between the kernel patch and the RPM/tar.gz is to make sure things are in line. If the kernel patch is released in the lkcdutils RPM/tar.gz, this becomes much easier. Is this a problem for anyone, especially those rolling their own distributions? If it is, let me know, as I'm flexible. --Matt From owner-lkcd@oss.sgi.com Thu Sep 6 02:26:36 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f869Qa912382 for lkcd-outgoing; Thu, 6 Sep 2001 02:26:36 -0700 Received: from fgwmail7.fujitsu.co.jp (fgwmail7.fujitsu.co.jp [192.51.44.37]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f869QJd12379 for ; Thu, 6 Sep 2001 02:26:19 -0700 Received: from m4.gw.fujitsu.co.jp by fgwmail7.fujitsu.co.jp (8.9.3/3.7W-MX0108-Fujitsu Gateway) id SAA15521; Thu, 6 Sep 2001 18:26:07 +0900 (JST) (envelope-from m-kotani@pst.fujitsu.com) Received: from classic.aoi.pst.fujitsu.com by m4.gw.fujitsu.co.jp (8.9.3/3.7W-0108-Fujitsu Domain Master) id SAA07839; Thu, 6 Sep 2001 18:25:58 +0900 (JST) (envelope-from m-kotani@pst.fujitsu.com) Received: from doll (doll.aoi.pst.fujitsu.com [172.23.72.214]) by classic.aoi.pst.fujitsu.com (8.9.3/8.9.3) with SMTP id SAA05841; Thu, 6 Sep 2001 18:25:48 +0900 Message-ID: <008401c136b6$05e00600$d64817ac@aoi.pst.fujitsu.com> From: "Masashige Kotani" To: "Matt D. 
Robinson" Cc: , "Howell, David P" References: <10C8636AE359D4119118009027AE99870CE2F95B@FMSMSX34> <3B9563E8.9A432B7B@alacritech.com> Subject: Re: multiple dump devices Date: Thu, 6 Sep 2001 18:26:31 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2615.200 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200 Sender: owner-lkcd@oss.sgi.com Precedence: bulk > "Howell, David P" wrote: > > > > We are working on a proposal for redundant dump device support that I plan > > to > > share in the next few weeks; I've got a prototype mostly working that can be > > > > contributed. Let me know how you are approaching this, I'll send details of > > what we are doing here later this week. Sounds like a good opportunity for > > collaboration on this. > > > > Regards, > > Dave Howell > > I'm really curious as to the proposal. Sounds like a good idea, the > real question becomes, do you want to chain multiple dump devices with > multiple dump mechanisms? > > Here's where I'm going with this. I just finished the code to allow > people to install their own dump compression mechanisms (right now, it'll > be RLE, I have to check in the GZIP compression module, and people can > put in whatever one they want). Do you want to take the next step and > let people have chains of dump mechanisms based on the dump condition? > I realize multiple dump devices is good, but what if you could plug in > your own dump method with it? Then that dump method could query the > available dump devices configured. > > So you'd have: > > dump methods (one standard, but plug-and-play) > dump devices (requires at least one, multiples allowed, maybe > access lists for methods?) > dump compressions (configurable, usable by some methods) Do you mean as follows, Matt? 
"Dump methods" means how to use devices configured for dump device to save memory dump, and each of them should be pluggable? (single device as standard, concatenating devices as single dump device, mirroring devices for redundancy ...) Each "dump devices" should be independently configurable about type of compression and dump method ? --Masashige > Would this be the eventual goal? That way, everything is tunable to > their own liking. I figured I'd ask, since if you're going to add in > multiple dump devices, and we've gone to multiple compression types, > you might as well go all the way and add dump methods as well. I > don't know what the rest of the group thinks, but this could be > very useful. > > I'd definitely like to get some feedback ... this is all doable, > as long as the dump compression code is in 'lcrash', and the pages > are dumped in a way that we can find the location in memory, this > can work pretty sweet for everyone here. > > --Matt From owner-lkcd@oss.sgi.com Thu Sep 6 02:26:43 2001 Received: (from majordomo@localhost) by oss.sgi.com (8.11.2/8.11.3) id f869Qh512396 for lkcd-outgoing; Thu, 6 Sep 2001 02:26:43 -0700 Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32]) by oss.sgi.com (8.11.2/8.11.3) with SMTP id f868hQd11581 for ; Thu, 6 Sep 2001 01:43:27 -0700 Received: from localhost (yakker@localhost) by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id f868ih016827; Thu, 6 Sep 2001 01:44:43 -0700 Date: Thu, 6 Sep 2001 01:44:43 -0700 (PDT) From: "Matt D. Robinson" To: Kapish K cc: Subject: Re: Re: lcrash and vmdump In-Reply-To: <200109041939.PAA18654@www23.ureach.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk I'm copying 'lkcd@oss.sgi.com', as someone may find this useful. On Tue, 4 Sep 2001, Kapish K wrote: |>Hello, |> Yes, I understand that - but what I am looking for is to be |>able to know where to look for what.. 
and the syntax of the |>various commands aren't very well explained or illustrated in the |>online help under lcrash - for example, I need to walk through a |>list of struct tasks - how do I know the start or for say, one |>particular process, how do I get to the address for the start of |>the task_struct and once given that, how can I use the walk |>command. tried using it, but could not quite understand it.. |>same with mmap - how to know the mmap_list .. lcrash does not |>say anything about that... also, how to know which pages have |>been mapped or used by which process at that point in time.. |>questions like these are what I am looking for answers to... any |>place where some doc exists or do I necessarily have to look at |>code only??

All good questions ...

To get the tasks, run 'task'. You'll see the addresses of the structures in the first column. You can then run 'task <addr>' to see the task. 'task -f <addr>' shows a little bit more data, and if you want to show _everything_, run 'px *(struct task_struct *)<addr>', and then you'll see all the fields. 'px' or 'print' shows you everything in the structure.

There's also 'walk', such as 'walk task_struct next_task <addr>', which will walk through the next_task pointers for you. Try 'prev_task' instead of 'next_task' if you're curious. Or read the code in lkcdutils/lcrash/cmds/cmd_walk.c.

With 'mmap', you'll need to know where in memory an mm_struct is. For example, look at the 'active_mm' field in the task struct when you run the 'px' command previously listed. If it isn't NULL, you can run 'mmap -f <addr>', where <addr> is the address listed as the task_struct.active_mm.

All this really involves looking at the code. There are some things you can do as far as debugging is concerned to get a quicker answer, but crash dump analysis is really the science of reading kernel code to figure out why the memory information is wrong. It's not that simple to do, especially on more complex kernels.
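
As an aside on 'walk': the idea is just to follow one pointer field from structure to structure until the chain ends or comes back around. The following is a minimal Python sketch of that idea only, not how lcrash implements it. The TASKS table, its addresses, the pid/comm values and the walk() helper are all made up for illustration; only the field names next_task and prev_task come from the message above.

```python
# Toy "dump": address -> task_struct-like record.
# Every address and value here is hypothetical; a real dump is read by lcrash.
TASKS = {
    0xC0100000: {"pid": 0, "comm": "swapper",
                 "next_task": 0xC0200000, "prev_task": 0xC0300000},
    0xC0200000: {"pid": 1, "comm": "init",
                 "next_task": 0xC0300000, "prev_task": 0xC0100000},
    0xC0300000: {"pid": 2, "comm": "keventd",
                 "next_task": 0xC0100000, "prev_task": 0xC0200000},
}


def walk(memory, field, start):
    """Follow memory[addr][field] starting at 'start'.

    Stops on a NULL (0) pointer, an address that is not in the dump,
    or an address that was already visited (the list is circular).
    Returns the addresses in the order they were visited.
    """
    visited = []
    seen = set()
    addr = start
    while addr and addr in memory and addr not in seen:
        seen.add(addr)
        visited.append(addr)
        addr = memory[addr][field]  # read the pointer field and move on
    return visited


if __name__ == "__main__":
    # Rough analogue of: walk task_struct next_task <addr>
    for addr in walk(TASKS, "next_task", 0xC0100000):
        task = TASKS[addr]
        print(f"{addr:#010x} pid={task['pid']} comm={task['comm']}")
```

Passing "prev_task" instead of "next_task" visits the same records in the opposite order, which is all the 'prev_task' suggestion above amounts to.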
Linux in many ways is far easier than most OSes; there isn't that much complexity in the SMP code compared to, say, IRIX. But give it some time ... Kernel crash dump analysis really works as follows: 1) figure out what was running (straightforward enough); 2) figure out what those tasks were