Re: FW: Lkcd crashes when a crash is detected.

To: Jeff Goldszer <Jeff_Goldszer@xxxxxxx>
Subject: Re: FW: Lkcd crashes when a crash is detected.
From: "Matt D. Robinson" <yakker@xxxxxxxxxxxxxx>
Date: Thu, 12 Jul 2001 00:30:05 -0700
Cc: "'lkcd@xxxxxxxxxxx'" <lkcd@xxxxxxxxxxx>
References: <1139BB776563D011BFB600805FC1048FC1B35C@xxxxxxxxxxxxxx>
Sender: owner-lkcd@xxxxxxxxxxx

Jeff Goldszer wrote:
> 
> Matt,
> 
> Thank you for the quick response. I will try the patch. I am pretty sure
> that I can reproduce the problem. It is hard for me to say, because I was not
> sure whether lkcd was failing because of a configuration issue. I have to do
> more testing to let you know. I will do the further testing only with the
> new patch unless told otherwise.

Okee, sounds good.  Again, if you use 'lcrash' and disassemble the
addresses (from the original kernel) that I mentioned in my previous
E-mail, we can determine which instruction the PC (eip) was at when
the crash occurred.  If it is in dump_??*() anything, it's an LKCD
issue.  Otherwise, it's caused by something else while writing
out the pages.
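
If 'lcrash' is not at hand, a rough first pass at the same question can be
made with the System.map that belongs to the original kernel. The sketch
below is only an illustration: lookup_symbol and classify_symbol are
hypothetical helper names, and it assumes the usual "address type name"
layout with lowercase hex addresses. It names the function that contains
an address; it does not disassemble anything.

```shell
# lookup_symbol ADDRESS MAPFILE
# Print the last symbol in MAPFILE whose address is <= ADDRESS,
# i.e. the function the address most likely falls in.
# (Hypothetical helper; assumes System.map-style "addr type name" lines.)
lookup_symbol() {
    target=$((0x$1))
    LC_ALL=C sort "$2" | {
        best=unknown
        while read -r addr type name; do
            [ -n "$name" ] || continue              # skip malformed lines
            [ $((0x$addr)) -le "$target" ] || break # passed the target
            best=$name
        done
        printf '%s\n' "$best"
    }
}

# classify_symbol NAME
# Apply the rule of thumb from this thread: dump_* means LKCD code.
classify_symbol() {
    case $1 in
        dump_*) echo "LKCD" ;;
        *)      echo "other" ;;
    esac
}
```

For example, `lookup_symbol c0108fb3 /boot/System.map` (the map path is an
assumption; use whichever map matches the kernel that crashed), then feed
the result to classify_symbol.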

> I "hacked" many combinations of lkcd setup: swap space linked to
> raw character devices (comments found in /etc/sysconfig/vmdump), and
> deciding what a swap partition was. (Could it be a file, or was it just a
> partition that was run with mkswap? Did swapon have to be run on the
> partition for it to be used? Could pre-existing swap partitions be used?)

Pre-existing swap spaces can be used.  You don't have to use 'swapon' to
activate the swap partition, as long as the swap header exists.  The best
thing to do is to link /dev/vmdump to /dev/sdb1 (I think that's the right
swap device in your case), and you're all set.  You normally don't have
to set it up by yourself, if you've got your swap partitions configured
in your /etc/fstab -- the 'vmdump' script creates the link to the first
swap device for you automatically.
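
To make the "first swap device" idea concrete, here is a minimal sketch of
how such a lookup could be done by hand. It is not the actual 'vmdump'
script; first_swap_device is a hypothetical helper, and it assumes the
standard fstab column order (device, mount point, type, ...).

```shell
# first_swap_device FSTAB
# Print the device of the first non-comment fstab entry whose
# filesystem type (third column) is "swap".
# (Hypothetical helper, not taken from the vmdump script.)
first_swap_device() {
    awk '$1 !~ /^#/ && $3 == "swap" { print $1; exit }' "$1"
}
```

By hand, the link could then be made with something like
`ln -sf "$(first_swap_device /etc/fstab)" /dev/vmdump`, run as root and
only after checking that the device printed is the one you expect.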

> I wasn't sure how to set up LKCD until I stumbled onto older documentation
> (the original SCSI documentation) and the mailing list.
> Existing swap space can be used by lkcd. The swap partition has to be
> mounted using swapon or fstab. I assume the swap partition cannot be a
> file.

The swap partition cannot be a file. :)  I'm pretty sure you don't have 
to have it mounted, but you do have to make sure the swap header is on
the partition (done by mkswap).
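
One way to check for that header without touching the partition is to look
for the signature mkswap writes in the last 10 bytes of the first page.
This is a sketch under assumptions: has_swap_header is a hypothetical
helper, and the 4086-byte offset assumes 4096-byte pages, as on i386. It
only tests for the mkswap signature and says nothing about what the dump
code itself verifies.

```shell
# has_swap_header DEVICE
# Succeed if DEVICE carries a Linux swap signature ("SWAPSPACE2" or the
# older "SWAP-SPACE") at the end of its first 4096-byte page.
# (Hypothetical helper; the page size is an assumption.)
has_swap_header() {
    sig=$(dd if="$1" bs=1 skip=4086 count=10 2>/dev/null)
    case $sig in
        SWAPSPACE2|SWAP-SPACE) return 0 ;;
        *)                     return 1 ;;
    esac
}
```

For example, `has_swap_header /dev/hdb1 && echo "swap header found"`, run
as root so the device can be read.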

BTW, your dump device doesn't have to be SCSI anymore -- you can use IDE
if you want.

> Jeff Goldszer
> Senior Software Engineer
> Computer Network Technologies
> 65C Commodore Lane
> West Babylon NY, 11704
> Phone: 631-321-5118
> FAX: 631-321-5119

Let me know what results you get, Jeff.  Thanks.

--Matt

>  -----Original Message-----
> From:   Matt D. Robinson [mailto:yakker@xxxxxxxxxxxxxx]
> Sent:   Wednesday, July 11, 2001 4:20 PM
> To:     Jeff Goldszer
> Cc:     'lkcd@xxxxxxxxxxx'
> Subject:        Re: FW: Lkcd crashes when a crash is detected.
> 
>  << File: lkcd-latest.diff.gz >>
> Hey, Jeff.  Try the following patch (instead of the patch
> you are currently using), and let me know if you get the
> same results.  Also, _before_ you upgrade, use 'lcrash' on
> the system and determine what function is at c0108fb3 and
> c012a373.  I don't think you're dying in the LKCD code,
> but somewhere along the way in writing out the pages of
> memory.
> 
> We've made some changes to the way in which SMP systems are
> dealt with.  Let's just say that leaving smp_send_stop() out
> is a _bad_ thing, while leaving it in isn't so hot, either.
> 
> This is a test patch -- not for production systems.  You
> also need to modify /etc/sysconfig/vmdump and change your
> DUMP_LEVEL from 4 to 8.
> 
> Also, can you duplicate this problem, or were you testing
> the dump process on your machine?
> 
> --Matt
> 
> Jeff Goldszer wrote:
> >
> > Forgot to mention the OS is configured for SMP.
> >
> > Jeff Goldszer
> > Senior Software Engineer
> > Computer Network Technologies
> > 65C Commodore Lane
> > West Babylon NY, 11704
> > Phone: 631-321-5118
> > FAX: 631-321-5119
> >
> >  -----Original Message-----
> > From:   Jeff Goldszer
> > Sent:   Wednesday, July 11, 2001 1:24 PM
> > To:     'lkcd@xxxxxxxxxxx'
> > Cc:     Harold Stevenson; Marco DelToro; 'tjm@xxxxxxx'
> > Subject:        Lkcd crashes when a crash is detected.
> >
> > I am trying to troubleshoot a crash using lkcd (Linux Kernel Crash Dump).
> >
> > Problem: the crash dump facility is crashing after a crash is detected.
> >
> > Crash dump:
> > Unable to handle kernel NULL pointer dereference<1>Unable to handle kernel
> > paging request at virtual address 0ec4e430
> >  printing eip: c0108fb3
> > *pde = 00000000
> > Oops: 0002
> > CPU:    1471744
> > EIP:    0010:[<c0108fb3>]
> > EFLAGS: 00010006
> > eax: 13a66000   ebx: 00000000   ecx: 00000016   edx: 00000018
> > esi: c0297800   edi: 00000000   ebp: c029c906   esp: c349feb8
> > ds: 0018   es: 0018   ss: 0018
> > Process erred (pid: 1835619449, stackpage=c349f000)
> > Stack: 00167500 00000000 00000030 c029c937 c029c906 c01072b0 00000000 00000016
> >        00000010 00000030 c029c937 c029c906 00000000 00000018 00000018 ffffff00
> >        c0114891 00000010 00000282 00000282 00000000 c029c936 00000033 c349e000
> > Call Trace: [<c01072b0>] [<c0114891>] [<c0110b60>] [<c0110e57>] [<c0110b60>] [<c0107334>] [<c0110b60>]
> >        [<c0110bc2>]
> >
> > Code: ff 04 85 30 64 2b c0 f0 fe 8b 10 78 29 c0 0f 88 5d 64 0f 00
> > Dumping to device 0x341 [ide0(3,65)] ...
> > Writing dump header ...<1>Unable to handle kernel paging request at virtual address 423b1045
> >  printing eip:
> > c012a373
> > *pde = 00000000
> >
> > Particulars:
> >
> > *       My development machine is a Pentium 133 with two ide drives.
> > *       The OS: Red Hat Linux release 7.0.91 (Wolverine Kernel 2.4.3 on an i586)
> > *       Using lkcdutils-1.0-7 for i386
> > *       kernel patch lkcd-2.4.3.diff
> > *       /dev/vmdump is linked to /dev/hdb1
> > *       /dev/hdb1 is the swap partition currently used by the development machine.
> >
> > [root@mill /root]# swapon -s
> > Filename                        Type            Size    Used    Priority
> > /dev/hdb1                       partition       133016  16      -1
> >
> > *       Could not use the patch that modifies rc.sysinit, so I modified
> > /etc/rc.sysinit by hand to look like this. Note that this is only a portion
> > of rc.sysinit.
> >
> > # Mount all other filesystems (except for NFS and /proc, which is already
> > # mounted). Contrary to standard usage,
> > # filesystems are NOT unmounted in single user mode.
> > action $"Mounting local filesystems: " mount -a -t nonfs,smbfs,ncpfs
> >
> > if [ -x /sbin/vmdump ] ; then
> >    action "Configuring system for crash dumps" /sbin/vmdump config
> > fi
> >
> > if [ -x /sbin/vmdump ] ; then
> >    action "Saving crash dump (if one exists)" /sbin/vmdump save
> > fi
> >
> > if [ X"$_RUN_QUOTACHECK" = X1 -a -x /sbin/quotacheck ]; then
> >     action $"Checking filesystem quotas: "  /sbin/quotacheck -v -R -a
> > fi
> >
> > Pertinent Questions:
> > *       Is my dump device set up correctly?
> > *       If necessary, how do I properly set up /etc/fstab to use a swap
> > partition that is a file? Can Linux use a swap partition file and lkcd use
> > /dev/hdb1?
> >
> > Jeff Goldszer
> > Senior Software Engineer
> > Computer Network Technologies
> > 65C Commodore Lane
> > West Babylon NY, 11704
> > Phone: 631-321-5118
> > FAX: 631-321-5119
