Matt,
Thank you for the quick response. I will try the patch. I am pretty sure
that I can reproduce the problem. It is hard for me to say, because I was not
sure whether lkcd was failing because of a configuration issue. I have to do
more testing before I can let you know. I will do further testing only with
the new patch unless told otherwise.
I "hacked" many combinations of lkcd setup: swap space linked to raw
character devices (per the comments found in /etc/sysconfig/vmdump), and
deciding what a swap partition was (could it be a file, or was it just a
partition that was run with mkswap? Did swapon have to be run on the
partition for it to be used? Could a pre-existing swap partition be used?).
I wasn't sure how to set up LKCD until I stumbled onto older documentation
(Original SCSI documentation) and the mailing list.
Existing swap space can be used by lkcd. The swap partition has to be
enabled using swapon or an fstab entry. I assume the swap partition cannot
be a file.
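A minimal sketch of how the swap requirement above could be checked before pointing lkcd at a device. It assumes the kernel exposes the usual /proc/swaps table (the same columns that `swapon -s` prints); the `is_active_swap_partition` helper name is made up for illustration and is not part of lkcd.

```shell
#!/bin/sh
# Hypothetical helper (not part of lkcd): succeed if the given device is
# listed as an active swap *partition* (not a swap file) in a
# /proc/swaps-style table. The table path is a parameter so the check can
# also be run against a saved copy.
is_active_swap_partition() {
    device="$1"
    swaps_file="${2:-/proc/swaps}"
    awk -v dev="$device" '
        # Skip the header line; match on the Filename and Type columns.
        NR > 1 && $1 == dev && $2 == "partition" { found = 1 }
        END { exit !found }
    ' "$swaps_file"
}

# Example:
#   is_active_swap_partition /dev/hdb1 && echo "/dev/hdb1 is active swap"
```

If the check fails, the partition has most likely not been enabled with swapon or listed in /etc/fstab yet.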
Jeff Goldszer
Senior Software Engineer
Computer Network Technologies
65C Commodore Lane
West Babylon NY, 11704
Phone: 631-321-5118
FAX: 631-321-5119
-----Original Message-----
From: Matt D. Robinson [mailto:yakker@xxxxxxxxxxxxxx]
Sent: Wednesday, July 11, 2001 4:20 PM
To: Jeff Goldszer
Cc: 'lkcd@xxxxxxxxxxx'
Subject: Re: FW: Lkcd crashes when a crash is detected.
<< File: lkcd-latest.diff.gz >>
Hey, Jeff. Try the following patch (instead of the patch
you are currently using), and let me know if you get the
same results. Also, _before_ you upgrade, use 'lcrash' on
the system and determine what function is at c0108fb3 and
c012a373. I don't think you're dying in the LKCD code,
but somewhere along the way in writing out the pages of
memory.
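As a rough cross-check alongside lcrash (this is not lcrash itself), the two addresses could also be looked up in the kernel's System.map. The sketch below assumes the map matches the running kernel, is sorted by address, and uses lowercase 8-digit hex addresses without a 0x prefix; the `lookup_symbol` name and the map path in the example are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the last symbol in a System.map-style file
# whose address is less than or equal to the given address.
# Assumes lines of the form "<8-digit lowercase hex> <type> <symbol>",
# sorted by ascending address.
lookup_symbol() {
    address="$1"
    map_file="$2"
    awk -v addr="$address" '
        # Force string comparison; equal-width lowercase hex sorts correctly.
        ($1 "") <= (addr "") { name = $3 }
        END { if (name != "") print name; else exit 1 }
    ' "$map_file"
}

# Example (map path is an assumption):
#   lookup_symbol c0108fb3 /boot/System.map
#   lookup_symbol c012a373 /boot/System.map
```

The result only names the nearest preceding symbol, so it is meaningful only when the map was generated from the exact kernel that crashed.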
We've made some changes to the way in which SMP systems are
dealt with. Let's just say that leaving smp_send_stop() out
is a _bad_ thing, while leaving it in isn't so hot, either.
This is a test patch -- not for production systems. You
also need to modify /etc/sysconfig/vmdump and change your
DUMP_LEVEL from 4 to 8.
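For reference, the DUMP_LEVEL change could be scripted roughly as follows. This is only a sketch: it assumes /etc/sysconfig/vmdump uses plain shell-style NAME=value lines, and the `set_dump_level` helper name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the DUMP_LEVEL line in a vmdump config file.
# Assumes the file uses shell-style NAME=value assignments.
set_dump_level() {
    config_file="$1"
    new_level="$2"
    # Keep a backup copy, then replace the value on the DUMP_LEVEL line.
    cp "$config_file" "$config_file.bak" &&
    sed "s/^DUMP_LEVEL=.*/DUMP_LEVEL=$new_level/" "$config_file.bak" > "$config_file"
}

# Example:
#   set_dump_level /etc/sysconfig/vmdump 8
```

The backup in `<file>.bak` makes it easy to return to the previous setting once testing with the patch is done.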
Also, can you duplicate this problem, or were you testing
the dump process on your machine?
--Matt
Jeff Goldszer wrote:
>
> Forgot to mention the OS is configured for SMP.
>
> Jeff Goldszer
> Senior Software Engineer
> Computer Network Technologies
> 65C Commodore Lane
> West Babylon NY, 11704
> Phone: 631-321-5118
> FAX: 631-321-5119
>
> -----Original Message-----
> From: Jeff Goldszer
> Sent: Wednesday, July 11, 2001 1:24 PM
> To: 'lkcd@xxxxxxxxxxx'
> Cc: Harold Stevenson; Marco DelToro; 'tjm@xxxxxxx'
> Subject: Lkcd crashes when a crash is detected.
>
> I am trying to troubleshoot a crash using lkcd (Linux Kernel Crash Dump).
>
> Problem: the crash dump facility is crashing after a crash is detected.
>
> Crash dump:
> Unable to handle kernel NULL pointer dereference<1>Unable to handle kernel
> paging request at virtual address 0ec4e430
> printing eip: c0108fb3
> *pde = 00000000
> Oops: 0002
> CPU: 1471744
> EIP: 0010:[<c0108fb3>]
> EFLAGS: 00010006
> eax: 13a66000 ebx: 00000000 ecx: 00000016 edx: 00000018
> esi: c0297800 edi: 00000000 ebp: c029c906 esp: c349feb8
> ds: 0018 es: 0018 ss: 0018
> Process erred (pid: 1835619449, stackpage=c349f000)
> Stack: 00167500 00000000 00000030 c029c937 c029c906 c01072b0 00000000 00000016
>        00000010 00000030 c029c937 c029c906 00000000 00000018 00000018 ffffff00
>        c0114891 00000010 00000282 00000282 00000000 c029c936 00000033 c349e000
> Call Trace: [<c01072b0>] [<c0114891>] [<c0110b60>] [<c0110e57>] [<c0110b60>]
>        [<c0107334>] [<c0110b60>] [<c0110bc2>]
>
> Code: ff 04 85 30 64 2b c0 f0 fe 8b 10 78 29 c0 0f 88 5d 64 0f 00
> Dumping to device 0x341 [ide0(3,65)] ...
> Writing dump header ...<1>Unable to handle kernel paging request at virtual address 423b1045
> printing eip:
> c012a373
> *pde = 00000000
>
> Particulars:
>
> * My development machine is a Pentium 133 with two ide drives.
> * The OS: Red Hat Linux release 7.0.91 (Wolverine Kernel 2.4.3 on an
> i586)
> * Using lkcdutils-1.0-7 for i386
> * kernel patch lkcd-2.4.3.diff
> * /dev/vmdump is linked to /dev/hdb1
> * /dev/hdb1 is the swap partition currently used by the development
> machine.
>
> [root@mill /root]# swapon -s
> Filename Type Size Used Priority
> /dev/hdb1 partition 133016 16 -1
>
> * Could not use patch which modified rc.sysinit. I modified the
> /etc/rc.sysinit to look like this. Note this is only a portion of the
> rc.sysinit.
>
> # Mount all other filesystems (except for NFS and /proc, which is already
> # mounted). Contrary to standard usage,
> # filesystems are NOT unmounted in single user mode.
> action $"Mounting local filesystems: " mount -a -t nonfs,smbfs,ncpfs
>
> if [ -x /sbin/vmdump ] ; then
> action "Configuring system for crash dumps" /sbin/vmdump config
> fi
>
> if [ -x /sbin/vmdump ] ; then
> action "Saving crash dump (if one exists)" /sbin/vmdump save
> fi
>
> if [ X"$_RUN_QUOTACHECK" = X1 -a -x /sbin/quotacheck ]; then
> action $"Checking filesystem quotas: " /sbin/quotacheck -v -R -a
> fi
>
> Pertinent Questions:
> * Is my dump device set up correctly?
> * If necessary, how do I properly set up /etc/fstab to use swap space
> that is a file? Can Linux use a swap file while lkcd uses /dev/hdb1?
>
> Jeff Goldszer
> Senior Software Engineer
> Computer Network Technologies
> 65C Commodore Lane
> West Babylon NY, 11704
> Phone: 631-321-5118
> FAX: 631-321-5119