
Re: testing lkcd

To: Brian Hall <brianw.hall@xxxxxxxxxx>
Subject: Re: testing lkcd
From: Tom Morano <tjm@xxxxxxx>
Date: Tue, 09 Nov 1999 09:56:37 -0800
Cc: lkcd@xxxxxxxxxxx
References: <XFMail.991109111119.brianw.hall@xxxxxxxxxx>
Sender: owner-lkcd@xxxxxxxxxxx
Brian Hall wrote:
> 
> OK, I have tested lkcd as described in the FAQ and it works well, so now I have
> some questions.
> 
> Will the crash dump work if the interrupt handler dies? I have a script that
> will kill a 2.2.5-2.2.10 kernel via a TCP exploit, but I don't know a way to do
> that with 2.2.13. Can someone tell me how to cause this? I'd like to test this
> case.
> 
> What is the purpose of the lcrash.# executables in /var/log/vmdump?

The lcrash utility includes kernel header files and directly references
kernel data structures (within the lcrash address space). In addition to
that, certain kernel build options (__SMP__ for example) may change the 
makeup of some kernel structures (e.g. task_struct). Because of this, it is 
important that you have the lcrash binary that matches the kernel you are 
trying to analyze. Otherwise you may find that the definition of a struct
in lcrash does not map to what is in kernel memory. The numbered lcrash
executables in /var/log/vmdump should be used with their respective dump
and map files. 
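
As a rough illustration of why the binaries must match, here is a minimal
Python sketch using ctypes. The struct below is hypothetical (it is not the
real task_struct, and the field names are made up for the example). It only
shows that a member compiled in by a build option shifts the offset of every
member after it, so a tool built against one layout would read the wrong
bytes from a dump of the other.

```python
import ctypes


def make_task_layout(smp):
    """Build a toy task struct; 'smp' mimics a build option such as __SMP__."""
    fields = [("state", ctypes.c_int32), ("flags", ctypes.c_uint32)]
    if smp:
        # Hypothetical SMP-only members, present only when the option is set.
        fields += [("has_cpu", ctypes.c_int32), ("processor", ctypes.c_int32)]
    fields.append(("pid", ctypes.c_int32))
    return type("ToyTask", (ctypes.Structure,), {"_fields_": fields})


up_layout = make_task_layout(smp=False)
smp_layout = make_task_layout(smp=True)

# The same member name lands at a different byte offset in each build.
print("pid offset, non-SMP build:", up_layout.pid.offset)
print("pid offset, SMP build:    ", smp_layout.pid.offset)
```

A reader built for the non-SMP layout would look for pid where the SMP build
stores has_cpu, which is the kind of mismatch described above.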

> 
> How do you use lcrash to debug a crash dump? I see how to invoke it against
> the dump files, but I could use some documentation about the internal lcrash
> commands.

Some information is available in the FAQ on our website:
    http://oss.sgi.com/projects/lkcd/faq.html

You should always start out with the report command. It will provide you with
a top-level view of how the kernel died. For example:

>> report
=======================
LCRASH CORE FILE REPORT
=======================

GENERATED ON:
    Tue Nov  9 09:42:54 1999


TIME OF CRASH:
    Mon Sep 13 17:39:36 1999


PANIC STRING:
    Oops

MAP:
    map.20

VMDUMP:
    vmdump.20

================
COREFILE SUMMARY
================

    The system died due to a software failure.

===================
UTSNAME INFORMATION
===================

   sysname : Linux
  nodename : peak-pc.engr.sgi.com
   release : 2.2.10
   version : #218 Mon Sep 13 17:22:25 PDT 1999
   machine : i686
domainname : engr.sgi.com

===============
LOG BUFFER DUMP
===============

    <4>Linux version 2.2.10 (root@xxxxxxxxxxxxxxxxxxxx) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #218 Mon Sep 13 17:22:25 PDT 1999
    <4>Detected 348932846 Hz processor.
    <4>Console: colour VGA+ 80x25
    <4>Calibrating delay loop... 348.16 BogoMIPS
    <4>Memory: 95284k/98304k available (1056k kernel code, 408k reserved, 1500k data, 56k init)
    <4>CPU: Intel Pentium II (Deschutes) stepping 02
    <6>Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
    <6>Checking 'hlt' instruction... OK.
    <4>POSIX conformance testing by UNIFIX
    <4>PCI: PCI BIOS revision 2.10 entry at 0xfcaee
    <4>PCI: Using configuration type 1
    <4>PCI: Probing PCI hardware
    <6>Linux NET4.0 for Linux 2.2
    <6>Based upon Swansea University Computer Society NET3.039
    <6>NET4: Unix domain sockets 1.0 for Linux NET4.0.
    <6>NET4: Linux TCP/IP 1.0 for NET4.0
    <6>IP Protocols: ICMP, UDP, TCP
    <4>Starting kswapd v 1.5 
    <6>Detected PS/2 Mouse Port.
    <6>Serial driver version 4.27 with no serial options enabled
    <6>ttyS00 at 0x03f8 (irq = 4) is a 16550A
    <6>ttyS01 at 0x02f8 (irq = 3) is a 16550A
    <4>pty: 256 Unix98 ptys configured
    <4>PIIX4: IDE controller on PCI bus 00 dev 39
    <4>PIIX4: not 100% native mode: will probe irqs later
    <4>    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
    <4>    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
    <4>hda: WDC AC24300L, ATA DISK drive
    <4>hdc: NEC CD-ROM DRIVE:28C, ATAPI CDROM drive
    <4>ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
    <4>ide1 at 0x170-0x177,0x376 on irq 15
    <6>hda: WDC AC24300L, 4112MB w/256kB Cache, CHS=524/255/63, UDMA
    <4>hdc: ATAPI 32X CD-ROM drive, 128kB Cache
    <6>Uniform CDROM driver Revision: 2.55
    <6>Floppy drive(s): fd0 is 1.44M
    <6>FDC 0 is a National Semiconductor PC87306
    <6>(scsi0) <Adaptec AHA-2940A Ultra SCSI host adapter> found at PCI 14/0
    <6>(scsi0) Narrow Channel, SCSI ID=7, 3/255 SCBs
    <6>(scsi0) Warning - detected auto-termination
    <6>(scsi0) Please verify driver detected settings are correct.
    <6>(scsi0) If not, then please properly set the device termination
    <6>(scsi0) in the Adaptec SCSI BIOS by hitting CTRL-A when prompted
    <6>(scsi0) during machine bootup.
    <6>(scsi0) Cables present (Int-50 YES, Ext-50 NO)
    <6>(scsi0) Downloading sequencer code... 413 instructions downloaded
    <4>scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.17/3.2.4
    <4>       <Adaptec AHA-2940A Ultra SCSI host adapter>
    <4>scsi : 1 host.
    <6>(scsi0:0:6:0) Synchronous at 20.0 Mbyte/sec, offset 15.
    <4>  Vendor: IBM       Model: DDRS-34560        Rev: S97B
    <4>  Type:   Direct-Access                      ANSI SCSI revision: 02
    <4>Detected scsi disk sda at scsi0, channel 0, id 6, lun 0
    <4>scsi : detected 1 SCSI disk total.
    <4>SCSI device sda: hdwr sector= 512 bytes. Sectors= 8925000 [4357 MB] [4.4 GB]
    <6>3c59x.c:v0.99H 11/17/98 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html
    <6>eth0: 3Com 3c905B Cyclone 100baseTx at 0xdc00,  00:c0:4f:90:6e:54, IRQ 11
    <6>  8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
    <6>  MII transceiver found at address 24, status 786d.
    <6>  MII transceiver found at address 0, status 786d.
    <6>  Enabling bus-master transmits and whole-frame receives.
    <4>Partition check:
    <4> sda: sda1 sda2 sda3
    <4> hda: hda1 hda2 < hda5 hda6 >
    <4>VFS: Mounted root (ext2 filesystem) readonly.
    <4>Freeing unused kernel memory: 56k freed
    <6>dump_open(): dump device opened: 0x803 [sd(8,3)]
    <4>nfs warning: mount version older than kernel
    <1>Unable to handle kernel NULL pointer dereference at virtual address 00000008
    <1>current->tss.cr3 = 00484000, %cr3 = 00484000
    <1>*pde = 00000000
    <4>PANIC: Oops (0002) - software fault
    <4>Registers:
    <4>CPU:    0
    <4>EIP:    0010:[<c01122a1>]
    <4>EFLAGS: 00010246
    <4>eax: c0487fa4   ebx: c0486000   ecx: 00000000   edx: 00000018
    <4>esi: bffffd64   edi: 00000001   ebp: fffffffe   esp: c0487fac
    <4>ds: 0018   es: 0018   ss: 0018
    <4>Process crashdump2 (pid: 879, process nr: 54, stackpage=c0487000)
    <4>Stack: c0486000 bffffd64 00000001 bffffd18 c0487fc4 c01079bc fffffffe 00000000
    <4>       00000002 bffffd64 00000001 bffffd18 00000061 0000002b 0000002b 00000061
    <4>       400c2254 00000023 00000202 bffffd04 0000002b 
    <4>Call Trace: [<c01079bc>] 
    <4>Code: 29 05 08 00 00 00 8b 18 89 dd 5b 58 bf 03 00 00 00 89 c6 85 
    <4>Dumping to device 0x803 [sd(8,3)] ...
    <4>Writing dump header ...
    <4>Writing dump pages ...

====================
CURRENT SYSTEM TASKS
====================

    ADDR    UID    PID   PPID  STATE   PRI     FLAGS        MM  NAME
==============================================================================
c0252000      0      0      0      0     0         0  c023aa20  swapper
c009c000      0      1      0      1    20       100  c025b060  init
c039c000      0      2      1      1    20        40  c023aa20  kflushd
c039a000      0      3      1      1    20       840  c023aa20  kpiod
c0398000      0      4      1      1    20       840  c023aa20  kswapd
c5dc4000      1    262      1      1    20       140  c025b2e0  portmap
c5d46000      0    277      1      1    20       140  c025b360  ypbind
c5dfa000      0    284    277      1    20       140  c025b460  ypbind
c5db8000      0    338      1      1    20       140  c025b260  syslogd
c5db4000      0    349      1      1    20       140  c025b3e0  klogd
c501a000      0    363      1      1    20        40  c025b4e0  atd
c512c000      0    377      1      1    20        40  c025b560  crond
c514c000      0    395      1      1    20       140  c025b5e0  inetd
c515a000      0    409      1      1    20       140  c025b660  snmpd
c5232000      0    423      1      1    20        40  c025b6e0  named
c520a000      0    437      1      1    20       140  c025b760  routed
c53ca000      0    451      1      1    20       140  c025b7e0  xntpd
c555e000      0    465      1      1    20       140  c025b860  lpd
c535e000      0    483      1      1    20       140  c025b8e0  rpc.statd
c5684000      0    494      1      1    20        40  c025b960  rpc.rquotad
c562c000      0    505      1      1    20        40  c025b9e0  rpc.mountd
c56c4000      0    529      1      1    20       140  c025bae0  rpc.rstatd
c56bc000      0    543      1      1    20       140  c025ba60  rpc.rusersd
c568c000     99    557      1      1    20        40  c025bb60  rpc.rwalld
c5616000      0    571      1      1    20       140  c025bbe0  rwhod
c5ee8000      0    591      1      1    20       140  c025b1e0  rpc.yppasswdd
c55f2000      0    603      1      1    20       140  c025bce0  amd
c5700000      0    605      1      1    20        40  c023aa20  rpciod
c5714000      0    606      1      1    20        40  c023aa20  lockd
c583c000      0    631      1      1    20       140  c58500c0  automount
c570a000      0    644    395      1    20       100  c025bde0  in.rlogind
c5790000      0    658    644      1    20       100  c025bd60  login
c56d8000      0    659    658      1    20       100  c025bee0  tcsh
c56dc000      0    681      1      1    20       140  c025bf60  sendmail
c5a40000      0    701      1      1    20       140  c5850040  gpm
c571c000      0    715      1      1    20       140  c5850140  httpd
c56ea000     99    718    715      1    20       140  c025be60  httpd
c5ae2000     99    719    715      1    20       140  c58501c0  httpd
c59a4000     99    720    715      1    20       140  c5850240  httpd
c5b6c000     99    721    715      1    20       140  c58502c0  httpd
c5d48000     99    722    715      1    20       140  c5850340  httpd
c5d0a000     99    723    715      1    20       140  c58503c0  httpd
c5cc8000     99    724    715      1    20       140  c5850440  httpd
c5c14000     99    725    715      1    20       140  c58504c0  httpd
c5bcc000     99    726    715      1    20       140  c5850540  httpd
c5ab0000     99    727    715      1    20       140  c58505c0  httpd
c5d96000    100    745      1      1    20        40  c5850740  xfs
c5446000      0    760      1      1    20       140  c5850640  smbd
c05cc000      0    771      1      1    20       140  c58506c0  nmbd
c0a3c000      9    825      1      1    20        40  c5850b40  innd
c5f48000      9    831      1      1    20        40  c58507c0  actived
c5ce2000      0    869      1      1    20       100  c025bc60  mingetty
c08f0000      0    870      1      1    20       100  c025b160  mingetty
c0504000      0    871      1      1    20       100  c5850a40  mingetty
c04b0000      0    872      1      1    20       100  c5850840  mingetty
c0836000      0    873      1      1    20       100  c5850bc0  mingetty
c0940000      0    874      1      1    20       100  c58508c0  mingetty
c0956000      0    875      1      1    20       100  c58509c0  getty
c0b8c000      0    877      1      1    20       140  c5850c40  update
c0486000      0    879    659      0    20         0  c025b0e0  crashdump2

===========================
STACK TRACE OF FAILING TASK
===========================

================================================================
STACK TRACE FOR TASK: 0xc0486000 (crashdump2)

 0 sys_setpriority+41 [0xc01122a1]
 1 system_call+45 [0xc01079b5]
================================================================
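
Each frame in the trace is printed as symbol+offset followed by the address
in brackets. The short Python sketch below parses such lines and subtracts
the offset to get the start of the function. It assumes the offset is
decimal, which is consistent with the dis output later in this message
(kernel_thread+40 is shown at 0xc0106528). The parse_frame helper and its
regular expression are made up for this example and are not part of lcrash.

```python
import re

# Matches lines such as " 0 sys_setpriority+41 [0xc01122a1]".
FRAME_RE = re.compile(r"^\s*(\d+)\s+(\w+)\+(\d+)\s+\[0x([0-9a-fA-F]+)\]\s*$")


def parse_frame(line):
    """Return a dict describing one trace line, or None if it is not a frame."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    level, symbol, offset, addr = m.groups()
    addr = int(addr, 16)
    offset = int(offset, 10)  # assumption: offsets are printed in decimal
    return {
        "level": int(level),
        "symbol": symbol,
        "offset": offset,
        "addr": addr,
        "func_start": addr - offset,
    }


trace = [
    " 0 sys_setpriority+41 [0xc01122a1]",
    " 1 system_call+45 [0xc01079b5]",
]
for line in trace:
    frame = parse_frame(line)
    print("%-16s starts at 0x%08x" % (frame["symbol"], frame["func_start"]))
```

The computed start addresses can then be compared against the map file that
belongs to the dump.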

Plus, there is online help for each command via the help command. You can issue
the help command without any arguments (or '?') to see a list of available
lcrash commands (note that some of the displayed commands are aliases).

>> help
?                history          p                stab             
addtypes         i386dis          page             stat             
bt               id               po               strace           
deftask          idis             ps               sym              
dis              md               ptype            symbol           
dt               mktrace          px               t                
dump             mmap             q                task             
findsym          mt               q!               trace            
fsym             namelist         quit             vtop             
h                nmlist           report           whatis           
help             od               sizeof           

You can then issue the help command followed by a command name to see some
information about how the command should be used. Here are some examples...

>> help task
COMMAND: task [-f] [-n] [-w outfile] [task list]

    Display relevant information for each entry in task_list. If no entries 
    are specified, display information for all active tasks. Entries in 
    task_list can take the form of a virtual address or a PID (following a 
    '#').

>> task
    ADDR    UID    PID   PPID  STATE   PRI     FLAGS        MM  NAME
==============================================================================
c0252000      0      0      0      0     0         0  c023aa20  swapper
c009c000      0      1      0      1    20       100  c025b060  init
c039c000      0      2      1      1    20        40  c023aa20  kflushd
c039a000      0      3      1      1    20       840  c023aa20  kpiod
.
.
.
c0956000      0    875      1      1    20       100  c58509c0  getty
c0b8c000      0    877      1      1    20       140  c5850c40  update
c0486000      0    879    659      0    20         0  c025b0e0  crashdump2
==============================================================================
60 active task structs found

>> task -f c039c000
    ADDR    UID    PID   PPID  STATE   PRI     FLAGS        MM  NAME
==============================================================================
c039c000      0      2      1      1    20        40  c023aa20  kflushd

TSS:
  ESP0:0xc039e000, ESP:0xc039df7c, EIP:0xc010f38b, EBP:0x0
  EAX:0x0, ECX:0x0, EBX:0x0

==============================================================================
1 active task struct found

>> help trace
COMMAND: trace [-a] [-f] [-w outfile] [[task_list] | [-t tracerec_list]

    Displays a stack trace for each task included in task_list. If task_list 
    is empty and deftask is set, then a stack trace for the default task is 
    displayed. If deftask is not set, then a trace will be displayed for the 
    task running at the time of a system PANIC. If the command is issued with 
    the -t command line option, additional items on the command line will be 
    treated as pointers to lcrash stack trace records (prevously allocated 
    using the mktrace command).

>> t c039c000
================================================================
STACK TRACE FOR TASK: 0xc039c000 (kflushd)

 0 schedule+339 [0xc010f38b]
 1 interruptible_sleep_on+49 [0xc010f6e9]
 2 bdflush+571 [0xc0125aa7]
 3 kernel_thread+33 [0xc0106521]
================================================================

>> t -f  c039c000
================================================================
STACK TRACE FOR TASK: 0xc039c000 (kflushd)

 0 schedule+339 [0xc010f38b]

   RA=0xc010f6ee, SP=0xc039df7c, FP=0xc039dfa4, SIZE=44

   c039df7c: c039dfa0  0000003b  c0252000  c039dfb0  
   c039df8c: 00000286  00000003  0000003b  c0252000  
   c039df9c: c0262000  c039dfb8  c010f6ee  

 1 interruptible_sleep_on+49 [0xc010f6e9]

   RA=0xc0125aac, SP=0xc039dfa8, FP=0xc039dfbc, SIZE=24

   c039dfa8: c17b7380  00003913  c039c000  c023fcec  
   c039dfb8: 000001f4  c0125aac  

 2 bdflush+571 [0xc0125aa7]

   RA=0xc0106523, SP=0xc039dfc0, FP=0xc039dff0, SIZE=52

   c039dfc0: c039c000  00000f00  c009dfcc  c0106000  
   c039dfd0: 00000000  c0106000  c17b70e0  00003b08  
   c039dfe0: 00000008  c039c000  00000003  c17b7380  
   c039dff0: c0106523  

 3 kernel_thread+33 [0xc0106521]

   RA=0x0, SP=0xc039dfc0, FP=0xc039dffc, SIZE=16

   c039dfc0: 00000000  00000f00  c0253fd8  00000000  

================================================================
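
The numbers in the -f output can be cross-checked by hand. The Python sketch
below is only an observation drawn from this particular trace, not documented
lcrash behavior: for frames 0 through 2, SIZE equals FP - SP + 4, and the
last word dumped for the frame (the one at FP) is the return address RA.
Frame 3 as printed does not follow the same pattern, so it is left out. The
frame_size helper is made up for this example.

```python
WORD_SIZE = 4  # i386: 32-bit stack words


def frame_size(sp, fp, word_size=WORD_SIZE):
    """Bytes from SP up to and including the word stored at FP."""
    return fp - sp + word_size


# (function, RA, SP, FP, SIZE as printed, last word dumped for the frame)
frames = [
    ("schedule",               0xc010f6ee, 0xc039df7c, 0xc039dfa4, 44, 0xc010f6ee),
    ("interruptible_sleep_on", 0xc0125aac, 0xc039dfa8, 0xc039dfbc, 24, 0xc0125aac),
    ("bdflush",                0xc0106523, 0xc039dfc0, 0xc039dff0, 52, 0xc0106523),
]

for name, ra, sp, fp, size, last_word in frames:
    print("%-24s computed size=%d printed size=%d ra matches=%s"
          % (name, frame_size(sp, fp), size, ra == last_word))
```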

>> help dump
COMMAND: dump [-d] [-o] [-x] [-B] [-D] [-H] [-W] [-w outfile] addr [count]

    Display count values starting at kernel virtual address addr in one of 
    the following formats: decimal (-d), octal (-o), or hexadecimal (-x).  
    The default format is hexidecimal, and the default count is 1. If addr is 
    preceeded by a pound sign ('#'), it will be treated as a page number 
    (PFN).

>> dump c039dfc0 20
0xc039dfc0: c039c000 00000f00 c009dfcc c0106000 : ..9..........`..
0xc039dfd0: 00000000 c0106000 c17b70e0 00003b08 : .....`...p{..;..
0xc039dfe0: 00000008 c039c000 00000003 c17b7380 : ......9......s{.
0xc039dff0: c0106523 00000000 00000f00 c0253fd8 : #e...........?%.
0xc039e000: 00000000 00000000 00000000 00000000 : ................
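
The text column on the right of each dump line can be reproduced from the
hex words. The Python sketch below assumes little-endian 32-bit words (i386)
and that bytes outside the printable ASCII range are shown as '.'. Both
assumptions are inferred from the sample output above rather than taken from
the lcrash source, and the helper names are made up for this example.

```python
import struct


def ascii_column(words):
    """Render 32-bit words as the bytes would appear in little-endian memory."""
    raw = struct.pack("<%dI" % len(words), *words)
    return "".join(chr(b) if 0x20 <= b < 0x7f else "." for b in raw)


def format_dump_line(addr, words):
    """Format one line in the same shape as the dump output above."""
    hex_part = " ".join("%08x" % w for w in words)
    return "0x%08x: %s : %s" % (addr, hex_part, ascii_column(words))


print(format_dump_line(0xc039dff0,
                       [0xc0106523, 0x00000000, 0x00000f00, 0xc0253fd8]))
```

For the words at 0xc039dff0 this prints the same line as the dump above: the
word c0106523 is stored as the bytes 23 65 10 c0, which is why the text
column starts with "#e".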

>> help dis
COMMAND: dis [-f] [-w outfile] [-F funcname]|addr[count|[bcount acount]]

    Display the disassembled code for addr for count instructions (the 
    default count is 1). Alternately, display the disassembled code for addr 
    with bcount instructions before and acount instructions after. If bcount 
    or acount is zero, then no instructions will be displayed before or after 
    respectively. If the dis command is issued with the -f command line 
    option, additional information will be displayed (opcode and byte size). 
    If the dis command is issued with the -F option followed by funcname, 
    disassembled code will be displayed for all instructions in the function.

>> dis 0xc0106521 3 5
0xc0106521 <kernel_thread+33>:    call   *%edx
0xc0106523 <kernel_thread+35>:    movl   $0x1,%eax
0xc0106528 <kernel_thread+40>:    int    $0x80
0xc010652a <kernel_thread+42>:    movl   %eax,%edx
0xc010652c <kernel_thread+44>:    popl   %ebx
0xc010652d <kernel_thread+45>:    popl   %esi

These are examples of some of the more useful commands (from a debugging point
of view). If you notice a problem with any of the commands, can't figure out
what a particular command is supposed to do, or have an idea for a useful
command, please let us know.

Thanks for taking the time to try these facilities out...

Tom
