
Re: Alpha lcrash initialization problem - can't access memory

To: Brian Hall <brianw.hall@xxxxxxxxxx>
Subject: Re: Alpha lcrash initialization problem - can't access memory
From: "Matt D. Robinson" <yakker@xxxxxxxxxxxxxx>
Date: Mon, 1 May 2000 11:53:10 -0700 (PDT)
Cc: Tom Morano <tjm@xxxxxxx>, lkcd@xxxxxxxxxxx
In-reply-to: <XFMail.20000501123743.brianw.hall@xxxxxxxxxx>
Sender: owner-lkcd@xxxxxxxxxxx
On Mon, 1 May 2000, Brian Hall wrote:
|>OK, after wasting some time with memory debugging libraries (dmalloc &
|>electric fence), I realized kl_init_kern_info() is failing. I'm tracking down
|>the reason(s) why now. At least this makes sense- I probably haven't yet made
|>all the necessary modifications for it to work on Alpha.
|>
|>Hmm, looks like a problem while in cmpreadmem (set cmp_debug=1):
|>
|>====================================================
|>(gdb) run
|>Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash ./map.0 ./vmdump.0
|>map = ./map.0, vmdump = ./vmdump.0, outfile = stdout
|> 
|>Please wait...__cmploadindex(): cannot open() index file [2]: No such file or directory!
|>cmppindexcreate(): Number of pages in dump: 12288
|>cmppindexcreate(): Page size in dump: 8192
|>..
|>Attempting to save index "index.10" ... complete.
|>...............cmpreadmem(): 8 bytes, 0x584c81 (just a page)
|>__cmppread(): initiating search for 0x584c81
|>__cmppindex(): hash =  16472, addr = 0x584c81
|>__cmppindex(): addr = 0x584c81, tmpptr->addr = 0x584000
|>__cmppread(): page not found! (0x584c81)
|> 
|>Program exited with code 01.              
|>====================================================
|>
|>Now, I have set the system to compress the dump on the swap partition
|>(DUMP_COMPRESS_PAGES=1); however, when I look at the generated vmdump.0 file,
|>it does not look like it has been RLE encoded: there are long strings of 0x00
|>repeating throughout the file. The vmdump.0 is ~69MB, while the real memory
|>size of the machine is 96MB (#pages*pagesize is also 96MB).

Looks like it's probably getting compressed based on your data.
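
One way to put a number on the "long strings of 0x00" observation is to scan
a buffer read from vmdump.0 for its longest run of a given byte. The sketch
below is only an illustration: longest_run() is a hypothetical helper, not
part of lcrash or libklib, and it says nothing about the actual on-disk
layout of the dump.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from lcrash): return the length of the
 * longest run of `value` found in buf[0..len).
 */
size_t longest_run(const unsigned char *buf, size_t len, unsigned char value)
{
    size_t best = 0;   /* longest run seen so far */
    size_t cur = 0;    /* length of the run currently being counted */
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (buf[i] == value) {
            cur++;
            if (cur > best)
                best = cur;
        } else {
            cur = 0;   /* run broken, start counting again */
        }
    }
    return best;
}
```

Calling longest_run(buf, n, 0x00) on successive chunks of the file would show
how long the zero runs really are.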

|>Looks to me like either the dump is getting truncated, or the dump compression
|>routine has a problem? (DUMP_LEVEL=4, swap partition is 512MB) I am guessing
|>that sp->s_addr is correct (this is the 0x584c81 being searched for in
|>cmpreadmem).

The sp->s_addr is probably correct, but the real question is whether
or not the page is being read in properly.  You'll need to find out if
that page is actually in memory or not (0x584000).  You can turn on
some additional debugging in kl_cmp.c to print out page information from
the page header as the information is dumped out.   Set cmp_debug to 2
and tell me what it says.  If you see all the pages being read, then
you've got what you need from the dump.  If not, then there's a problem
with compression.
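
For reference, the address arithmetic visible in the debug output above can
be sketched as follows. With the 8192-byte page size reported by
cmppindexcreate(), 0x584c81 falls in the page starting at 0x584000, which is
the tmpptr->addr value the index compared against. The helper names here are
hypothetical and are not taken from kl_cmp.c; the sketch assumes the page
size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: page-aligned base of addr (page_size is a power of two). */
uint64_t page_base(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Hypothetical helper: byte offset of addr within its page. */
uint64_t page_offset(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}

/* Nonzero if a read of nbytes starting at addr stays inside a single page. */
int fits_in_page(uint64_t addr, uint64_t nbytes, uint64_t page_size)
{
    return nbytes <= page_size - page_offset(addr, page_size);
}
```

For the failing read, page_base(0x584c81, 8192) is 0x584000 and the 8-byte
read fits in one page, which matches the "(just a page)" note in the log.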

I'd typically say it isn't compression, as I've used that code on 32-bit and
64-bit systems in the past (for a few years).  Still, there could be a
problem.

|>Is the vmdump.0 file supposed to be compressed on the disk if
|>DUMP_COMPRESS_PAGES=1 ?

Yes ... /sbin/vmdump config sets /proc/sys/vmdump/dump_compress_pages
to the value based on DUMP_COMPRESS_PAGES.
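
To confirm what the running kernel actually has set, the value can be read
back from that /proc entry. The sketch below is a minimal example that
assumes the entry holds a single decimal integer; read_int_file() is a
hypothetical helper, not part of the vmdump tools.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one decimal integer from the file at `path`
 * into *out. Returns 0 on success, -1 if the file cannot be opened or
 * does not start with an integer.
 */
int read_int_file(const char *path, int *out)
{
    FILE *fp;
    int ok;

    assert(path != NULL && out != NULL);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    ok = (fscanf(fp, "%d", out) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

For example, read_int_file("/proc/sys/vmdump/dump_compress_pages", &v) should
leave the configured value in v if the call returns 0.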

|>What is the deal with the "kernel_magic" symbol? I can't find it in the symbol
|>map for i386 or Alpha. Trying to run lcrash against the running system now
|>gives:

That should be in the 1.0.4 version ... I don't think it's in the
1.0.3 version.  Look at init/main.c for the value.

|>====================================================
|>(gdb) run
|>Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash
|>map = /boot/System.map, vmdump = /dev/mem, outfile = stdout
|> 
|>Please wait................
|>kernel_magic mismatch of map and memory image
|> 
|>Program exited with code 01.    
|>====================================================
|>
|>
|>On 27-Apr-2000 Tom Morano wrote:
|>> Brian Hall wrote:
|>>> 
|>>> I haven't changed anything in main(). After the command options are parsed
|>>> out, around main.c:198: (dies in register_cmds() )
|>>> 
|>>>         init_liballoc(0, 0, 0);
|>>>         kl_init_kern_info();
|>>>         register_cmds(cmdset);
|>>>         arch_init(ofp);
|>>> 
|>>> Are you saying that init_liballoc() needs different arguments now? I
|>>> followed the call sequence down for init_liballoc, and it appeared that
|>>> values other than zero were assigned along the way. Changing to
|>>> init_liballoc(100,100,100) had no effect (same traceback on the segfault).
|>>> Upping that to 1000 didn't help.
|>> 
|>> The parameters to init_liballoc() are OK. Based on this, I would guess that
|>> some memory is getting stomped on in or below the kl_init_kern_info()
|>> function call. You might check the block of memory causing the SEGV after
|>> returning from the init_liballoc() call and before the kl_init_kern_info()
|>> call. See if it looks OK at that point (I would guess the contents of this
|>> memory have changed by the time you get to register_cmds()). If that's the
|>> case, then walk through the kl_init_kern_info() function and see where the
|>> memory contents change. From looking at the kl_init_kern_info() function, I
|>> can't see where the problem might occur (it basically just does symbol
|>> lookups and reads the contents of memory into some local variables). Since
|>> the Alpha is 64 bit, I assume that the amount of memory being read in for
|>> these values is 8 bytes instead of 4 (and that the local variables,
|>> NUM_PHYSPAGES and MEM_MAP, have been changed also). Little things like that
|>> might be a factor. Anyway, that's how I would approach narrowing it down.
|>> 
|>> Tom
|>
|>-- 
|>http://www.bigfoot.com/~brihall
|>Linux Consultant
|>

