Re: Alpha lcrash initialization problem - can't access memory

To: Tom Morano <tjm@xxxxxxx>
Subject: Re: Alpha lcrash initialization problem - can't access memory
From: Brian Hall <brianw.hall@xxxxxxxxxx>
Date: Mon, 01 May 2000 12:37:43 -0600 (MDT)
Cc: "Matt D.Robinson" <yakker@xxxxxxxxxxxxxx>, lkcd@xxxxxxxxxxx
In-reply-to: <3908B22A.B11911EB@sgi.com>
Reply-to: Brian Hall <brianw.hall@xxxxxxxxxx>
Sender: owner-lkcd@xxxxxxxxxxx
OK, after wasting some time with memory debugging libraries (dmalloc and
Electric Fence), I realized that kl_init_kern_info() is failing. I'm tracking
down the reason(s) why now. At least this makes sense: I probably haven't yet
made all the modifications needed for it to work on Alpha.

Hmm, it looks like the problem occurs in cmpreadmem() (output with cmp_debug=1):

====================================================
(gdb) run
Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash
./map.0 ./vmdump.0
map = ./map.0, vmdump = ./vmdump.0, outfile = stdout
 
Please wait...__cmploadindex(): cannot open() index file [2]: No such file or
directory!
cmppindexcreate(): Number of pages in dump: 12288
cmppindexcreate(): Page size in dump: 8192
..
Attempting to save index "index.10" ... complete.
...............cmpreadmem(): 8 bytes, 0x584c81 (just a page)
__cmppread(): initiating search for 0x584c81
__cmppindex(): hash =  16472, addr = 0x584c81
__cmppindex(): addr = 0x584c81, tmpptr->addr = 0x584000
__cmppread(): page not found! (0x584c81)
 
Program exited with code 01.              
====================================================
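
As a sanity check on the trace above, the page arithmetic can be done by hand.
The sketch below is not lcrash code; page_base() and page_offset() are
hypothetical helpers, and it assumes the page size is a power of two (8192 as
reported by cmppindexcreate()). With those numbers, 0x584c81 falls in the page
starting at 0x584000, which is the tmpptr->addr value printed, so the address
being searched for at least looks plausible.

```c
#include <stdint.h>

/* Hypothetical helper (not part of lcrash): round addr down to the start
 * of the page that contains it. page_size must be a power of two. */
static uint64_t page_base(uint64_t addr, uint64_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Hypothetical helper: byte offset of addr within its page. */
static uint64_t page_offset(uint64_t addr, uint64_t page_size)
{
    return addr & (page_size - 1);
}
```

Calling page_base(0x584c81, 8192) gives the page start, and
page_offset(0x584c81, 8192) gives the offset of the 8-byte read inside it.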

Now, I have set the system to compress the dump on the swap partition
(DUMP_COMPRESS_PAGES=1); however, when I look at the generated vmdump.0 file, it
does not look like it has been RLE encoded: there are long strings of 0x00
repeating throughout the file. The vmdump.0 is ~69MB, while the real memory
size of the machine is 96MB (#pages*pagesize is also 96MB).
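
The two observations above can be checked mechanically. This is only a rough
sketch under stated assumptions: raw_dump_bytes() and longest_zero_run() are
hypothetical helpers, not LKCD functions, and the caller is assumed to have
already read (part of) vmdump.0 into a buffer. The first helper reproduces the
#pages*pagesize figure (12288 * 8192 bytes); the second measures how long the
runs of 0x00 actually are, which a run-length encoder would normally be
expected to collapse.

```c
#include <stddef.h>
#include <stdint.h>

/* Uncompressed size in bytes of a dump holding `pages` pages. */
static uint64_t raw_dump_bytes(uint64_t pages, uint64_t page_size)
{
    return pages * page_size;
}

/* Length of the longest run of consecutive 0x00 bytes in buf[0..len). */
static size_t longest_zero_run(const unsigned char *buf, size_t len)
{
    size_t best = 0, cur = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0) {
            cur++;
            if (cur > best)
                best = cur;
        } else {
            cur = 0;   /* run broken by a non-zero byte */
        }
    }
    return best;
}
```

If longest_zero_run() over a chunk of the file returns values near the page
size, that chunk was most likely written uncompressed.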

It looks to me like either the dump is getting truncated or the dump
compression routine has a problem (DUMP_LEVEL=4, swap partition is 512MB). I am
guessing that sp->s_addr is correct (this is the 0x584c81 being searched for in
cmpreadmem).

Is the vmdump.0 file supposed to be compressed on the disk if
DUMP_COMPRESS_PAGES=1 ?

What is the deal with the "kernel_magic" symbol? I can't find it in the symbol
map for i386 or Alpha. Trying to run lcrash against the running system now
gives:

====================================================
(gdb) run
Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash
map = /boot/System.map, vmdump = /dev/mem, outfile = stdout
 
Please wait................
kernel_magic mismatch of map and memory image
 
Program exited with code 01.    
====================================================


On 27-Apr-2000 Tom Morano wrote:
> Brian Hall wrote:
>> 
>> I haven't changed anything in main(). After the command options are parsed
>> out, around main.c:198: (dies in register_cmds() )
>> 
>>         init_liballoc(0, 0, 0);
>>         kl_init_kern_info();
>>         register_cmds(cmdset);
>>         arch_init(ofp);
>> 
>> Are you saying that init_liballoc() needs different arguments now? I
>> followed the call sequence down for init_liballoc, and it appeared that
>> values other than zero were assigned along the way. Changing to
>> init_liballoc(100,100,100) had no effect (same traceback on the segfault).
>> Upping that to 1000 didn't help.
> 
> The parameters to init_liballoc() are OK. Based on this, I would guess that
> some memory is getting stomped on in or below the kl_init_kern_info()
> function call. You might check the block of memory causing the SEGV after
> returning from the init_liballoc() call and before the kl_init_kern_info()
> call. See if it looks OK at that point (I would guess the contents of this
> memory have changed by the time you get to register_cmds()). If that's the
> case, then walk through the kl_init_kern_info() function and see where the
> memory contents change. From looking at the kl_init_kern_info() function, I
> can't see where the problem might occur (it basically just does symbol
> lookups and reads the contents of memory into some local variables). Since
> the Alpha is 64-bit, I assume that the amount of memory being read in for
> these values is 8 bytes instead of 4 (and that the local variables,
> NUM_PHYSPAGES and MEM_MAP, have been changed also). Little things like that
> might be a factor. Anyway, that's how I would approach narrowing it down.
> 
> Tom
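
To illustrate the 8-versus-4-byte point, here is a minimal sketch of what goes
wrong when a 64-bit kernel value is decoded with a 32-bit width. decode_le() is
a hypothetical helper, not an lcrash function, and it assumes the dump is
little-endian (as Linux on Alpha is). Reading only 4 bytes silently drops the
upper half of the value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode an unsigned little-endian integer of
 * nbytes (1..8) from buf. Bytes beyond nbytes are ignored. */
static uint64_t decode_le(const unsigned char *buf, size_t nbytes)
{
    uint64_t value = 0;

    for (size_t i = 0; i < nbytes && i < 8; i++)
        value |= (uint64_t)buf[i] << (8 * i);
    return value;
}
```

For an 8-byte buffer holding 0x0000000100584c81, decode_le(buf, 8) recovers
the full value, while decode_le(buf, 4) yields only 0x00584c81.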

-- 
http://www.bigfoot.com/~brihall
Linux Consultant
