OK, after wasting some time with memory debugging libraries (dmalloc & Electric
Fence), I realized kl_init_kern_info() is failing. I'm tracking down the
reason(s) why now. At least this makes sense: I probably haven't yet made all
the necessary modifications for it to work on Alpha.
Hmm, looks like the problem shows up in cmpreadmem() (output with cmp_debug=1 set):
====================================================
(gdb) run
Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash ./map.0 ./vmdump.0
map = ./map.0, vmdump = ./vmdump.0, outfile = stdout
Please wait...__cmploadindex(): cannot open() index file [2]: No such file or directory!
cmppindexcreate(): Number of pages in dump: 12288
cmppindexcreate(): Page size in dump: 8192
..
Attempting to save index "index.10" ... complete.
...............cmpreadmem(): 8 bytes, 0x584c81 (just a page)
__cmppread(): initiating search for 0x584c81
__cmppindex(): hash = 16472, addr = 0x584c81
__cmppindex(): addr = 0x584c81, tmpptr->addr = 0x584000
__cmppread(): page not found! (0x584c81)
Program exited with code 01.
====================================================
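For reference, here is the address arithmetic behind that debug output as a
minimal sketch. page_base() and page_offset() are hypothetical helpers, not
functions from the lcrash/cmp code, and I am assuming the index stores
page-aligned addresses. With the 8192-byte page size reported above, 0x584c81
falls in the page starting at 0x584000, which is the tmpptr->addr value that
__cmppindex() printed, so the index entry itself appears to be the right page.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not part of lcrash): split an address into the
 * start of its page and the offset within that page.
 * page_size must be a power of two (8192 in the dump above). */
uint64_t page_base(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

uint64_t page_offset(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}
```

Calling page_base(0x584c81, 8192) gives the page start, and
page_offset(0x584c81, 8192) gives the position of the 8 requested bytes
inside that page.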
Now, I have set the system to compress the dump on the swap partition
(DUMP_COMPRESS_PAGES=1); however, when I look at the generated vmdump.0 file, it
does not look like it has been RLE encoded: there are long strings of 0x00
repeating throughout the file. The vmdump.0 is ~69MB, while the real memory
size of the machine is 96MB (#pages*pagesize is also 96MB).
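The two checks above can be sketched roughly as follows. Both function names
are hypothetical (nothing here comes from the dump or lcrash sources), and the
zero-run check rests on my assumption that RLE-compressed data should not
contain long runs of 0x00. The size check is plain arithmetic: 12288 pages of
8192 bytes is 100663296 bytes, i.e. 96MB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size of an uncompressed dump of num_pages pages. */
uint64_t expected_dump_bytes(uint64_t num_pages, uint64_t page_size)
{
    return num_pages * page_size;
}

/* Hypothetical helper: length of the longest run of 0x00 bytes in buf.
 * A run approaching a full page would suggest the page data was not
 * run-length encoded (an assumption, see above). */
size_t longest_zero_run(const unsigned char *buf, size_t len)
{
    size_t best = 0, cur = 0, i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == 0x00) {
            cur++;
            if (cur > best)
                best = cur;
        } else {
            cur = 0;
        }
    }
    return best;
}
```

Reading vmdump.0 in chunks and feeding them to longest_zero_run() would put a
number on the "long strings of 0x00" impression, although a run that straddles
two chunks would be undercounted by this simple version.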
Looks to me like either the dump is getting truncated or the dump compression
routine has a problem (DUMP_LEVEL=4, swap partition is 512MB). I am guessing
that sp->s_addr is correct (this is the 0x584c81 being searched for in
cmpreadmem).
Is the vmdump.0 file supposed to be compressed on the disk if
DUMP_COMPRESS_PAGES=1 ?
What is the deal with the "kernel_magic" symbol? I can't find it in the symbol
map for i386 or Alpha. Trying to run lcrash against the running system now
gives:
====================================================
(gdb) run
Starting program: /CDR_UPLOAD/hallb/linux-2.2.13-1.0.3/cmd/lcrash/./lcrash
map = /boot/System.map, vmdump = /dev/mem, outfile = stdout
Please wait................
kernel_magic mismatch of map and memory image
Program exited with code 01.
====================================================
On 27-Apr-2000 Tom Morano wrote:
> Brian Hall wrote:
>>
>> I haven't changed anything in main(). After the command options are parsed
>> out, around main.c:198: (dies in register_cmds() )
>>
>> init_liballoc(0, 0, 0);
>> kl_init_kern_info();
>> register_cmds(cmdset);
>> arch_init(ofp);
>>
>> Are you saying that init_liballoc() needs different arguments now? I
>> followed the call sequence down for init_liballoc, and it appeared that
>> values other than zero were assigned along the way. Changing to
>> init_liballoc(100,100,100) had no effect (same traceback on the segfault).
>> Upping that to 1000 didn't help.
>
> The parameters to init_liballoc() are OK. Based on this, I would guess that
> some memory is getting stomped on in or below the kl_init_kern_info()
> function call. You might check the block of memory causing the SEGV after
> returning from the init_liballoc() call and before the kl_init_kern_info()
> call. See if it looks OK at that point (I would guess the contents of this
> memory are changed by the time you get to register_cmds()). If that's the
> case, then walk through the kl_init_kern_info() function and see where the
> memory contents change. From looking at the kl_init_kern_info() function, I
> can't see where the problem might occur (it basically just does symbol
> lookups and reads in the contents of memory into some local variables).
> Since the Alpha is 64 bit, I assume that the amount of memory being read in
> for these values is 8 bytes instead of 4 (and that the local variables,
> NUM_PHYSPAGES and MEM_MAP, have been changed also). Little things like that
> might be a factor. Anyway, that's how I would approach narrowing it down.
>
> Tom
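
To make the 4-byte versus 8-byte point concrete, here is a minimal sketch of
reading a kernel value whose width depends on the target. decode_target_word()
is a hypothetical helper, not something from lcrash, and the byte pattern used
with it below is a made-up example value. It assumes a little-endian target,
which holds for both i386 and Alpha Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from lcrash): decode a little-endian word of
 * `width` bytes (4 on i386, 8 on Alpha) into a 64-bit host value, so
 * the caller always holds the full value in a wide enough variable. */
uint64_t decode_target_word(const unsigned char *raw, size_t width)
{
    uint64_t value = 0;
    size_t i;

    assert(raw != NULL);
    assert(width == 4 || width == 8);
    for (i = 0; i < width; i++)
        value |= (uint64_t)raw[i] << (8 * i);
    return value;
}
```

Given the bytes 81 4c 58 00 00 fc ff ff, a width of 8 yields
0xfffffc0000584c81, while a width of 4 yields only 0x00584c81. A reader that
still assumes 4-byte values on Alpha would silently lose the upper half in the
same way.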
--
http://www.bigfoot.com/~brihall
Linux Consultant