On Mon, 17 Sep 2001, Schaal, Richard wrote:
|>Hi Matt,
|>
|>In doing my development and testing, now that the dump recovery seems to be
|>working, I find my disk is filling pretty rapidly because the same dump is
|>recovered more than once. I dump to a separate device and not to the swap
|>area. I wonder if the dump save step shouldn't set some sort of flag in the
|>dump header on the dump device that would say this is a stale dump, which
|>might need some --force flag in order to "save" it again.
This should always be done already. After a save, a special flag is
written over part of the dump header to prevent re-saves from taking
place (well, let me re-state that: re-saves can start, but they won't
get far; they stop as soon as they see that the dump header's magic
number has been overwritten).
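
The stale-dump check described above can be sketched roughly as follows.
This is only an illustration of the idea, not the actual dump code: the
structure layout, the magic values, and all of the example_* names are
hypothetical, and the force parameter simply models the --force flag
suggested in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values and layout, for illustration only;
 * these are not the real dump header definitions. */
#define EXAMPLE_DUMP_MAGIC 0x44554d50u /* ASCII "DUMP" */
#define EXAMPLE_DUMP_STALE 0u          /* written over the magic after a save */

struct example_dump_header {
    uint32_t magic;     /* header is valid only while this equals EXAMPLE_DUMP_MAGIC */
    uint32_t dump_size; /* placeholder for the rest of the header */
};

/* A save proceeds only if the magic number is still intact,
 * unless the caller explicitly forces it. */
static bool example_dump_should_save(const struct example_dump_header *hdr,
                                     bool force)
{
    assert(hdr != NULL);
    return force || hdr->magic == EXAMPLE_DUMP_MAGIC;
}

/* After a successful save, overwrite the magic so later saves stop early. */
static void example_dump_mark_saved(struct example_dump_header *hdr)
{
    assert(hdr != NULL);
    hdr->magic = EXAMPLE_DUMP_STALE;
}
```

With something like this, a second save attempt on the same device sees
the overwritten magic number and bails out before copying anything.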
|>I seem to be dumping ok on my SMP system when I have a relatively simple
|>"oops" to cause a dump for testing, but with increased activity and possible
|>multiple processor panics, the dump is still failing with the console
|>messages pretty well scrambled - apparently messages being intermixed from
|>multiple processors. Here's a sample...
I've never seen anything scrambled like this before. This is
absolutely bizarre. Are you re-directing your console output
or klogd/syslogd to this console? Are you doing anything
special in your kernel builds related to serial consoles?
Is the crash taking place in any console code?
This is very weird to me. Of course, I don't have an 8P to
test it on, but if you have a spare one available, I'm more
than willing to help. :) Seriously, let me know what you're
doing to crash the system so I can help provide more details.
I've seriously never seen a console this jumbled before.
It's like you've got something redirected in character/raw
mode to the console.
|>Red Hat Linux release 7.1 (Seawolf)
|>Kernel 2.4.8 on an 8-processor i686
|>
|>dopey login: (scsi0:A:1:0): Locking max tag count at 64
|>U<n1a<>1b<l1eU> nUna<t<>b11U>nalb<oa<>1e1U bln> l>UethneoaaUU tnb
|>tnhblaalbnoe
|>de l ltaeehtobaloaon n edkdlth e hhrataonod enad llneee lnl kkhhed a el
|>aereN
|>krknnkndeederlUelnlrn neerLNLe e lUnl k Lp ekeNLoleUiernL lnLpNrtNUUoneee
|>iLlNl
|>LL UL nr tpNLN pdpUeLoroLLi ni tpopniao
|>dneio tntreaLt te ir ndtrtev re e prirdvdo iredee
|>tefnedrriuarertteenfeelcerf ur
|>eaarel fre eefddrdeeeraerrernene aenftnecsccd cereseevdire nre at tca
|>uetvsaa0l
|>it sa0 vr ia00tdvrdt0ua0rut0i0 l0rvt00 ia0ea0rdlu0tdsua s0ard
|>l0 e
|> s0dal s0 pi p0r0:pp
|>0ri0ni0ent0:s0ts0i0ie
|>
|>s0nn0s g0 p 0g0e0i0e 0s00s0p e000:0000
|>0ir
|>00 i
|>pp0n0tr:000ii
|>n0n0
|>
|>c820
|>npg pr ieren i sipen pttrs:iii pnnr0tg0i0i n
|>pg:e0ini nd00*ep0 =00c
|>1001090040619e06010c8
|>0
|>0
|>041908810040g :i p00c
|>i910118:p1
|>144:81
|>*=dppe9c8990*1=eee0 0> *p =
|> ==0dp<*ed0p d10e0 e 0==0 =0>0O20c
|>00
|>000001101
|>104100dppd0*d0ep0e
|>ede = = =* =0000P : C 0*=0 *00p1p00d0e
|>0
|>0>*0=p0d 0d0e0e
|>:
|>
|>Oddly enough, if you take every third or fourth character, you can assemble
|>some of the common error messages. :-)
|>
|>I'll take a look at the panic and dump path to see if there's a window of
|>opportunity for the processors to wander about after a panic.
There is the possibility, but the printk()s shouldn't criss-cross
like this.
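
To test the every-third-or-fourth-character observation on a captured
console log, a small user-space helper along these lines might do. It is
a sketch only: example_take_every_nth is a made-up name, and it assumes
the processors emitted their characters in a strict round-robin order,
which the real output clearly doesn't always follow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy every stride-th character of src, starting at index start,
 * into dst. Returns the number of characters written, not counting
 * the terminating NUL. Output is truncated to fit dst_size. */
static size_t example_take_every_nth(const char *src, size_t start,
                                     size_t stride, char *dst,
                                     size_t dst_size)
{
    size_t len, out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return 0;
    if (stride == 0) { /* a zero stride would never advance */
        dst[0] = '\0';
        return 0;
    }

    len = strlen(src);
    for (size_t i = start; i < len && out + 1 < dst_size; i += stride)
        dst[out++] = src[i];
    dst[out] = '\0';
    return out;
}
```

Running it with a stride of 3 or 4 and a few different start offsets over
one scrambled line should show whether whole messages come back out or
only fragments.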
|>Regards,
|>Richard
Thanks, Richard. Let me know.
--Matt