Re: how to make kernel do system dump ?

To: "Matt D. Robinson" <yakker@xxxxxxxxxxxxxx>
Subject: Re: how to make kernel do system dump ?
From: Andi Kleen <ak@xxxxxxx>
Date: Mon, 6 Nov 2000 21:26:06 +0100
Cc: Andi Kleen <ak@xxxxxxx>, Hari Kannan <hkannan@xxxxxxxxxxx>, Tom Morano <tjm@xxxxxxx>, hiren_mehta@xxxxxxxxxxx, lkcd@xxxxxxxxxxx, mingo@xxxxxxx
In-reply-to: <3A06B0E3.A4481394@alacritech.com>; from yakker@alacritech.com on Mon, Nov 06, 2000 at 05:23:47AM -0800
References: <200011032303.PAA07244@tomb.fsc-usa.com> <3A02F7E4.5BEF806D@alacritech.com> <20001104094622.A10698@gruyere.muc.suse.de> <3A06A5D2.D08F2597@alacritech.com> <20001106204537.A26147@gruyere.muc.suse.de> <3A06B0E3.A4481394@alacritech.com>
Sender: owner-lkcd@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
[put Ingo Molnar as SMP maintainer into cc]

On Mon, Nov 06, 2000 at 05:23:47AM -0800, Matt D. Robinson wrote:
> Andi Kleen wrote:
> > 
> > On Mon, Nov 06, 2000 at 04:36:34AM -0800, Matt D. Robinson wrote:
> > >
> > > Cool.  We figured it was broken behavior -- we'd get messed up stack
> > > pages for some dumps where scheduling took place.  It was my
> > > original understanding that this wouldn't happen, but then, 2.2 has
> > > a number of broken issues.
> > 
> > You should probably only call the kernel dumper after the stop IPI sending
> > has finished, otherwise the other CPUs may still schedule in 2.4
> > 
> > -Andi
> 
> Calling smp_send_stop() isn't sufficient?  I thought that did a
> disable_local_APIC() for each CPU (except the one we're running on),
> then executes the hlt instruction for each of those CPUs.   There doesn't
> seem to be a routine to verify the apic_write_around() call has
> completed or not -- are you referring to something else?

That should be sufficient, yes.

But there seems to be a bug -- smp_send_stop() calls smp_call_function()
in async mode, which is likely wrong.

I think this patch is needed (against 2.4.0-test10); otherwise there is
no guarantee that the other CPUs are really stopped.

Ingo, what do you think?

--- arch/i386/kernel/smp.c-o    Fri Oct 20 18:46:49 2000
+++ arch/i386/kernel/smp.c      Mon Nov  6 21:27:38 2000
@@ -493,7 +493,7 @@
 
 void smp_send_stop(void)
 {
-       smp_call_function(stop_this_cpu, NULL, 1, 0);
+       smp_call_function(stop_this_cpu, NULL, 0, 1);
        smp_num_cpus = 1;
 
        __cli();



-Andi

