From jlan@sgi.com Mon Oct 27 18:52:57 2008
Received: with ECARTIS (v1.0.0; list kdb); Mon, 27 Oct 2008 18:53:08 -0700 (PDT)
Received: from kluge.engr.sgi.com (kluge.engr.sgi.com [150.166.39.81]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9S1qvfN031140 for ; Mon, 27 Oct 2008 18:52:57 -0700
Received: from [150.166.8.78] (aware.engr.sgi.com [150.166.8.78]) by kluge.engr.sgi.com (SGI-8.12.11.20060308/8.12.11) with ESMTP id m9S1qvb2106636; Mon, 27 Oct 2008 18:52:57 -0700 (PDT)
Message-ID: <49067079.9020608@sgi.com>
Date: Mon, 27 Oct 2008 18:52:57 -0700
From: Jay Lan
User-Agent: Thunderbird 2.0.0.6 (X11/20070801)
MIME-Version: 1.0
To: Bernhard Walle
CC: KDB
Subject: Re: [PATCH] Fix NULL pointer dereference when regs == NULL
References: <20081025211848.458e645e@kopernikus.site>
In-Reply-To: <20081025211848.458e645e@kopernikus.site>
Content-type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-archive-position: 1457
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: jlan@sgi.com
Precedence: bulk
X-list: kdb

Bernhard Walle wrote:
> Hi Jay,
>
> while testing the fix for my last problem, I found another issue. I
> think the attached patch is the right fix for it, can you please review
> the patch and add it to your patch series in next release?
>
>
> Regards,
> Bernhard
>

Thanks Bernhard,
Applied.
- jay

===================================================================================

This patch fixes the following problem: When panic() is called in user
context, for example by

    # modprobe crasher call_panic

then KDB crashed in kdba_getpc() once because regs was not checked for
being NULL:

    Entering kdb (current=0xffff880036c747c0, pid 4420) on processor 1 Oops:
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
    IP: [] kdba_getpc+0x0/0x8
    PGD 379f4067 PUD 39997067 PMD 0
    Oops: 0000 [1] SMP
    last sysfs file: /sys/devices/pci0000:00/0000:00:1c.5/0000:06:00.0/irq
    kdb: Debugger re-entered on cpu 1, new reason = 5
         Not executing a kdb command
         No longjmp available for recovery
         Cannot recover, allowing event to proceed

Even if that has been fixed, kdba_dumpregs() then crashed because the
return value of kdba_getpc() was assumed to be non-NULL.

This patch simply ports the error handling from its 32 bit counterpart
implementation. After applying that fix, the test mentioned above succeeds:

    Entering kdb (current=0xffff8800355fc480, pid 7564) on processor 1 Oops:
    due to oops @ 0x0
    kdba_dumpregs: pt_regs not available, use bt* or pid to select a different task
    [1]kdb>

Signed-off-by: Bernhard Walle
---
 arch/x86/kdb/kdbasupport_64.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/arch/x86/kdb/kdbasupport_64.c
+++ b/arch/x86/kdb/kdbasupport_64.c
@@ -500,6 +500,11 @@ kdba_dumpregs(struct pt_regs *regs,
 	struct kdbregs *rlp;
 	kdb_machreg_t contents;
 
+	if (!regs) {
+		kdb_printf("%s: pt_regs not available, use bt* or pid to select a different task\n", __FUNCTION__);
+		return KDB_BADREG;
+	}
+
 	for (i=0, rlp=kdbreglist; i
---
 arch/x86/kdb/kdbasupport_32.c | 36 ++++++++++++++++++++++++++++++++----
 arch/x86/kdb/kdbasupport_64.c | 37 +++++++++++++++++++++++++++++++++----
 include/asm-x86/irq_vectors.h | 11 ++++++-----
 3 files changed, 71 insertions(+), 13 deletions(-)

Index: 081002.linus/arch/x86/kdb/kdbasupport_32.c
=================================================================== --- 081002.linus.orig/arch/x86/kdb/kdbasupport_32.c +++ 081002.linus/arch/x86/kdb/kdbasupport_32.c @@ -883,9 +883,6 @@ kdba_cpu_up(void) static int __init kdba_arch_init(void) { -#ifdef CONFIG_SMP - set_intr_gate(KDB_VECTOR, kdb_interrupt); -#endif set_intr_gate(KDBENTER_VECTOR, kdb_call); return 0; } @@ -1027,14 +1024,45 @@ kdba_verify_rw(unsigned long addr, size_ #include +gate_desc save_idt[NR_VECTORS]; + +void kdb_takeover_vector(int vector) +{ + memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc)); + set_intr_gate(KDB_VECTOR, kdb_interrupt); + return; +} + +void kdb_giveback_vector(int vector) +{ + native_write_idt_entry(idt_table, vector, &save_idt[vector]); + return; +} + /* When first entering KDB, try a normal IPI. That reduces backtrace problems * on the other cpus. */ void smp_kdb_stop(void) { - if (!KDB_FLAG(NOIPI)) + int count; + int cpu; + + if (!KDB_FLAG(NOIPI)) { + kdb_takeover_vector(KDB_VECTOR); send_IPI_allbutself(KDB_VECTOR); + do { + count = 0; + for_each_possible_cpu(cpu) { + if (cpu_isset(cpu, cpu_online_map)) { + if (!KDB_STATE_CPU(KDB,cpu)) + /* count any cpus NOT in kdb */ + count++; + } + } + } while (count != 0); + kdb_giveback_vector(KDB_VECTOR); + } } /* The normal KDB IPI handler */ Index: 081002.linus/arch/x86/kdb/kdbasupport_64.c =================================================================== --- 081002.linus.orig/arch/x86/kdb/kdbasupport_64.c +++ 081002.linus/arch/x86/kdb/kdbasupport_64.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include #include @@ -900,9 +901,6 @@ kdba_cpu_up(void) static int __init kdba_arch_init(void) { -#ifdef CONFIG_SMP - set_intr_gate(KDB_VECTOR, kdb_interrupt); -#endif set_intr_gate(KDBENTER_VECTOR, kdb_call); return 0; } @@ -976,14 +974,45 @@ kdba_set_current_task(const struct task_ #include +gate_desc save_idt[NR_VECTORS]; + +void kdb_takeover_vector(int vector) +{ + memcpy(&save_idt[vector], 
&idt_table[vector], sizeof(gate_desc)); + set_intr_gate(KDB_VECTOR, kdb_interrupt); + return; +} + +void kdb_giveback_vector(int vector) +{ + native_write_idt_entry(idt_table, vector, &save_idt[vector]); + return; +} + /* When first entering KDB, try a normal IPI. That reduces backtrace problems * on the other cpus. */ void smp_kdb_stop(void) { - if (!KDB_FLAG(NOIPI)) + int count; + int cpu; + + if (!KDB_FLAG(NOIPI)) { + kdb_takeover_vector(KDB_VECTOR); send_IPI_allbutself(KDB_VECTOR); + do { + count = 0; + for_each_possible_cpu(cpu) { + if (cpu_isset(cpu, cpu_online_map)) { + if (!KDB_STATE_CPU(KDB,cpu)) + /* count any cpus NOT in kdb */ + count++; + } + } + } while (count != 0); + kdb_giveback_vector(KDB_VECTOR); + } } /* The normal KDB IPI handler */ Index: 081002.linus/include/asm-x86/irq_vectors.h =================================================================== --- 081002.linus.orig/include/asm-x86/irq_vectors.h +++ 081002.linus/include/asm-x86/irq_vectors.h @@ -66,7 +66,6 @@ # define RESCHEDULE_VECTOR 0xfc # define CALL_FUNCTION_VECTOR 0xfb # define CALL_FUNCTION_SINGLE_VECTOR 0xfa -#define KDB_VECTOR 0xf9 # define THERMAL_APIC_VECTOR 0xf0 #else @@ -79,10 +78,6 @@ #define THERMAL_APIC_VECTOR 0xfa #define THRESHOLD_APIC_VECTOR 0xf9 #define UV_BAU_MESSAGE 0xf8 -/* Overload KDB_VECTOR with UV_BAU_MESSAGE. By the time the UV hardware is - * ready, we should have moved to a dynamically allocated vector scheme. - */ -#define KDB_VECTOR 0xf8 #define INVALIDATE_TLB_VECTOR_END 0xf7 #define INVALIDATE_TLB_VECTOR_START 0xf0 /* f0-f7 used for TLB flush */ @@ -91,6 +86,12 @@ #endif /* + * KDB_VECTOR will take over vector 0xfe when it is needed, as in theory + * it should not be used anyway. + */ +#define KDB_VECTOR 0xfe + +/* * Local APIC timer IRQ vector is on a different priority level, * to work around the 'lost local interrupt if more than 2 IRQ * sources per level' errata. 
---------------------------
Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe.

From jlan@sgi.com Wed Oct 29 12:33:01 2008
Received: with ECARTIS (v1.0.0; list kdb); Wed, 29 Oct 2008 12:33:09 -0700 (PDT)
Received: from kluge.engr.sgi.com (kluge.engr.sgi.com [150.166.39.81]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9TJWxep017643 for ; Wed, 29 Oct 2008 12:33:01 -0700
Received: from [150.166.8.78] (aware.engr.sgi.com [150.166.8.78]) by kluge.engr.sgi.com (SGI-8.12.11.20060308/8.12.11) with ESMTP id m9TJX0Ck120666; Wed, 29 Oct 2008 12:33:00 -0700 (PDT)
Message-ID: <4908BA6B.6030107@sgi.com>
Date: Wed, 29 Oct 2008 12:32:59 -0700
From: Jay Lan
User-Agent: Thunderbird 2.0.0.6 (X11/20070801)
MIME-Version: 1.0
To: KDB
Subject: mails to kdb@oss.sgi.com got dropped
Content-type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-archive-position: 1460
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: jlan@sgi.com
Precedence: bulk
X-list: kdb

Hi,

I just noticed that mails to the kdb mailing list have been dropped on
the floor. A few discussions on KDB_VECTOR got lost in the past few days.

If you have posted to the kdb mailing list this month and have not seen
your mail posted by the mail server, can you send it to me at
jlan@sgi.com? I will repost everything once we fix the mailing list problem.

Thanks!

Regards,
jay
---------------------------
Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe.
From jlan@sgi.com Wed Oct 29 13:40:30 2008 Received: with ECARTIS (v1.0.0; list kdb); Wed, 29 Oct 2008 13:40:39 -0700 (PDT) Received: from kluge.engr.sgi.com (kluge.engr.sgi.com [150.166.39.81]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9TKeUIU021420 for ; Wed, 29 Oct 2008 13:40:30 -0700 Received: from [150.166.8.78] (aware.engr.sgi.com [150.166.8.78]) by kluge.engr.sgi.com (SGI-8.12.11.20060308/8.12.11) with ESMTP id m9TKeU0i121713; Wed, 29 Oct 2008 13:40:30 -0700 (PDT) Message-ID: <4908CA3E.1060001@sgi.com> Date: Wed, 29 Oct 2008 13:40:30 -0700 From: Jay Lan User-Agent: Thunderbird 2.0.0.6 (X11/20070801) MIME-Version: 1.0 To: Cliff Wickman CC: kaos@ocs.com.au, Dimitri Sivanich , kdb@oss.sgi.com Subject: Re: [PATCH] KDB: commandeer vector 0xfe for KDB_VECTOR]] References: <20081028190806.GB8424@sgi.com> <22339.1225256245@ocs10w> <20081029191008.GA30689@sgi.com> In-Reply-To: <20081029191008.GA30689@sgi.com> Content-type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-archive-position: 1461 X-ecartis-version: Ecartis v1.0.0 Sender: kdb-bounce@oss.sgi.com Errors-to: kdb-bounce@oss.sgi.com X-original-sender: jlan@sgi.com Precedence: bulk X-list: kdb Cliff Wickman wrote: Hi Cliff, > Hi Keith, > > On Wed, Oct 29, 2008 at 03:57:25PM +1100, Keith Owens wrote: >> However there is a separate problem with your patch. You now wait in >> smp_kdb_stop() until all cpus are in KDB. If any cpu is completely >> hung so it cannot be interrupted then smp_kdb_stop() will never return >> and KDB will now appear to hang. >> >> The existing code avoids this by >> >> kdb() -> smp_kdb_stop() - issue KDB_VECTOR as normal interrupt but do not wait for cpus >> kdb() -> kdba_main_loop() >> kdba_main_loop() -> kdb_save_running() >> kdb_save_running() -> kdb_main_loop() >> kdb_main_loop() -> kdb_wait_for_cpus() >> >> kdb_wait_for_cpus() waits until the other cpus are in KDB. 
If a cpu
>> does not respond to KDB_VECTOR after a few seconds then
>> kdb_wait_for_cpus() hits the missing cpus with NMI.
>>
>> This two step approach (send KDB_VECTOR as normal interrupt, wait then
>> send NMI) is used because NMI can be serviced at any time, even when
>> the target cpu is in the middle of servicing an interrupt. This can
>> result in incomplete register state which leads to broken backtraces.
>> IOW, sending NMI first would actually make debugging harder.
>>
>> Given the above logic, if you are going to take over an existing
>> interrupt vector then the vector needs to be acquired near the start of
>> kdb() and released near the end of kdb(), and only on the master cpu.
>>
>> Note: there is no overwhelming need for KDB_VECTOR to have a high
>> priority. As long as it is received within a few seconds then all is
>> well.
>
> Thanks for the explanation. I see your point.

I will let Keith comment on the logic of your code, but this patch
will cause IA64 compilation to fail because kdb_giveback_vector() is
not defined for IA64.

Suggestions:
1) Change kdb_takeover_vector and kdb_giveback_vector to arch-dependent
   versions named kdba_takeover_vector and kdba_giveback_vector.
2) Move the extern of kdba_giveback_vector to the arch-dependent kdb.h
   (i.e., arch/{ia64,x86}/include/asm/kdb.h) and make the ia64 version
   a dummy define.
3) Change kdbmain.c accordingly.

Thanks,
jay

>
> How about if we keep the two step approach, but take over the vector
> when we need it, in step one. Then give it back when the step two
> wait is over.
> (assuming we don't take over a vector needed for the NMI) > > Like this: > > --- > arch/x86/kdb/kdbasupport_32.c | 22 ++++++++++++++++++---- > arch/x86/kdb/kdbasupport_64.c | 23 +++++++++++++++++++---- > include/asm-x86/irq_vectors.h | 11 ++++++----- > include/linux/kdb.h | 1 + > kdb/kdbmain.c | 2 ++ > 5 files changed, 46 insertions(+), 13 deletions(-) > > Index: 081002.linus/arch/x86/kdb/kdbasupport_32.c > =================================================================== > --- 081002.linus.orig/arch/x86/kdb/kdbasupport_32.c > +++ 081002.linus/arch/x86/kdb/kdbasupport_32.c > @@ -883,9 +883,6 @@ kdba_cpu_up(void) > static int __init > kdba_arch_init(void) > { > -#ifdef CONFIG_SMP > - set_intr_gate(KDB_VECTOR, kdb_interrupt); > -#endif > set_intr_gate(KDBENTER_VECTOR, kdb_call); > return 0; > } > @@ -1027,14 +1024,31 @@ kdba_verify_rw(unsigned long addr, size_ > > #include > > +gate_desc save_idt[NR_VECTORS]; > + > +void kdb_takeover_vector(int vector) > +{ > + memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc)); > + set_intr_gate(KDB_VECTOR, kdb_interrupt); > + return; > +} > + > +void kdb_giveback_vector(int vector) > +{ > + native_write_idt_entry(idt_table, vector, &save_idt[vector]); > + return; > +} > + > /* When first entering KDB, try a normal IPI. That reduces backtrace problems > * on the other cpus. 
> */ > void > smp_kdb_stop(void) > { > - if (!KDB_FLAG(NOIPI)) > + if (!KDB_FLAG(NOIPI)) { > + kdb_takeover_vector(KDB_VECTOR); > send_IPI_allbutself(KDB_VECTOR); > + } > } > > /* The normal KDB IPI handler */ > Index: 081002.linus/arch/x86/kdb/kdbasupport_64.c > =================================================================== > --- 081002.linus.orig/arch/x86/kdb/kdbasupport_64.c > +++ 081002.linus/arch/x86/kdb/kdbasupport_64.c > @@ -21,6 +21,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -900,9 +901,6 @@ kdba_cpu_up(void) > static int __init > kdba_arch_init(void) > { > -#ifdef CONFIG_SMP > - set_intr_gate(KDB_VECTOR, kdb_interrupt); > -#endif > set_intr_gate(KDBENTER_VECTOR, kdb_call); > return 0; > } > @@ -976,14 +974,31 @@ kdba_set_current_task(const struct task_ > > #include > > +gate_desc save_idt[NR_VECTORS]; > + > +void kdb_takeover_vector(int vector) > +{ > + memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc)); > + set_intr_gate(KDB_VECTOR, kdb_interrupt); > + return; > +} > + > +void kdb_giveback_vector(int vector) > +{ > + native_write_idt_entry(idt_table, vector, &save_idt[vector]); > + return; > +} > + > /* When first entering KDB, try a normal IPI. That reduces backtrace problems > * on the other cpus. 
> */ > void > smp_kdb_stop(void) > { > - if (!KDB_FLAG(NOIPI)) > + if (!KDB_FLAG(NOIPI)) { > + kdb_takeover_vector(KDB_VECTOR); > send_IPI_allbutself(KDB_VECTOR); > + } > } > > /* The normal KDB IPI handler */ > Index: 081002.linus/include/asm-x86/irq_vectors.h > =================================================================== > --- 081002.linus.orig/include/asm-x86/irq_vectors.h > +++ 081002.linus/include/asm-x86/irq_vectors.h > @@ -66,7 +66,6 @@ > # define RESCHEDULE_VECTOR 0xfc > # define CALL_FUNCTION_VECTOR 0xfb > # define CALL_FUNCTION_SINGLE_VECTOR 0xfa > -#define KDB_VECTOR 0xf9 > # define THERMAL_APIC_VECTOR 0xf0 > > #else > @@ -79,10 +78,6 @@ > #define THERMAL_APIC_VECTOR 0xfa > #define THRESHOLD_APIC_VECTOR 0xf9 > #define UV_BAU_MESSAGE 0xf8 > -/* Overload KDB_VECTOR with UV_BAU_MESSAGE. By the time the UV hardware is > - * ready, we should have moved to a dynamically allocated vector scheme. > - */ > -#define KDB_VECTOR 0xf8 > #define INVALIDATE_TLB_VECTOR_END 0xf7 > #define INVALIDATE_TLB_VECTOR_START 0xf0 /* f0-f7 used for TLB flush */ > > @@ -91,6 +86,12 @@ > #endif > > /* > + * KDB_VECTOR will take over vector 0xfe when it is needed, as in theory > + * it should not be used anyway. > + */ > +#define KDB_VECTOR 0xfe > + > +/* > * Local APIC timer IRQ vector is on a different priority level, > * to work around the 'lost local interrupt if more than 2 IRQ > * sources per level' errata. 
> Index: 081002.linus/include/linux/kdb.h > =================================================================== > --- 081002.linus.orig/include/linux/kdb.h > +++ 081002.linus/include/linux/kdb.h > @@ -89,6 +89,7 @@ extern volatile int kdb_flags; /* Glob > > extern void kdb_save_flags(void); > extern void kdb_restore_flags(void); > +extern void kdb_giveback_vector(int); > > #define KDB_FLAG(flag) (kdb_flags & KDB_FLAG_##flag) > #define KDB_FLAG_SET(flag) ((void)(kdb_flags |= KDB_FLAG_##flag)) > Index: 081002.linus/kdb/kdbmain.c > =================================================================== > --- 081002.linus.orig/kdb/kdbmain.c > +++ 081002.linus/kdb/kdbmain.c > @@ -1673,6 +1673,8 @@ kdb_wait_for_cpus(void) > wait == 1 ? " is" : "s are", > wait == 1 ? "its" : "their"); > } > + /* give back the vector we took over in smp_kdb_stop */ > + kdb_giveback_vector(KDB_VECTOR); > #endif /* CONFIG_SMP */ > } > --------------------------- Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe. 
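Jay's suggestions 1) to 3) in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: SIM_ARCH_IA64 is a hypothetical switch standing in for the choice between the ia64 and x86 asm/kdb.h, the body of kdba_giveback_vector() is a stub that merely records the call, and sim_wait_for_cpus_tail() is a made-up stand-in for the arch-independent caller in kdbmain.c.

```c
#include <assert.h>

/*
 * SIM_ARCH_IA64 is a hypothetical switch that models which per-arch
 * asm/kdb.h gets included. It is not a real kernel config symbol.
 */
#ifdef SIM_ARCH_IA64

/* ia64 side: nothing to hand back, so the call is a dummy define. */
#define kdba_giveback_vector(vector) do { (void)(vector); } while (0)

#else

/* x86 side: a stub that only records that it was called (assumption). */
static int sim_giveback_calls;
static int sim_last_vector = -1;

static void kdba_giveback_vector(int vector)
{
	assert(vector >= 0 && vector < 256);
	sim_giveback_calls++;
	sim_last_vector = vector;
}

#endif

/*
 * Stand-in for the arch-independent caller: it can call
 * kdba_giveback_vector() unconditionally and still build on both
 * architectures, because each arch header supplies its own version.
 */
static void sim_wait_for_cpus_tail(void)
{
	kdba_giveback_vector(0xfe);
}
```

With the default (x86) branch, one call to sim_wait_for_cpus_tail() bumps the call counter once; with SIM_ARCH_IA64 defined, the same caller compiles to nothing, which is the point of the dummy define.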
From kaos@sgi.com Mon Oct 27 19:47:43 2008 Received: with ECARTIS (v1.0.0; list kdb); Wed, 29 Oct 2008 14:03:09 -0700 (PDT) Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9S2lhWD002068 for ; Mon, 27 Oct 2008 19:47:43 -0700 X-ASG-Debug-ID: 1225162059-723c00450000-sLlkUa X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.ocs.com.au (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B3EE91495563 for ; Mon, 27 Oct 2008 19:47:39 -0700 (PDT) Received: from mail.ocs.com.au (mail.ocs.com.au [202.134.241.204]) by cuda.sgi.com with ESMTP id ojHyvriOe0AXO15O for ; Mon, 27 Oct 2008 19:47:39 -0700 (PDT) Received: from ocs10w.ocs.com.au (unknown [10.8.0.6]) by mail.ocs.com.au (Postfix) with ESMTP id 71B2DE01681; Tue, 28 Oct 2008 13:47:36 +1100 (EST) Received: by ocs10w.ocs.com.au (Postfix, from userid 16331) id 01A07FB8F9; Tue, 28 Oct 2008 13:47:31 +1100 (EST) Received: from ocs10w (localhost [127.0.0.1]) by ocs10w.ocs.com.au (Postfix) with ESMTP id EE1AEFB8F8; Tue, 28 Oct 2008 13:47:31 +1100 (EST) X-Mailer: exmh version 2.7.2 01/07/2005 (debian 1:2.7.2-12) with nmh-1.2 From: Keith Owens To: Jay Lan cc: KDB X-ASG-Orig-Subj: Re: [Fwd: [PATCH] KDB: commandeer vector 0xfe for KDB_VECTOR] Subject: Re: [Fwd: [PATCH] KDB: commandeer vector 0xfe for KDB_VECTOR] In-reply-to: Your message of "Mon, 27 Oct 2008 19:00:28 PDT." 
<4906723C.8090601@sgi.com> Mime-Version: 1.0 Content-type: text/plain; charset=us-ascii Date: Tue, 28 Oct 2008 13:47:31 +1100 Message-ID: <28641.1225162051@ocs10w> X-Barracuda-Connect: mail.ocs.com.au[202.134.241.204] X-Barracuda-Start-Time: 1225162062 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0008 1.0000 -2.0161 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.8874 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- Content-Transfer-Encoding: 8bit X-archive-position: 1462 X-Approved-By: jlan@sgi.com X-ecartis-version: Ecartis v1.0.0 Sender: kdb-bounce@oss.sgi.com Errors-to: kdb-bounce@oss.sgi.com X-original-sender: kaos@sgi.com Precedence: bulk X-list: kdb Jay Lan (on Mon, 27 Oct 2008 19:00:28 -0700) wrote: >From: Cliff Wickman > >Hi Jay, > >Remember the conflict over interrupt vector 0xf8 between KDB and >UV TLB shootdown? > >There didn't seem to be any clean way to resolve that. The dynamic >interrupt vector scheme that's going into the kernel does not allow >assigning the same vector to all cpu's, or specifying a vector priority. >And KDB needs to be higher priority than the timer interrupt, and UV TLB >shootdown should be too. > >But since KDB only needs KDB_VECTOR to ping other cpu's it seems that >it can use one of the system vectors temporarily (not the UV TLB shootdown's >however, as it might be in-progress during the entry to KDB). Dimitri >suggested this approach. > >So I tried commandeering 0xfe (ERROR_APIC_VECTOR). >According to arch/x86/kernel/apic_64.c: > "This interrupt should never happen with our APIC/SMP architecture" > >When a cpu enters kdb, this patch causes it to commandeer that vector, >send the IPI's, and wait till all other cpu's enter kdb. 
Then restore >the IDT to its previous state. > >It seems to work. I tested on x86_64 and ia32. >What do you think? > >Diffed against 2.6.27-rc8 >Signed-off-by: Cliff Wickman >--- > arch/x86/kdb/kdbasupport_32.c | 36 ++++++++++++++++++++++++++++++++---- > arch/x86/kdb/kdbasupport_64.c | 37 +++++++++++++++++++++++++++++++++---- > include/asm-x86/irq_vectors.h | 11 ++++++----- > 3 files changed, 71 insertions(+), 13 deletions(-) > >Index: 081002.linus/arch/x86/kdb/kdbasupport_32.c >=================================================================== >--- 081002.linus.orig/arch/x86/kdb/kdbasupport_32.c >+++ 081002.linus/arch/x86/kdb/kdbasupport_32.c >@@ -883,9 +883,6 @@ kdba_cpu_up(void) > static int __init > kdba_arch_init(void) > { >-#ifdef CONFIG_SMP >- set_intr_gate(KDB_VECTOR, kdb_interrupt); >-#endif > set_intr_gate(KDBENTER_VECTOR, kdb_call); > return 0; > } >@@ -1027,14 +1024,45 @@ kdba_verify_rw(unsigned long addr, size_ > > #include > >+gate_desc save_idt[NR_VECTORS]; >+ >+void kdb_takeover_vector(int vector) >+{ >+ memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc)); >+ set_intr_gate(KDB_VECTOR, kdb_interrupt); >+ return; >+} >+ >+void kdb_giveback_vector(int vector) >+{ >+ native_write_idt_entry(idt_table, vector, &save_idt[vector]); >+ return; >+} >+ > /* When first entering KDB, try a normal IPI. That reduces backtrace problems > * on the other cpus. > */ > void > smp_kdb_stop(void) > { >- if (!KDB_FLAG(NOIPI)) >+ int count; >+ int cpu; >+ >+ if (!KDB_FLAG(NOIPI)) { >+ kdb_takeover_vector(KDB_VECTOR); > send_IPI_allbutself(KDB_VECTOR); >+ do { >+ count = 0; >+ for_each_possible_cpu(cpu) { >+ if (cpu_isset(cpu, cpu_online_map)) { >+ if (!KDB_STATE_CPU(KDB,cpu)) >+ /* count any cpus NOT in kdb */ >+ count++; >+ } >+ } >+ } while (count != 0); >+ kdb_giveback_vector(KDB_VECTOR); >+ } > } > > /* The normal KDB IPI handler */ I don't see how this can work. 
kdba_arch_init() currently maps KDB_VECTOR to kdb_interrupt() on each cpu as that cpu is brought up. IOW, KDB_VECTOR ends up being defined on ALL cpus before any KDB interrupt is sent. Your kdb_takeover_vector() only maps KDB_VECTOR to kdb_interrupt() on the CURRENT cpu, then sends KDB_VECTOR to all the other cpus. How do the other cpus know what to do with KDB_VECTOR when they receive it? They will have no definition for KDB_VECTOR so they will receive an unexpected interrupt. --------------------------- Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe. From kaos@sgi.com Tue Oct 28 21:57:53 2008 Received: with ECARTIS (v1.0.0; list kdb); Wed, 29 Oct 2008 14:03:44 -0700 (PDT) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9T4vq1O005715 for ; Tue, 28 Oct 2008 21:57:52 -0700 X-ASG-Debug-ID: 1225256270-716a01180000-sLlkUa X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.ocs.com.au (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 14247556CC2 for ; Tue, 28 Oct 2008 21:57:50 -0700 (PDT) Received: from mail.ocs.com.au (mail.ocs.com.au [202.134.241.204]) by cuda.sgi.com with ESMTP id TdToX1CmklIrxzx7 for ; Tue, 28 Oct 2008 21:57:50 -0700 (PDT) Received: from ocs10w.ocs.com.au (unknown [10.8.0.6]) by mail.ocs.com.au (Postfix) with ESMTP id 0A8F7E08BD9; Wed, 29 Oct 2008 15:57:49 +1100 (EST) Received: by ocs10w.ocs.com.au (Postfix, from userid 16331) id 3EF75FB8F9; Wed, 29 Oct 2008 15:57:26 +1100 (EST) Received: from ocs10w (localhost [127.0.0.1]) by ocs10w.ocs.com.au (Postfix) with ESMTP id 2365DFB8F8; Wed, 29 Oct 2008 15:57:26 +1100 (EST) X-Mailer: exmh version 2.7.2 01/07/2005 (debian 1:2.7.2-12) with nmh-1.2 From: Keith Owens To: Cliff Wickman cc: Jay Lan , Dimitri Sivanich , kdb@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] KDB: commandeer vector 0xfe for KDB_VECTOR]] Subject: Re: [PATCH] KDB: commandeer vector 0xfe for 
KDB_VECTOR]] In-reply-to: Your message of "Tue, 28 Oct 2008 14:08:06 CDT." <20081028190806.GB8424@sgi.com> Mime-Version: 1.0 Content-type: text/plain; charset=us-ascii Date: Wed, 29 Oct 2008 15:57:25 +1100 Message-ID: <22339.1225256245@ocs10w> X-Barracuda-Connect: mail.ocs.com.au[202.134.241.204] X-Barracuda-Start-Time: 1225256273 X-Barracuda-Bayes: INNOCENT GLOBAL 0.1455 1.0000 -1.1269 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.13 X-Barracuda-Spam-Status: No, SCORE=-1.13 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.8951 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- Content-Transfer-Encoding: 8bit X-archive-position: 1463 X-Approved-By: jlan@sgi.com X-ecartis-version: Ecartis v1.0.0 Sender: kdb-bounce@oss.sgi.com Errors-to: kdb-bounce@oss.sgi.com X-original-sender: kaos@sgi.com Precedence: bulk X-list: kdb Cliff Wickman (on Tue, 28 Oct 2008 14:08:06 -0500) wrote: >Hi Jay, Keith, Dmitri, > >> Keith wrote: >> I don't see how this can work. kdba_arch_init() currently maps >> KDB_VECTOR to kdb_interrupt() on each cpu as that cpu is brought up. >> IOW, KDB_VECTOR ends up being defined on ALL cpus before any KDB >> interrupt is sent. >> >> Your kdb_takeover_vector() only maps KDB_VECTOR to kdb_interrupt() on >> the CURRENT cpu, then sends KDB_VECTOR to all the other cpus. How do >> the other cpus know what to do with KDB_VECTOR when they receive it? >> They will have no definition for KDB_VECTOR so they will receive an >> unexpected interrupt. > >When a vector arrives at any cpu it's going to index into the idt_table >for a handler. >There's only one idt_table, shared by all cpu's. >Am I missing something fundamental? My mistake, I was thinking that setting an interrupt gate actually modified a per-cpu register - wrong. 
However there is a separate problem with your patch. You now wait in
smp_kdb_stop() until all cpus are in KDB. If any cpu is completely
hung so it cannot be interrupted then smp_kdb_stop() will never return
and KDB will now appear to hang.

The existing code avoids this by

  kdb() -> smp_kdb_stop() - issue KDB_VECTOR as normal interrupt but do not wait for cpus
  kdb() -> kdba_main_loop()
  kdba_main_loop() -> kdb_save_running()
  kdb_save_running() -> kdb_main_loop()
  kdb_main_loop() -> kdb_wait_for_cpus()

kdb_wait_for_cpus() waits until the other cpus are in KDB. If a cpu
does not respond to KDB_VECTOR after a few seconds then
kdb_wait_for_cpus() hits the missing cpus with NMI.

This two step approach (send KDB_VECTOR as normal interrupt, wait then
send NMI) is used because NMI can be serviced at any time, even when
the target cpu is in the middle of servicing an interrupt. This can
result in incomplete register state which leads to broken backtraces.
IOW, sending NMI first would actually make debugging harder.

Given the above logic, if you are going to take over an existing
interrupt vector then the vector needs to be acquired near the start of
kdb() and released near the end of kdb(), and only on the master cpu.

Note: there is no overwhelming need for KDB_VECTOR to have a high
priority. As long as it is received within a few seconds then all is
well.
---------------------------
Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe.
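A minimal userspace sketch of the two step approach Keith describes above: normal interrupt first, a bounded wait, then NMI only for the cpus that did not respond. Everything prefixed sim_ is hypothetical and models only the control flow; none of it is KDB code. The poll count stands in for the "few seconds" of waiting, and a cpu with masks_interrupts set plays the role of a cpu that is hung with interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu state for the simulation; not a KDB structure. */
struct sim_cpu {
	bool online;
	bool masks_interrupts; /* hung cpu that ignores normal interrupts */
	bool in_debugger;      /* cpu has entered the debugger */
	bool stopped_by_nmi;   /* cpu only entered after an NMI */
};

/* Step one: send the normal interrupt and return without waiting. */
static void sim_send_ipi_allbutself(struct sim_cpu *cpus, size_t n, size_t self)
{
	assert(self < n);
	for (size_t i = 0; i < n; i++) {
		if (i == self || !cpus[i].online)
			continue;
		if (!cpus[i].masks_interrupts)
			cpus[i].in_debugger = true;
	}
}

/* Count online cpus, other than self, that are not yet in the debugger. */
static size_t sim_count_missing(const struct sim_cpu *cpus, size_t n, size_t self)
{
	size_t missing = 0;

	for (size_t i = 0; i < n; i++) {
		if (i != self && cpus[i].online && !cpus[i].in_debugger)
			missing++;
	}
	return missing;
}

/*
 * Step two: poll a bounded number of times, then NMI the stragglers.
 * Returns how many cpus needed the NMI. In this simulation nothing
 * changes between polls; on real hardware the other cpus would enter
 * the debugger asynchronously while the master waits.
 */
static size_t sim_wait_for_cpus(struct sim_cpu *cpus, size_t n, size_t self,
				unsigned int max_polls)
{
	size_t nmi_sent = 0;

	assert(self < n);
	for (unsigned int poll = 0; poll < max_polls; poll++) {
		if (sim_count_missing(cpus, n, self) == 0)
			return 0;
	}
	for (size_t i = 0; i < n; i++) {
		if (i == self || !cpus[i].online || cpus[i].in_debugger)
			continue;
		cpus[i].in_debugger = true;
		cpus[i].stopped_by_nmi = true;
		nmi_sent++;
	}
	return nmi_sent;
}
```

The key property is that the wait is bounded: a cpu that never answers the normal interrupt cannot make the master spin forever, which is the hang Keith points out in the first version of the patch.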
From cpw@sgi.com Wed Oct 29 12:09:02 2008 Received: with ECARTIS (v1.0.0; list kdb); Wed, 29 Oct 2008 14:04:09 -0700 (PDT) Received: from relay.sgi.com (relay1.corp.sgi.com [192.26.58.214]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9TJ92Fh010148 for ; Wed, 29 Oct 2008 12:09:02 -0700 Received: from estes.americas.sgi.com (estes.americas.sgi.com [128.162.236.10]) by relay1.corp.sgi.com (Postfix) with ESMTP id 4C82C8F8073 for ; Wed, 29 Oct 2008 12:09:00 -0700 (PDT) Received: from eag09.americas.sgi.com (eag09.americas.sgi.com [128.162.232.15]) by estes.americas.sgi.com (Postfix) with ESMTP id 0A0297000103; Wed, 29 Oct 2008 14:09:00 -0500 (CDT) Received: from cpw by eag09.americas.sgi.com with local (Exim 4.69) (envelope-from ) id 1KvGQi-00080y-Q6; Wed, 29 Oct 2008 14:10:08 -0500 Date: Wed, 29 Oct 2008 14:10:08 -0500 From: Cliff Wickman To: kaos@ocs.com.au Cc: Jay Lan , Dimitri Sivanich , kdb@oss.sgi.com Subject: Re: [PATCH] KDB: commandeer vector 0xfe for KDB_VECTOR]] Message-ID: <20081029191008.GA30689@sgi.com> References: <20081028190806.GB8424@sgi.com> <22339.1225256245@ocs10w> MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <22339.1225256245@ocs10w> User-Agent: Mutt/1.5.17+20080114 (2008-01-14) Content-Transfer-Encoding: 8bit X-archive-position: 1464 X-Approved-By: jlan@sgi.com X-ecartis-version: Ecartis v1.0.0 Sender: kdb-bounce@oss.sgi.com Errors-to: kdb-bounce@oss.sgi.com X-original-sender: cpw@sgi.com Precedence: bulk X-list: kdb Hi Keith, On Wed, Oct 29, 2008 at 03:57:25PM +1100, Keith Owens wrote: > However there is a separate problem with your patch. You now wait in > smp_kdb_stop() until all cpus are in KDB. If any cpu is completely > hung so it cannot be interrupted then smp_kdb_stop() will never return > and KDB will now appear to hang. 
>
> The existing code avoids this by
>
>   kdb() -> smp_kdb_stop() - issue KDB_VECTOR as normal interrupt but do not wait for cpus
>   kdb() -> kdba_main_loop()
>   kdba_main_loop() -> kdb_save_running()
>   kdb_save_running() -> kdb_main_loop()
>   kdb_main_loop() -> kdb_wait_for_cpus()
>
> kdb_wait_for_cpus() waits until the other cpus are in KDB. If a cpu
> does not respond to KDB_VECTOR after a few seconds then
> kdb_wait_for_cpus() hits the missing cpus with NMI.
>
> This two step approach (send KDB_VECTOR as normal interrupt, wait then
> send NMI) is used because NMI can be serviced at any time, even when
> the target cpu is in the middle of servicing an interrupt. This can
> result in incomplete register state which leads to broken backtraces.
> IOW, sending NMI first would actually make debugging harder.
>
> Given the above logic, if you are going to take over an existing
> interrupt vector then the vector needs to be acquired near the start of
> kdb() and released near the end of kdb(), and only on the master cpu.
>
> Note: there is no overwhelming need for KDB_VECTOR to have a high
> priority. As long as it is received within a few seconds then all is
> well.

Thanks for the explanation. I see your point.

How about if we keep the two step approach, but take over the vector
when we need it, in step one. Then give it back when the step two
wait is over.
(assuming we don't take over a vector needed for the NMI) Like this: --- arch/x86/kdb/kdbasupport_32.c | 22 ++++++++++++++++++---- arch/x86/kdb/kdbasupport_64.c | 23 +++++++++++++++++++---- include/asm-x86/irq_vectors.h | 11 ++++++----- include/linux/kdb.h | 1 + kdb/kdbmain.c | 2 ++ 5 files changed, 46 insertions(+), 13 deletions(-) Index: 081002.linus/arch/x86/kdb/kdbasupport_32.c =================================================================== --- 081002.linus.orig/arch/x86/kdb/kdbasupport_32.c +++ 081002.linus/arch/x86/kdb/kdbasupport_32.c @@ -883,9 +883,6 @@ kdba_cpu_up(void) static int __init kdba_arch_init(void) { -#ifdef CONFIG_SMP - set_intr_gate(KDB_VECTOR, kdb_interrupt); -#endif set_intr_gate(KDBENTER_VECTOR, kdb_call); return 0; } @@ -1027,14 +1024,31 @@ kdba_verify_rw(unsigned long addr, size_ #include +gate_desc save_idt[NR_VECTORS]; + +void kdb_takeover_vector(int vector) +{ + memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc)); + set_intr_gate(KDB_VECTOR, kdb_interrupt); + return; +} + +void kdb_giveback_vector(int vector) +{ + native_write_idt_entry(idt_table, vector, &save_idt[vector]); + return; +} + /* When first entering KDB, try a normal IPI. That reduces backtrace problems * on the other cpus. 
  */
 void
 smp_kdb_stop(void)
 {
-	if (!KDB_FLAG(NOIPI))
+	if (!KDB_FLAG(NOIPI)) {
+		kdb_takeover_vector(KDB_VECTOR);
 		send_IPI_allbutself(KDB_VECTOR);
+	}
 }
 
 /* The normal KDB IPI handler */
Index: 081002.linus/arch/x86/kdb/kdbasupport_64.c
===================================================================
--- 081002.linus.orig/arch/x86/kdb/kdbasupport_64.c
+++ 081002.linus/arch/x86/kdb/kdbasupport_64.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -900,9 +901,6 @@ kdba_cpu_up(void)
 static int __init
 kdba_arch_init(void)
 {
-#ifdef CONFIG_SMP
-	set_intr_gate(KDB_VECTOR, kdb_interrupt);
-#endif
 	set_intr_gate(KDBENTER_VECTOR, kdb_call);
 	return 0;
 }
@@ -976,14 +974,31 @@ kdba_set_current_task(const struct task_
 
 #include
 
+gate_desc save_idt[NR_VECTORS];
+
+void kdb_takeover_vector(int vector)
+{
+	memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc));
+	set_intr_gate(KDB_VECTOR, kdb_interrupt);
+	return;
+}
+
+void kdb_giveback_vector(int vector)
+{
+	native_write_idt_entry(idt_table, vector, &save_idt[vector]);
+	return;
+}
+
 /* When first entering KDB, try a normal IPI. That reduces backtrace problems
  * on the other cpus.
  */
 void
 smp_kdb_stop(void)
 {
-	if (!KDB_FLAG(NOIPI))
+	if (!KDB_FLAG(NOIPI)) {
+		kdb_takeover_vector(KDB_VECTOR);
 		send_IPI_allbutself(KDB_VECTOR);
+	}
 }
 
 /* The normal KDB IPI handler */
Index: 081002.linus/include/asm-x86/irq_vectors.h
===================================================================
--- 081002.linus.orig/include/asm-x86/irq_vectors.h
+++ 081002.linus/include/asm-x86/irq_vectors.h
@@ -66,7 +66,6 @@
 # define RESCHEDULE_VECTOR		0xfc
 # define CALL_FUNCTION_VECTOR		0xfb
 # define CALL_FUNCTION_SINGLE_VECTOR	0xfa
-#define KDB_VECTOR			0xf9
 # define THERMAL_APIC_VECTOR		0xf0
 
 #else
@@ -79,10 +78,6 @@
 #define THERMAL_APIC_VECTOR		0xfa
 #define THRESHOLD_APIC_VECTOR		0xf9
 #define UV_BAU_MESSAGE			0xf8
-/* Overload KDB_VECTOR with UV_BAU_MESSAGE. By the time the UV hardware is
- * ready, we should have moved to a dynamically allocated vector scheme.
- */
-#define KDB_VECTOR			0xf8
 #define INVALIDATE_TLB_VECTOR_END	0xf7
 #define INVALIDATE_TLB_VECTOR_START	0xf0	/* f0-f7 used for TLB flush */
 
@@ -91,6 +86,12 @@
 #endif
 
 /*
+ * KDB_VECTOR will take over vector 0xfe when it is needed, as in theory
+ * it should not be used anyway.
+ */
+#define KDB_VECTOR			0xfe
+
+/*
  * Local APIC timer IRQ vector is on a different priority level,
  * to work around the 'lost local interrupt if more than 2 IRQ
  * sources per level' errata.
Index: 081002.linus/include/linux/kdb.h
===================================================================
--- 081002.linus.orig/include/linux/kdb.h
+++ 081002.linus/include/linux/kdb.h
@@ -89,6 +89,7 @@ extern volatile int kdb_flags; /* Glob
 
 extern void kdb_save_flags(void);
 extern void kdb_restore_flags(void);
+extern void kdb_giveback_vector(int);
 
 #define KDB_FLAG(flag)		(kdb_flags & KDB_FLAG_##flag)
 #define KDB_FLAG_SET(flag)	((void)(kdb_flags |= KDB_FLAG_##flag))
Index: 081002.linus/kdb/kdbmain.c
===================================================================
--- 081002.linus.orig/kdb/kdbmain.c
+++ 081002.linus/kdb/kdbmain.c
@@ -1673,6 +1673,8 @@ kdb_wait_for_cpus(void)
 			   wait == 1 ? " is" : "s are",
 			   wait == 1 ? "its" : "their");
 	}
+	/* give back the vector we took over in smp_kdb_stop */
+	kdb_giveback_vector(KDB_VECTOR);
 #endif	/* CONFIG_SMP */
 }
 
-- 
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824
---------------------------
Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe.
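
For readers following the thread, the take-over / give-back idea in the
patch above can be modelled in user space. This is only a minimal sketch,
not kernel code: a plain array of function pointers stands in for
idt_table, and handler_table, saved_table, takeover_slot(), giveback_slot()
and slot_demo() are hypothetical names invented for the illustration.

```c
#include <assert.h>

#define NR_SLOTS   256    /* stand-in for NR_VECTORS (assumption) */
#define DEBUG_SLOT 0xfe   /* stand-in for KDB_VECTOR */

typedef int (*handler_fn)(void);

handler_fn handler_table[NR_SLOTS];  /* models idt_table */
handler_fn saved_table[NR_SLOTS];    /* models save_idt */

int normal_handler(void)   { return 1; }
int debugger_handler(void) { return 2; }

/* Step one: remember the current owner of the slot, then install ours. */
void takeover_slot(int slot, handler_fn fn)
{
	saved_table[slot] = handler_table[slot];
	handler_table[slot] = fn;
}

/* After the step-two wait: put the original owner back. */
void giveback_slot(int slot)
{
	handler_table[slot] = saved_table[slot];
}

/* Walks through one take-over / give-back cycle; returns 0 if each
 * stage dispatches to the expected handler. */
int slot_demo(void)
{
	handler_table[DEBUG_SLOT] = normal_handler;

	takeover_slot(DEBUG_SLOT, debugger_handler);
	if (handler_table[DEBUG_SLOT]() != 2)
		return -1;

	giveback_slot(DEBUG_SLOT);
	if (handler_table[DEBUG_SLOT]() != 1)
		return -2;

	return 0;
}
```

Calling slot_demo() from a test harness, e.g. assert(slot_demo() == 0),
exercises both stages. One difference from the patch is worth noting: the
model installs the handler into the slot it was asked to save, whereas
kdb_takeover_vector() saves idt_table[vector] but passes KDB_VECTOR to
set_intr_gate(). The two coincide in the patch only because the sole
caller passes KDB_VECTOR.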
From jlan@sgi.com Wed Oct 29 17:37:12 2008
Received: with ECARTIS (v1.0.0; list kdb); Wed, 29 Oct 2008 17:37:22 -0700 (PDT)
Received: from kluge.engr.sgi.com (kluge.engr.sgi.com [150.166.39.81])
	by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m9U0bCLn009240
	for ; Wed, 29 Oct 2008 17:37:12 -0700
Received: from [150.166.8.78] (aware.engr.sgi.com [150.166.8.78])
	by kluge.engr.sgi.com (SGI-8.12.11.20060308/8.12.11) with ESMTP id m9U0bCx4122833;
	Wed, 29 Oct 2008 17:37:12 -0700 (PDT)
Message-ID: <490901B8.7050805@sgi.com>
Date: Wed, 29 Oct 2008 17:37:12 -0700
From: Jay Lan
User-Agent: Thunderbird 2.0.0.6 (X11/20070801)
MIME-Version: 1.0
To: Cliff Wickman
CC: kaos@ocs.com.au, Dimitri Sivanich, kdb@oss.sgi.com
Subject: Re: [PATCH] KDB: commandeer vector 0xfe for KDB_VECTOR]]
References: <20081028190806.GB8424@sgi.com> <22339.1225256245@ocs10w> <20081029191008.GA30689@sgi.com> <4908CA3E.1060001@sgi.com>
In-Reply-To: <4908CA3E.1060001@sgi.com>
Content-type: text/plain
Content-Transfer-Encoding: 8bit
X-archive-position: 1465
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: jlan@sgi.com
Precedence: bulk
X-list: kdb

Jay Lan wrote:
>
> I will let Keith comment on the logic of your code, but this patch
> will cause IA64 compilation to fail because kdb_giveback_vector()
> is not defined for IA64.
>
> Suggestions:
> 1) change kdb_takeover_vector and kdb_giveback_vector to the
>    arch-dependent versions kdba_takeover_vector and kdba_giveback_vector.
> 2) the extern of kdba_giveback_vector should be moved to the
>    arch-dependent kdb.h (i.e., arch/{ia64,x86}/include/asm/kdb.h), with
>    the ia64 version being a dummy define.
> 3) kdbmain.c should change accordingly.

Hi Cliff,

Attached below is a revised version of your patch that incorporates my
suggestions above.
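
Separately from the attached patch, suggestion 2 (a real function on one
architecture, a dummy define on the other) can be sketched in a single
file. This is an illustration under stated assumptions only:
DEMO_ARCH_HAS_VECTOR_TAKEOVER, giveback_calls and demo_wait_for_cpus() are
hypothetical names, and in the kernel the choice is made by which
asm/kdb.h the build picks up rather than by a macro switch. The stub here
expands to ((void)0) rather than (0), a small variation that avoids an
unused-value warning when the call is used as a statement.

```c
#include <assert.h>

/* Counts how many times the real give-back routine ran (demo only). */
int giveback_calls;

#ifdef DEMO_ARCH_HAS_VECTOR_TAKEOVER
/* "x86-like" case: a real function does the work. */
void kdba_giveback_vector(int vector)
{
	(void)vector;
	giveback_calls++;
}
#else
/* "ia64-like" case: the call compiles away to nothing. */
#define kdba_giveback_vector(vector) ((void)0)
#endif

/* Stands in for the arch-independent caller; it needs no #ifdef of its
 * own because every architecture supplies the same name.
 * Returns the number of give-backs actually performed. */
int demo_wait_for_cpus(void)
{
	kdba_giveback_vector(0xfe);
	return giveback_calls;
}
```

Built as shown (switch undefined), demo_wait_for_cpus() returns 0 because
the stub does nothing; building with -DDEMO_ARCH_HAS_VECTOR_TAKEOVER makes
it return 1. A static inline empty function would be another way to write
the stub and would also type-check the argument.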

Thanks,
 jay

-- Attached file included as plaintext by Ecartis --
-- File: KDB_VECTOR.v2.1

Revised Cliff's KDB_VECTOR patch version 2 to make kdba_takeover_vector
and kdba_giveback_vector arch dependent.

---
 arch/ia64/include/asm/kdb.h   |    2 ++
 arch/x86/kdb/kdbasupport_32.c |   22 ++++++++++++++++++----
 arch/x86/kdb/kdbasupport_64.c |   23 +++++++++++++++++++----
 include/asm-x86/irq_vectors.h |   11 ++++++-----
 include/asm-x86/kdb.h         |    2 ++
 kdb/kdbmain.c                 |    2 ++
 6 files changed, 49 insertions(+), 13 deletions(-)

Index: linux/arch/x86/kdb/kdbasupport_32.c
===================================================================
--- linux.orig/arch/x86/kdb/kdbasupport_32.c	2008-10-29 17:19:18.000000000 -0700
+++ linux/arch/x86/kdb/kdbasupport_32.c	2008-10-29 17:21:15.944621636 -0700
@@ -883,9 +883,6 @@ kdba_cpu_up(void)
 static int __init
 kdba_arch_init(void)
 {
-#ifdef CONFIG_SMP
-	set_intr_gate(KDB_VECTOR, kdb_interrupt);
-#endif
 	set_intr_gate(KDBENTER_VECTOR, kdb_call);
 	return 0;
 }
@@ -1027,14 +1024,31 @@ kdba_verify_rw(unsigned long addr, size_
 
 #include
 
+gate_desc save_idt[NR_VECTORS];
+
+void kdba_takeover_vector(int vector)
+{
+	memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc));
+	set_intr_gate(KDB_VECTOR, kdb_interrupt);
+	return;
+}
+
+void kdba_giveback_vector(int vector)
+{
+	native_write_idt_entry(idt_table, vector, &save_idt[vector]);
+	return;
+}
+
 /* When first entering KDB, try a normal IPI. That reduces backtrace problems
  * on the other cpus.
  */
 void
 smp_kdb_stop(void)
 {
-	if (!KDB_FLAG(NOIPI))
+	if (!KDB_FLAG(NOIPI)) {
+		kdba_takeover_vector(KDB_VECTOR);
 		send_IPI_allbutself(KDB_VECTOR);
+	}
 }
 
 /* The normal KDB IPI handler */
Index: linux/arch/x86/kdb/kdbasupport_64.c
===================================================================
--- linux.orig/arch/x86/kdb/kdbasupport_64.c	2008-10-29 17:19:18.000000000 -0700
+++ linux/arch/x86/kdb/kdbasupport_64.c	2008-10-29 17:21:15.952621783 -0700
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -900,9 +901,6 @@ kdba_cpu_up(void)
 static int __init
 kdba_arch_init(void)
 {
-#ifdef CONFIG_SMP
-	set_intr_gate(KDB_VECTOR, kdb_interrupt);
-#endif
 	set_intr_gate(KDBENTER_VECTOR, kdb_call);
 	return 0;
 }
@@ -976,14 +974,31 @@ kdba_set_current_task(const struct task_
 
 #include
 
+gate_desc save_idt[NR_VECTORS];
+
+void kdba_takeover_vector(int vector)
+{
+	memcpy(&save_idt[vector], &idt_table[vector], sizeof(gate_desc));
+	set_intr_gate(KDB_VECTOR, kdb_interrupt);
+	return;
+}
+
+void kdba_giveback_vector(int vector)
+{
+	native_write_idt_entry(idt_table, vector, &save_idt[vector]);
+	return;
+}
+
 /* When first entering KDB, try a normal IPI. That reduces backtrace problems
  * on the other cpus.
  */
 void
 smp_kdb_stop(void)
 {
-	if (!KDB_FLAG(NOIPI))
+	if (!KDB_FLAG(NOIPI)) {
+		kdba_takeover_vector(KDB_VECTOR);
 		send_IPI_allbutself(KDB_VECTOR);
+	}
 }
 
 /* The normal KDB IPI handler */
Index: linux/include/asm-x86/irq_vectors.h
===================================================================
--- linux.orig/include/asm-x86/irq_vectors.h	2008-10-29 17:19:18.000000000 -0700
+++ linux/include/asm-x86/irq_vectors.h	2008-10-29 17:22:32.050027567 -0700
@@ -66,7 +66,6 @@
 # define RESCHEDULE_VECTOR		0xfc
 # define CALL_FUNCTION_VECTOR		0xfb
 # define CALL_FUNCTION_SINGLE_VECTOR	0xfa
-#define KDB_VECTOR			0xf9
 # define THERMAL_APIC_VECTOR		0xf0
 
 #else
@@ -79,10 +78,6 @@
 #define THERMAL_APIC_VECTOR		0xfa
 #define THRESHOLD_APIC_VECTOR		0xf9
 #define UV_BAU_MESSAGE			0xf8
-/* Overload KDB_VECTOR with UV_BAU_MESSAGE. By the time the UV hardware is
- * ready, we should have moved to a dynamically allocated vector scheme.
- */
-#define KDB_VECTOR			0xf8
 #define INVALIDATE_TLB_VECTOR_END	0xf7
 #define INVALIDATE_TLB_VECTOR_START	0xf0	/* f0-f7 used for TLB flush */
 
@@ -91,6 +86,12 @@
 #endif
 
 /*
+ * KDB_VECTOR will take over vector 0xfe when it is needed, as in theory
+ * it should not be used anyway.
+ */
+#define KDB_VECTOR			0xfe
+
+/*
  * Local APIC timer IRQ vector is on a different priority level,
  * to work around the 'lost local interrupt if more than 2 IRQ
  * sources per level' errata.
Index: linux/kdb/kdbmain.c
===================================================================
--- linux.orig/kdb/kdbmain.c	2008-10-29 17:19:18.000000000 -0700
+++ linux/kdb/kdbmain.c	2008-10-29 17:21:15.988622448 -0700
@@ -1666,6 +1666,8 @@ kdb_wait_for_cpus(void)
 			   wait == 1 ? " is" : "s are",
 			   wait == 1 ? "its" : "their");
 	}
+	/* give back the vector we took over in smp_kdb_stop */
+	kdba_giveback_vector(KDB_VECTOR);
 #endif	/* CONFIG_SMP */
 }
 
Index: linux/arch/ia64/include/asm/kdb.h
===================================================================
--- linux.orig/arch/ia64/include/asm/kdb.h	2008-10-28 17:17:26.000000000 -0700
+++ linux/arch/ia64/include/asm/kdb.h	2008-10-29 17:21:16.008622818 -0700
@@ -43,4 +43,6 @@ kdba_funcptr_value(void *fp)
 	return *(unsigned long *)fp;
 }
 
+#define kdba_giveback_vector(vector) (0)
+
 #endif	/* !_ASM_KDB_H */
Index: linux/include/asm-x86/kdb.h
===================================================================
--- linux.orig/include/asm-x86/kdb.h	2008-10-28 17:17:27.000000000 -0700
+++ linux/include/asm-x86/kdb.h	2008-10-29 17:21:16.028623187 -0700
@@ -133,4 +133,6 @@ kdba_funcptr_value(void *fp)
 	return (unsigned long)fp;
 }
 
+extern void kdba_giveback_vector(int);
+
 #endif	/* !_ASM_KDB_H */