From kbaidarov@ru.mvista.com Sun Jul 8 14:24:02 2007
Received: with ECARTIS (v1.0.0; list kdb); Sun, 08 Jul 2007 14:24:07 -0700 (PDT)
Received: from buildserver.ru.mvista.com (rtsoft3.corbina.net [85.21.88.6] (may be forged))
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l68LNxtL015742
	for ; Sun, 8 Jul 2007 14:24:01 -0700
Received: from localhost.localdomain (unknown [10.150.0.9])
	by buildserver.ru.mvista.com (Postfix) with ESMTP id D7C628810
	for ; Mon, 9 Jul 2007 01:59:45 +0500 (SAMST)
Date: Mon, 9 Jul 2007 01:04:31 +0400
From: Konstantin Baydarov
To: kdb@oss.sgi.com
Subject: [PATCH] SW Breakpoint doesn't work after it triggers on non boot CPU
Message-ID: <20070709010431.5b79e245@localhost.localdomain>
X-Mailer: Claws Mail 2.9.1 (GTK+ 2.10.4; i386-redhat-linux-gnu)
Mime-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
X-archive-position: 1223
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: kbaidarov@ru.mvista.com
Precedence: bulk
X-list: kdb

I've set a breakpoint on do_sync, then triggered it, first on the boot CPU (CPU0), then on CPU1.
Here is the log:

root@192.168.15.15:~# sync
Instruction(i) breakpoint #0 at 0xc01787c1 (adjusted)
0xc01787c1 do_sync: int3
Entering kdb (current=0xf7ce8ab0, pid 1718) on processor 0 due to Breakpoint @0xc01787c1
[0]kdb> go
root@192.168.15.15:~#
root@192.168.15.15:~#
root@192.168.15.15:~# taskset -c 1 sync
Instruction(i) breakpoint #0 at 0xc01787c1 (adjusted)
0xc01787c1 do_sync: int3
Entering kdb (current=0xf7f81a30, pid 1719) on processor 1 due to Breakpoint @0xc01787c1
[1]kdb> go
root@192.168.15.15:~#
root@192.168.15.15:~#
root@192.168.15.15:~# sync
root@192.168.15.15:~#
root@192.168.15.15:~# sync
root@192.168.15.15:~#

The breakpoint on do_sync no longer works after I've triggered it on CPU1 (taskset -c 1 sync).

The reasons for the issue are:
1) The code that installs/removes global breakpoints was hardwired to the CPU with id 0; I don't know why.
...
	if (!kdb_quiet(reason) || smp_processor_id() == 0) {
		kdb_bp_install_global(regs);
		kdbnearsym_cleanup();
		debug_kusage();
	}
...

2) After a single-step over a breakpoint, KDB makes itself silent:
kdb()
...
	if (KDB_STATE(GO1)) {
		kdb_bp_remove_global();		/* They were set for single-step purposes */
		KDB_STATE_CLEAR(GO1);
		reason = KDB_REASON_SILENT;	/* Now silently go */
	}
...

That prevents the reinstall of global breakpoints when the kernel is leaving kdb:
	/*
	 * (Re)install the global breakpoints and cleanup the cached
	 * symbol table. This is only done once from the initial
	 * processor on go.
	 */
	KDB_DEBUG_STATE("kdb 12", reason);
	if (!kdb_quiet(reason) || smp_processor_id() == 0) {
		kdb_bp_install_global(regs);
		kdbnearsym_cleanup();
		debug_kusage();
	}

How solved:
1) The CPU0 code is removed.
2) The patch checks whether reason is set to KDB_REASON_SILENT and reinstalls the global breakpoints in that case.
3) The patch also fixes the KDB notifier call when the kernel exits KDB.

The patch was tested on an x86_64 2-CPU PC with an i386 2.6.21 kernel.

Thanks.

Signed-off-by: Konstantin Baydarov

 kdb/kdbmain.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6.21/kdb/kdbmain.c
===================================================================
--- linux-2.6.21.orig/kdb/kdbmain.c
+++ linux-2.6.21/kdb/kdbmain.c
@@ -1974,7 +1974,7 @@ kdb(kdb_reason_t reason, int error, stru
 		 * Remove the global breakpoints. This is only done
 		 * once from the initial processor on initial entry.
 		 */
-		if (!kdb_quiet(reason) || smp_processor_id() == 0)
+		if (!kdb_quiet(reason))
 			kdb_bp_remove_global();
 
 		/*
@@ -2028,7 +2028,7 @@ kdb(kdb_reason_t reason, int error, stru
 		 * processor on go.
 		 */
 		KDB_DEBUG_STATE("kdb 12", reason);
-		if (!kdb_quiet(reason) || smp_processor_id() == 0) {
+		if (!kdb_quiet(reason) || reason == KDB_REASON_SILENT) {
 			kdb_bp_install_global(regs);
 			kdbnearsym_cleanup();
 			debug_kusage();
@@ -2047,7 +2047,7 @@ kdb(kdb_reason_t reason, int error, stru
 		/* Wait until all the other processors leave kdb */
 		while (kdb_previous_event() != 1)
 			;
-		if (!kdb_quiet(reason))
+		if (!kdb_quiet(reason) || reason == KDB_REASON_SILENT)
 			notify_die(DIE_KDEBUG_LEAVE, "KDEBUG LEAVE", regs, error, 0, 0);
 		kdb_initial_cpu = -1;	/* release kdb control */
 		KDB_DEBUG_STATE("kdb 13", reason);
---------------------------
Use http://oss.sgi.com/ecartis to modify your settings or to unsubscribe.

From kbaidarov@ru.mvista.com Mon Jul 9 08:33:54 2007
Received: with ECARTIS (v1.0.0; list kdb); Mon, 09 Jul 2007 08:34:00 -0700 (PDT)
Received: from mail.dev.rtsoft.ru (rtsoft2.corbina.net [85.21.88.2] (may be forged))
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l69FXptL009868
	for ; Mon, 9 Jul 2007 08:33:53 -0700
Received: (qmail 3180 invoked from network); 9 Jul 2007 15:33:53 -0000
Received: from windmill.dev.rtsoft.ru (192.168.1.130)
	by mail.dev.rtsoft.ru with SMTP; 9 Jul 2007 15:33:53 -0000
Date: Mon, 9 Jul 2007 19:39:16 +0400
From: Konstantin Baydarov
To: kdb@oss.sgi.com
Subject: [PATCH] fix of unnecessary clocksource change after exiting from KDB console
Message-ID: <20070709193916.32bcfdcd@windmill.dev.rtsoft.ru>
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.19; i386-redhat-linux-gnu)
Mime-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
X-archive-position: 1224
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: kbaidarov@ru.mvista.com
Precedence: bulk
X-list: kdb

When I spend more than 10 seconds in the KDB console and then exit from KDB, the kernel thinks that the current clocksource is unstable and changes it.
I'm using 2.6.22-rc7 kdb on an SMP i386 system.
Here is the log:
Before doing sync, I've set a breakpoint on do_sync().

root@192.168.40.10:~#
root@192.168.40.10:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
root@192.168.40.10:~# sync
Instruction(i) breakpoint #0 at 0xc017b64a (adjusted)
0xc017b64a do_sync: int3
Entering kdb (current=0xc16f3a50, pid 2983) on processor 0 due to Breakpoint @ 0xc017b64a
[0]kdb> go
Clocksource tsc unstable (delta = 14060902198 ns)
root@192.168.40.10:~# Time: acpi_pm clocksource has been installed.
root@192.168.40.10:~#
root@192.168.40.10:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
acpi_pm
root@192.168.40.10:~#
root@192.168.40.10:~#

Issue: the tsc clocksource was replaced by acpi_pm.

The reason for the issue:
The current clocksource (tsc) has a watchdog in the kernel, which is another clocksource (acpi_pm).
clocksource_watchdog(), which updates the watchdog_last timestamp, is run by a kernel timer that is disabled while the kernel is in kdb.
So the watchdog clocksource (acpi_pm) can overflow, and when the kernel exits kdb the watchdog clocksource can report a wrong time delta. That is why the kernel can conclude that the current clocksource is unstable and change it.

How solved:
I suspend/resume timekeeping when we enter/exit kdb. Suspending/resuming timekeeping suspends/resumes both the current clocksource and the watchdog clocksource.
The patch also prevents potential softlockup warnings that appear in earlier kernels.

Thanks.

Signed-off-by: Konstantin Baydarov

 kdb/kdbmain.c             | 17 +++++++++++++++++
 kernel/time/timekeeping.c | 27 +++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

Index: linux-2.6.22-rc7/kdb/kdbmain.c
===================================================================
--- linux-2.6.22-rc7.orig/kdb/kdbmain.c
+++ linux-2.6.22-rc7/kdb/kdbmain.c
@@ -47,6 +47,9 @@
 #include
 #include
 
+int kdb_timekeeping_suspend(void);
+int kdb_timekeeping_resume(void);
+
 /*
  * Kernel debugger state flags
  */
@@ -60,6 +63,7 @@ atomic_t kdb_8250;
  */
 static DEFINE_SPINLOCK(kdb_lock);
 volatile int kdb_initial_cpu = -1;	/* cpu number that owns kdb */
+volatile int kdb_initial_cpu_save = -1;	/* cpu number that owns kdb */
 int kdb_seqno = 2;	/* how many times kdb has been entered */
 
 volatile int kdb_nextline = 1;
@@ -1998,6 +2002,11 @@ kdb(kdb_reason_t reason, int error, stru
 			smp_kdb_stop();
 			KDB_DEBUG_STATE("kdb 8", reason);
 		}
+		/* Suspend clocksource, when entering kdb, to prevent
+		 * false soft lockup warnings and switching to another
+		 * clocksource.
+		 */
+		kdb_timekeeping_suspend();
 	}
 
 	if (KDB_STATE(GO1)) {
@@ -2020,6 +2029,7 @@ kdb(kdb_reason_t reason, int error, stru
 	if (result == KDB_CMD_GO && KDB_STATE(SSBPT))
 		KDB_STATE_SET(GO1);
 
+	kdb_initial_cpu_save = kdb_initial_cpu;
 	if (smp_processor_id() == kdb_initial_cpu &&
 	    !KDB_STATE(DOING_SS) &&
 	    !KDB_STATE(RECURSE)) {
@@ -2055,6 +2065,13 @@ kdb(kdb_reason_t reason, int error, stru
 		}
 	}
 
+	/* Only do this work if we are really leaving kdb */
+	if (!(KDB_STATE(DOING_SS) || KDB_STATE(SSBPT) || KDB_STATE(RECURSE))) {
+		if(smp_processor_id() == kdb_initial_cpu_save)
+			/* Resume clocksource when initial cpu leaves kdb */
+			kdb_timekeeping_resume();
+	}
+
 	KDB_DEBUG_STATE("kdb 14", result);
 	kdba_restoreint(&int_state);
 #ifdef CONFIG_CPU_XSCALE
Index: linux-2.6.22-rc7/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/time/timekeeping.c
+++ linux-2.6.22-rc7/kernel/time/timekeeping.c
@@ -299,6 +299,19 @@ static int timekeeping_resume(struct sys
 	return 0;
 }
 
+#if defined(CONFIG_KDB) || defined(CONFIG_KDB_MODULE)
+int kdb_timekeeping_resume(void)
+{
+	int ret;
+	struct sys_device dev;
+
+	ret = timekeeping_resume(&dev);
+
+	return ret;
+}
+EXPORT_SYMBOL(kdb_timekeeping_resume);
+#endif
+
 static int timekeeping_suspend(struct sys_device *dev, pm_message_t state)
 {
 	unsigned long flags;
@@ -313,6 +326,20 @@ static int timekeeping_suspend(struct sy
 	return 0;
 }
 
+#if defined(CONFIG_KDB) || defined(CONFIG_KDB_MODULE)
+int kdb_timekeeping_suspend(void)
+{
+	int ret;
+	struct sys_device dev;
+	pm_message_t state;
+
+	ret = timekeeping_suspend(&dev, state);
+
+	return ret;
+}
+EXPORT_SYMBOL(kdb_timekeeping_suspend);
+#endif
+
 /* sysfs resume/suspend bits for timekeeping */
 static struct sysdev_class timekeeping_sysclass = {
 	.resume = timekeeping_resume,
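
The watchdog overflow described in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the ACPI PM timer is modeled as a 24-bit counter at 3.579545 MHz (some chipsets implement 32 bits), the 62.5 ms threshold is an assumed value for the kernel's watchdog tolerance, and pmtmr_read_at(), pmtmr_delta_ns() and looks_unstable() are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

#define PMTMR_HZ              3579545ULL    /* ACPI PM timer frequency */
#define PMTMR_MASK            0xFFFFFFULL   /* assumption: 24-bit counter */
#define NSEC_PER_SEC          1000000000ULL
#define WATCHDOG_THRESHOLD_NS 62500000ULL   /* assumption: 62.5 ms tolerance */

/* Hypothetical helper: raw counter value after `ns` nanoseconds of real time. */
static uint64_t pmtmr_read_at(uint64_t ns)
{
	return (ns * PMTMR_HZ / NSEC_PER_SEC) & PMTMR_MASK;
}

/*
 * Hypothetical helper: elapsed time as computed from two raw reads.
 * Any whole wraps of the counter between the two reads are lost.
 */
static uint64_t pmtmr_delta_ns(uint64_t last, uint64_t now)
{
	uint64_t cycles = (now - last) & PMTMR_MASK;

	return cycles * NSEC_PER_SEC / PMTMR_HZ;
}

/* Hypothetical check: do the two clocks disagree by more than the threshold? */
static int looks_unstable(uint64_t cs_ns, uint64_t wd_ns, uint64_t threshold_ns)
{
	uint64_t diff = cs_ns > wd_ns ? cs_ns - wd_ns : wd_ns - cs_ns;

	return diff > threshold_ns;
}
```

With these numbers the modeled counter wraps about every 4.7 seconds. If the watchdog timer does not run for 14 seconds, pmtmr_delta_ns() sees only about 4.6 seconds while the TSC has advanced by the full 14, so looks_unstable() fires even though the TSC itself is fine. Stopping both clocks while in kdb, as the patch does, avoids the comparison across the gap.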
From kbaidarov@ru.mvista.com Tue Jul 10 09:04:34 2007
Received: with ECARTIS (v1.0.0; list kdb); Tue, 10 Jul 2007 09:04:39 -0700 (PDT)
Received: from mail.dev.rtsoft.ru (rtsoft2.corbina.net [85.21.88.2] (may be forged))
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l6AG4Vbm028750
	for ; Tue, 10 Jul 2007 09:04:33 -0700
Received: (qmail 31810 invoked from network); 10 Jul 2007 16:04:33 -0000
Received: from windmill.dev.rtsoft.ru (192.168.1.130)
	by mail.dev.rtsoft.ru with SMTP; 10 Jul 2007 16:04:33 -0000
Date: Tue, 10 Jul 2007 20:09:58 +0400
From: Konstantin Baydarov
To: kdb@oss.sgi.com
Subject: [PATCH] hardware breakpoint doesn't work
Message-ID: <20070710200958.382d7a58@windmill.dev.rtsoft.ru>
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.19; i386-redhat-linux-gnu)
Mime-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
X-archive-position: 1225
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: kbaidarov@ru.mvista.com
Precedence: bulk
X-list: kdb

If I get to the KDB console by hitting a software breakpoint, then delete this sw breakpoint and set a hardware breakpoint on the same function, the hw breakpoint doesn't work.

Here is the log:
Before doing sync, I've set a breakpoint on do_sync().

root@192.168.40.10:~# sync
Instruction(i) breakpoint #0 at 0xc017b64a (adjusted)
0xc017b64a do_sync: int3
Entering kdb (current=0xeffd7a50, pid 2985) on processor 0 due to Breakpoint @ a
[0]kdb> bc 0
Breakpoint 0 at 0xc017b64a cleared
[0]kdb> bph do_sync
Forced Instruction(Register) BP #0 at 0xc017b64a (do_sync) is enabled in dr0 on cpu 0
[0]kdb> go
root@192.168.40.10:~#
root@192.168.40.10:~#
root@192.168.40.10:~# sync
root@192.168.40.10:~#
root@192.168.40.10:~#

The reason for the issue:
When KDB executes a single-step over a breakpoint, it forgets to clear the pending single-step exception flag.
If a hardware breakpoint is triggered after that, KDB thinks that it is a single step and ignores the hw breakpoint.

How solved:
The single-step exception pending flag is cleared when the kernel enters KDB during a single-step over a breakpoint.

The patch is against kernel 2.6.22. It fixes i386 and x86_64 kernels.

Thanks.

Signed-off-by: Konstantin Baydarov

 arch/i386/kdb/kdba_bp.c   | 4 ++++
 arch/x86_64/kdb/kdba_bp.c | 4 ++++
 2 files changed, 8 insertions(+)

Index: linux-2.6.22/arch/i386/kdb/kdba_bp.c
===================================================================
--- linux-2.6.22.orig/arch/i386/kdb/kdba_bp.c
+++ linux-2.6.22/arch/i386/kdb/kdba_bp.c
@@ -108,6 +108,10 @@ kdba_db_trap(struct pt_regs *regs, int e
 			kdba_installbp(regs, bp);
 			if (!KDB_STATE(DOING_SS)) {
 				regs->eflags &= ~EF_TF;
+				/*
+				 * Clear the pending exceptions.
+				 */
+				kdba_putdr6(0);
 				return(KDB_DB_SSBPT);
 			}
 			break;
Index: linux-2.6.22/arch/x86_64/kdb/kdba_bp.c
===================================================================
--- linux-2.6.22.orig/arch/x86_64/kdb/kdba_bp.c
+++ linux-2.6.22/arch/x86_64/kdb/kdba_bp.c
@@ -108,6 +108,10 @@ kdba_db_trap(struct pt_regs *regs, int e
 			kdba_installbp(regs, bp);
 			if (!KDB_STATE(DOING_SS)) {
 				regs->eflags &= ~EF_TF;
+				/*
+				 * Clear the pending exceptions.
+				 */
+				kdba_putdr6(0);
 				return(KDB_DB_SSBPT);
 			}
 			break;
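
The stale single-step flag described in the message above can be shown with a small userspace model of DR6. This is a sketch, not the KDB code: DR6 is a plain variable here, cpu_raise_db() and classify_db() are hypothetical helpers, and the classifier is assumed to test the single-step bit before the breakpoint bits, which is what the reported symptom suggests. The bit positions (B0 at bit 0, BS at bit 14) are the architectural ones.

```c
#include <stdint.h>

#define DR6_B0     (1u << 0)    /* breakpoint condition 0 detected */
#define DR6_BMASK  0xFu         /* B0..B3 */
#define DR6_BS     (1u << 14)   /* single-step trap */

enum db_cause { DB_NONE, DB_SINGLE_STEP, DB_HW_BREAKPOINT };

/*
 * Model of the hardware side: a debug exception ORs its status bits
 * into DR6. The processor does not clear BS on its own, so the bit
 * stays set until the handler writes DR6.
 */
static uint32_t cpu_raise_db(uint32_t dr6, uint32_t new_bits)
{
	return dr6 | new_bits;
}

/*
 * Hypothetical classifier (assumption: single-step is checked first).
 * With a stale BS bit, a real hardware breakpoint hit is misread.
 */
static enum db_cause classify_db(uint32_t dr6)
{
	if (dr6 & DR6_BS)
		return DB_SINGLE_STEP;
	if (dr6 & DR6_BMASK)
		return DB_HW_BREAKPOINT;
	return DB_NONE;
}
```

If BS is left set after stepping over the software breakpoint, a later hit on dr0 yields DR6 = BS | B0 and classify_db() reports a single step. Writing 0 to DR6 after the step, which is what kdba_putdr6(0) does in the patch, leaves only B0 set on the next hit, and the breakpoint is recognized.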
From bwalle@suse.de Thu Jul 26 09:54:19 2007
Received: with ECARTIS (v1.0.0; list kdb); Thu, 26 Jul 2007 09:54:25 -0700 (PDT)
Received: from mx2.suse.de (mx2.suse.de [195.135.220.15])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l6QGsGbm004876
	for ; Thu, 26 Jul 2007 09:54:18 -0700
Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx2.suse.de (Postfix) with ESMTP id 40A3E2158B;
	Thu, 26 Jul 2007 18:35:46 +0200 (CEST)
Date: Thu, 26 Jul 2007 18:35:45 +0200
From: Bernhard Walle
To: Keith Owens
Cc: Takenori Nagano , kdb@oss.sgi.com
Subject: Re: [patch] Fix some problem between kdb and kdump
Message-ID: <20070726163545.GA22328@suse.de>
References: <465E7B57.9070705@ah.jp.nec.com> <5940.1180658239@kao2.melbourne.sgi.com>
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5940.1180658239@kao2.melbourne.sgi.com>
Organization: SUSE LINUX Products GmbH
User-Agent: Mutt/1.5.16 (2007-06-09)
Content-Transfer-Encoding: 8bit
X-archive-position: 1226
X-ecartis-version: Ecartis v1.0.0
Sender: kdb-bounce@oss.sgi.com
Errors-to: kdb-bounce@oss.sgi.com
X-original-sender: bwalle@suse.de
Precedence: bulk
X-list: kdb

* Keith Owens [2007-06-01 02:37]:
> Takenori Nagano (on Thu, 31 May 2007 16:37:59 +0900) wrote:
> >Hi,
> >
> >kdb has some problem to use with kdump.
> >This patch fixes some of them.
> >
> >1) We can't use kdb when machine panicked.
> >
> >crash_kexec() is called before notifier_call_chain(&panic_notifier_chain).
> >This patch makes KDB_ENTER() is called before crash_kexec().
>
> Both KDB and crash_kexec should be using the panic_notifier_chain, with
> KDB having a higher priority than crash_kexec. The whole point of
> notifier chains is to handle cases like this, so we should not be
> adding more code to the panic routine.

That's true. But the problem is: KDB is not mainline while kdump is.
So, if mainline doesn't accept the changes for some reason, they must be
included in the KDB patches. :-(

> >2) We can't take a kdump when KDB_FLAG is set CATASTROPHIC.
> >
> >kdb_do_dump() does not support kdump.
> >This patch makes machine_kexec() is called from kdb_do_dump().
>
> Ugly. All the code for selecting which dump to take (lkcd, kexec,
> anything else) should be in a common kernel routine that anybody can
> call. It should not be just in KDB.

Well, but if KDB is *disabled*, there should be a way for KDB to simply
do nothing. It would help if CONFIG_KDB_CONTINUE_CATASTROPHIC were a
*runtime* setting, not a *compile time* setting.

Thanks,
Bernhard
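
The ordering discussed in this thread (KDB ahead of crash_kexec on the panic notifier chain, with KDB able to stand aside at runtime) can be sketched as a userspace model of a priority-ordered chain. This is not the kernel notifier API: struct notifier, chain_register(), chain_call(), the CHAIN_* return codes, the two handlers and the kdb_on flag are all hypothetical names chosen for illustration.

```c
#include <stddef.h>

#define CHAIN_CONTINUE 0
#define CHAIN_STOP     1

struct notifier {
	int priority;                 /* higher runs first */
	int (*call)(void *data);
	struct notifier *next;
};

/* Insert so that the list stays sorted by descending priority. */
static void chain_register(struct notifier **head, struct notifier *n)
{
	while (*head && (*head)->priority >= n->priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

/* Call handlers in order until one of them asks to stop. */
static int chain_call(struct notifier *head, void *data)
{
	for (; head; head = head->next)
		if (head->call(data) == CHAIN_STOP)
			return CHAIN_STOP;
	return CHAIN_CONTINUE;
}

/* Records which handlers ran; kdb_on models a runtime enable switch. */
struct trace {
	char order[8];
	int n;
	int kdb_on;
};

/* Hypothetical debugger handler: does nothing when disabled at runtime. */
static int kdb_handler(void *data)
{
	struct trace *t = data;

	if (!t->kdb_on)
		return CHAIN_CONTINUE;
	t->order[t->n++] = 'K';
	return CHAIN_CONTINUE;
}

/* Hypothetical dump handler: takes the dump and ends the chain. */
static int kexec_handler(void *data)
{
	struct trace *t = data;

	t->order[t->n++] = 'X';
	return CHAIN_STOP;
}
```

Registering kexec_handler with priority 0 and kdb_handler with priority 10 makes the debugger run first regardless of registration order. With kdb_on cleared, only the dump handler leaves a trace, which is the behaviour a runtime switch would give without extra code in the panic routine.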