On Fri, Sep 4, 2009 at 3:23 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * jidong xiao <jidong.xiao@xxxxxxxxx> wrote:
>
>> Hi, Ingo,
>>
>> I am looking at the source code of the function softlockup_tick():
>>
>>     /* Warn about unreasonable delays: */
>>     if (now <= (touch_timestamp + softlockup_thresh))
>>         return;
>>
>>     per_cpu(print_timestamp, this_cpu) = touch_timestamp;
>>
>>     spin_lock(&print_lock);
>>     printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
>>             this_cpu, now - touch_timestamp,
>>             current->comm, task_pid_nr(current));
>>     print_modules();
>>     print_irqtrace_events(current);
>>     if (regs)
>>         show_regs(regs);
>>     else
>>         dump_stack();
>>     spin_unlock(&print_lock);
>>
>>     if (softlockup_panic)
>>         panic("softlockup: hung tasks");
>>
>> My kernel is a kdb-patched kernel, and it looks like if I stay in kdb
>> for more than 60 seconds, I receive a softlockup warning when I leave
>> kdb. The warning is very annoying, especially in an SMP environment,
>> and sometimes it can even hang the machine. A simple workaround would
>> be:
>>
>> #ifdef CONFIG_KDB
>> /* do not trigger softlockup, or set softlockup_thresh to a very high value */
>> #else
>> /* trigger softlockup if the CPU is stuck for more than 60 seconds */
>> #endif
>>
>> However I feel this approach is far from perfect. Do you have any
>> advice on how to avoid this warning?
>
> Where does kdb spend this time? It's probably polling on something
> (keyboard ports?) - so the right approach would be to fix up the
> touch_timestamp or so.
>
Yes, sometimes we stay in kdb for a while: once you drop into kdb, you
stay there until you type 'go'. During that time, kdb just polls the
PS/2 keyboard, USB keyboard, and serial ports for input.
> I suspect kgdb has similar problems - if you fix it there and if you
> have to touch kernel/softlockup.c for that i can apply those bits
> even though kdb patches are not upstream.
>
> Ingo
>
I took a look at kernel/softlockup.c and saw that there is a function
called touch_softlockup_watchdog(). I guess this function is similar to
touch_nmi_watchdog(), right? If they are similar from the caller's
perspective, then we need not change kernel/softlockup.c at all;
instead, we just need to invoke touch_softlockup_watchdog() at the same
places where the kdb code calls touch_nmi_watchdog().
In addition, I believe kgdb has the same problem as kdb:
touch_nmi_watchdog() is called in various places in both the kdb and
kgdb code.
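
To illustrate the idea, here is a minimal userspace model of the check quoted
from softlockup_tick(). It is only a sketch, not kernel code: every sim_* name
is hypothetical, time is a plain counter in seconds, and the 60-second
threshold is simply the value mentioned above. It shows that a polling loop
which refreshes the timestamp on each iteration does not trip the check when
the debugger is left, while one that never touches it does.

```c
/* Userspace model only; all sim_* names are hypothetical, not kernel APIs. */
#include <stdbool.h>

#define SIM_SOFTLOCKUP_THRESH 60UL      /* seconds, assumed from the thread */

unsigned long sim_touch_timestamp;      /* models the per-CPU touch_timestamp */

/* Models touch_softlockup_watchdog(): record that this CPU is still alive. */
void sim_touch_softlockup_watchdog(unsigned long now)
{
    sim_touch_timestamp = now;
}

/* Models the quoted threshold test; true means the warning would be printed. */
bool sim_softlockup_would_warn(unsigned long now)
{
    return now > sim_touch_timestamp + SIM_SOFTLOCKUP_THRESH;
}

/*
 * Models a debugger session that polls for input once per second for
 * `duration` seconds, optionally touching the watchdog on every poll.
 * Returns whether the first check after leaving the debugger would warn.
 */
bool sim_debugger_session(unsigned long start, unsigned long duration,
                          bool touch)
{
    unsigned long now;

    sim_touch_softlockup_watchdog(start);
    for (now = start; now < start + duration; now++) {
        /* a real debugger would poll keyboard/serial input here */
        if (touch)
            sim_touch_softlockup_watchdog(now);
    }
    return sim_softlockup_would_warn(start + duration);
}
```

In this model, a 120-second session without touching would warn on exit, and
the same session with a touch in the polling loop would not. Whether the real
kdb/kgdb polling paths can call touch_softlockup_watchdog() at exactly those
points is something I still need to verify against the code.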
Regards
Jason