>This will only correct the case where scheduling is called during
>an interrupt. I have a 2.4.2 patch that will correct that problem
>as well as the dump interrupts case. If you'd like to give that
>a shot (or anyone), let me know. I've been working with this for
>the last two evenings, with some small success. Larry Sendlosky
>also provided a patch for changing when we flush the TLBs, along with
>what we do for disable_local_APIC(). I have to incorporate those
>changes as well (thanks, Larry). The other good thing to do is to
>add in 'int dump_cpu', to indicate which CPU is dumping.
>
>It's basically the disabling of the local APIC, but I decided to
>move all of this into arch/i386/kernel/vmdump.c, instead of
>depending on arch/i386/kernel/{traps,smp}.c for stuff.
>
Sounds good. We would surely like to give it a shot.
>I guess one thing we need to also do is to create some sort of
>GKHI mechanism for allowing people to get a dump and continue
>normal operation.
>
Dprobes can already be used to trigger a crash dump today, since it lets people
take a dump from within a probe handler. It calls dump_execute directly, so
at the moment the system restarts after the dump. However, we've been
playing around with changing things a little to get it to continue normal
operation after a dump. In this case we only stop the other CPUs
temporarily while the dump is going on. It's just sort of a hack right now;
so far we've only tried it once on a 2-way system, and it needs some
work ... (I'm not sure it covers all the possibilities we need to think
of, even apart from probes in interrupt handlers). I would like to take a
look at what you and Larry have done for disable_local_APIC ...
GKHI would fit in more as a mechanism for built-in (compiled-in) probe
points (efficient/fast especially when the hooks are not active/enabled).
It could for example be used for having some well-known/fixed trigger
points in the kernel, from where a dump may get triggered if those points
are enabled. (This can be used to customize the kind of events where a dump
would get triggered automatically).
We had in mind the possibility of using GKHI as a hooking mechanism for RAS
facilities, i.e. having some common hooks built into the kernel - as an
alternative to patching the kernel for various facilities independently,
though that's probably a different issue.
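
As a rough illustration of the kind of built-in trigger point described above,
here is a minimal user-space sketch. It is not the GKHI interface: the names
hook_point, hook_arm, hook_disarm, hook_fire, oom_hook and take_dump are all
hypothetical, and take_dump only counts calls where a real handler would
start a dump. The point it tries to show is that a disabled hook costs the
call site a single flag test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook point; NOT the real GKHI interface. */
typedef void (*hook_fn)(const char *event);

struct hook_point {
    int enabled;      /* 0 = inactive: the call site pays only for this test */
    hook_fn handler;  /* invoked only while enabled is nonzero */
};

/* A fixed, well-known trigger point compiled into the code (assumed name). */
static struct hook_point oom_hook = { 0, NULL };

/* Stand-in for a dump routine: counts calls instead of dumping. */
static int dumps_taken = 0;

static void take_dump(const char *event)
{
    (void)event;      /* a real handler might record why the dump was taken */
    dumps_taken++;
}

/* Arm a hook point so that reaching it calls fn. */
static void hook_arm(struct hook_point *hp, hook_fn fn)
{
    assert(fn != NULL);
    hp->handler = fn;
    hp->enabled = 1;
}

/* Disarm a hook point; the call site goes back to a single flag test. */
static void hook_disarm(struct hook_point *hp)
{
    hp->enabled = 0;
    hp->handler = NULL;
}

/* What the compiled-in call site does when execution reaches the hook. */
static void hook_fire(struct hook_point *hp, const char *event)
{
    if (hp->enabled)
        hp->handler(event);
}

/* Example code path that contains the trigger point. */
static void out_of_memory_path(void)
{
    hook_fire(&oom_hook, "oom");
}
```

With the hook disarmed, out_of_memory_path() does nothing beyond the flag
test; after hook_arm(&oom_hook, take_dump), each pass through that path calls
the handler once, which is how the set of dump-triggering events could be
customized at run time.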
>Suparna, do you (and the rest of the IBM folks) need write access
>to the SourceForge tree?
>
Yes, that would be good.
>--Matt
Suparna Bhattacharya
IBM Software Lab, India
E-mail : bsuparna@xxxxxxxxxx
Phone : 91-80-5267117, Extn : 2525