
Re: lkcd & I2c

To: "Matt D. Robinson" <yakker@xxxxxxxxxxxxxx>
Subject: Re: lkcd & I2c
From: Dan Hollis <goemon@xxxxxxxxx>
Date: Fri, 9 Mar 2001 12:49:04 -0800 (PST)
Cc: "Scott, Colin" <colin.scott@xxxxxxxxxx>, "'lkcd@xxxxxxxxxxx'" <lkcd@xxxxxxxxxxx>
In-reply-to: <3AA92704.A2AA397A@xxxxxxxxxxxxxx>
Sender: owner-lkcd@xxxxxxxxxxx

What's lkcd?

-Dan

On Fri, 9 Mar 2001, Matt D. Robinson wrote:

> This sounds like a great idea.  There are a few mechanisms we
> can follow to do this -- either save this in the architecture
> dependent dump header (best location), or save this in the
> standard vmdump process.
>
> I can modify __dump_configure_header() to save this for the
> x86 platforms to start.  Do you have a set of code I can look
> at for saving this stuff?
>
> Oh, and what interrupt level do you need to run at?  Can you
> access the I2C master at the time of the failure to send out
> requests to the I2C slave?
>
> BTW, Tom, this also applies to IA64, since Merced has an I2C
> interface to the chip where you can get status information.
>
> --Matt
>
> "Scott, Colin" wrote:
> >
> > Hi,
> >
> > Are there any plans to have lkcd dump the system hardware and environment
> > status info using the I2C protocol? The current 2.4.x kernels already have
> > I2C code and drivers. Maybe they could be used to get the hardware and
> > environment status info at the exact time of a crash or kernel panic? It
> > would be a good idea to be able to rule out, for example, CPU/chipset
> > overheating problems or memory parity errors as the cause of a crash before
> > starting a painstaking investigation into what might actually be bug-free
> > kernel code. This code should be added to lkcd because the syslog daemon
> > would probably not get a chance to report most hardware errors before the
> > system dumps memory and reboots. I would also like to see ECC memory status
> > messages if bad ECC memory was the cause of the system panic. Michael
> > O'Reilly has written some code to do ECC error reporting to the kernel
> > logfile. Is there any chance that this code could be merged into lkcd? See
> > http://www.anime.net/~goemon/linux-ecc/ for details. The addition of
> > hardware failure reporting code to lkcd could make Linux into a more
> > reliable and more Highly Available UNIX variant by allowing us to identify
> > and replace bad hardware in the system and would take Linux one more step
> > closer to becoming a mature and reliable OS.
> >
> > Colin Scott
> > Senior Technical Specialist
> > Schering-Plough Corp.
> >
> > Disclaimer: This email does not represent the opinions or interests of
> > Schering-Plough Corp.
>

