Well, as I mentioned, I am trying to debug a panic in my own
SCSI HBA driver.
The problem is that if my driver panics in the
interrupt thread, then I am not sure whether Linux will be able
to do the dump. The reason is that, at the time of the dump through
aic7xxx, I am not sure whether aic7xxx will be able to generate
interrupts or not.
Also, if my driver panics after it acquires the
io_request_lock, then aic7xxx will not be able to get the
io_request_lock to do the I/O.
In fact, I have seen many operating systems do the dump
by a polling mechanism instead of an interrupt mechanism. And my
understanding is that the aic7xxx driver does not work in polling
mode.
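To make the two concerns concrete, here is a rough user-space sketch in C.
It is not kernel code and not how aic7xxx works; every name in it
(fake_hba, hba_cmd_done, normal_write, dump_write_polled) is made up for
illustration. It only models a normal write path that needs the lock and a
dump path that skips the lock and polls a status bit instead of waiting
for an interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; every name here is hypothetical,
 * not a kernel or aic7xxx API. */
struct fake_hba {
    bool lock_held;        /* stands in for io_request_lock held by the panicking context */
    int  polls_until_done; /* status reads left before the fake command reports completion */
};

/* Models reading a "command complete" bit from a controller status register. */
static bool hba_cmd_done(struct fake_hba *hba)
{
    if (hba->polls_until_done > 0) {
        hba->polls_until_done--;
        return false;
    }
    return true;
}

/* Normal path: needs the lock. Returns -1 where a real driver would spin forever. */
static int normal_write(struct fake_hba *hba)
{
    assert(hba != NULL);
    if (hba->lock_held)
        return -1;
    hba->lock_held = true;
    /* ... queue the command and wait for the completion interrupt ... */
    hba->lock_held = false;
    return 0;
}

/* Dump path: skips the lock and polls for completion, bounded by max_polls.
 * Returns 0 on completion, -1 on timeout. */
static int dump_write_polled(struct fake_hba *hba, int max_polls)
{
    assert(hba != NULL);
    for (int i = 0; i < max_polls; i++) {
        if (hba_cmd_done(hba))
            return 0;
    }
    return -1;
}
```

With lock_held set, normal_write() fails, while dump_write_polled() still
completes once the status bit reports done. The max_polls bound is there so
that a dead controller cannot hang the dump.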
Thanks and regards,
-hiren
> -----Original Message-----
> From: Matt D. Robinson [mailto:yakker@xxxxxxxxxxxxxx]
> Sent: Tuesday, November 07, 2000 9:54 AM
> To: hiren_mehta@xxxxxxxxxxx
> Cc: lkcd@xxxxxxxxxxx
> Subject: Re: dump problem while debugging scsi hba driver
>
>
> hiren_mehta@xxxxxxxxxxx wrote:
> >
> > I am trying to debug a scsi hba driver (this driver is not
> > for AIC7XXX) panic using lkcd. The dump device is on AIC7xxx.
> > Also the /root /usr etc are on AIC7xxx. Now if the scsi hba driver
> > panics, then can the linux dump to the dump device on aic7xxx ?
> >
> > -hiren
>
> If the AIC7xxx driver panics, it's going to be hit and miss as to
> whether you get a dump image or not. The best solution is to go
> through some other disk driver (such as an IDE driver) to dump.
> This especially makes sense if you're debugging your stuff. Let
> me know more specifically what you're doing, and perhaps I can
> offer some more details as to what you might be seeing.
>
> With that said ...
>
> Okay, I'm going to use this as an opportunity to open up a discussion
> on this problem. I'd like to hear people's feedback on what should
> be the right direction for the future. It's important to hear back
> something on this ...
>
> Right now, as of 2.4, we end up calling brw_kiovec() as a mechanism
> for getting our pages out to disk. While this is great and all, it
> is hardly what I call "acceptable" for dumping purposes.
>
> The problem lies in a couple of areas. First, Linus has said that
> he doesn't want raw I/O for various reasons in the kernel. While
> kiobufs are a nice feature, they hardly come close to what I call
> "raw I/O", because they don't get around problems dealing with
> buffer head locks and device driver spinlocks. In addition, Linus
> has also said to me that we shouldn't be going through the standard
> IDE driver when we dump to disk, as he doesn't trust it (his words,
> not mine).
>
> I've dealt with this problem long enough, and it is excruciatingly
> annoying. So where does this leave us, in terms of future
> development?
>
> Here's what I propose, and I'd like to hear from those of you out
> there that have an interest in this area.
>
> * I'd like to see us create a separate set of generic disk drivers
> that specifically have the purpose of writing out raw to disk.
> Drivers for IDE and SCSI initially, and then any other driver we
> need after that.
>
> * These drivers can be used for the purpose of writing out raw to
> disk, with the assumption that anyone using them must understand
> they could be clobbering data if writing to a drive where buffered
> I/O is taking place (this should only happen due to coder error,
> where a user tries to use both to the same disk partition). The
> point is they are supposed to be reliable -- speed isn't a huge
> consideration up front.
>
> * I don't want to take the path of adding "features" to the current
> set of drivers, because A) they may not be maintained properly,
> B) they will be burdened down by other opinions as to what raw I/O
> really is, and C) we can't guarantee some type of locking won't be
> thrown into the mix.
>
> The complexities are probably:
>
> 1) Inserting a duplicate driver stream into the kernel;
> 2) Writing small enough yet complete enough drivers to perform basic
> raw I/O tasks (open, read, write, close) without locking;
> 3) Getting this accepted as a standard part of the kernel (yes, I know
> Linus is against a kernel debugger, but this isn't a kernel debugger,
> and despite how awesome 'lcrash' is, it's a crash dump analyzer, not
> a kernel debugger) ... LKCD _needs_ to be part of the kernel. To
> those of us that care about RAS initiatives, it isn't an option.
> And if not LKCD, then something like it.
>
> I'd typically recommend just putting in a 'if (dumping)' mechanism to
> do lock avoidance down through the driver level, but there isn't a real
> raw I/O driver to put that in, and the best solution I see is to make
> one. I've explored this, and I've written some stuff up, but I wanted
> to get people's thoughts first before I go running down one path and
> people think we should go down some other path. Andre Hedrick showed me
> some taskfile_wait() stuff that can do really low level raw I/O, but
> I'm not sure whether it's something we can use or not.
>
> Can I get people's thoughts, please? I don't ask for much. :)
>
> --Matt
>