10/01/02 21:45, Stephen Lord wrote:
> Pascal Haakmat wrote:
>
> >10/01/02 21:10, Stephen Lord wrote:
> >
> >>Pascal Haakmat wrote:
> >>
> >>>10/01/02 16:36, Steve Lord wrote:
[snip]
> >>I don't think fs corruption would have much to do with this one, it is a
> >>purely in memory circular list. So far as I can see it is always
> >>manipulated under the correct locking. I have a box running a debug
> >>kernel sitting in a loop doing the test which Adrian says makes this
> >>happen for him. It has been going for a few hours, so far no problems.
> >>
> >
> >Well, I've been doing the same, and after 68 iterations of his script I got
> >this pair of messages, repeating every three seconds or so (no Oops or
> >anything else):
> >
> >ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
> >hdc: lost interrupt
> >
> >Looks like a kernel problem or bad hardware?
> >
> >>Would you be willing to turn on kdb? It only really makes sense if you
> >>are able to set up a serial console. There is a debugger command which
> >>will walk the complete list of inodes in the filesystem.
> >>
> >
> >The serial console won't happen, but I think it's no longer necessary
> >either. This is probably not an XFS bug, right?
> >
> Well, in memory corruption of xfs data structures should not be triggerable
> by losing an interrupt, I would like to track it down some more. Forget kdb
> if you cannot do the console - we were talking a lot of output here. I may
> ask you to run some sanity check code in the sync path - you said your oops
> was repeatable, correct?
Yes, it is, although it takes some time. I suppose if I had waited to reboot
the machine when it gave me the "hdc: lost interrupt" it might have turned
into an Oops eventually.
Right now I have printk's everywhere I think m_inext gets set, and I didn't
see it take any strange values up until the "lost interrupt" message.
Perhaps I should have waited a bit longer before rebooting. What other code
would you like me to add?
> Steve
>
> p.s. can you send me the script, I could look back in the xfs maillist,
> but I am feeling lazy, I am currently using something I wrote based on the
> brief description in this thread.
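# Create the 8 MiB source file of random data:
dd if=/dev/urandom of=01 bs=1024 count=8192

#!/bin/bash
cp -fr 01 2
# Launch 78 concurrent background copies of 01, named 80 down to 3.
for (( i=80; i!=2; i-- )) ; do
    cp -fr 01 "$i" &
    # echo $i
done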
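
A minimal sketch of a driver for repeating the test above until a log file
shows a given message, such as the "lost interrupt" line quoted earlier. The
function name run_until_match and its argument order are assumptions made for
illustration; nothing like it appears in this thread.

```shell
#!/bin/bash
# Hypothetical helper: rerun a command until PATTERN shows up in LOGFILE
# or MAX iterations have completed. All names here are illustrative.

# Usage: run_until_match MAX LOGFILE PATTERN CMD [ARGS...]
# Prints the iteration number at which PATTERN was first seen in LOGFILE
# and returns 0. Returns 1 if MAX iterations finish without a match.
run_until_match() {
    local max=$1 logfile=$2 pattern=$3
    shift 3
    local i
    for (( i = 1; i <= max; i++ )); do
        # Run one iteration of the test command; its exit status is ignored.
        "$@" || true
        # Stop as soon as the log contains the pattern.
        if grep -q -- "$pattern" "$logfile" 2>/dev/null; then
            echo "$i"
            return 0
        fi
    done
    return 1
}
```

It could be called as, say, run_until_match 100 /var/log/kern.log
'lost interrupt' ./copytest.sh, where both the log path and the script name
are assumed. A match already present in the log before the first run is
reported as iteration 1, so the check is only meaningful against a log that
starts out clean.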