Pascal Haakmat wrote:
10/01/02 21:10, Stephen Lord wrote:
Pascal Haakmat wrote:
10/01/02 16:36, Steve Lord wrote:
On Thu, 2002-01-10 at 15:57, Pascal Haakmat wrote:
ASSERT(ipointer_in == B_FALSE);
ip = ip->i_mnext;
c01ccb34: 8b 4c 24 70 mov 0x70(%esp,1),%ecx
c01ccb38: 8b 76 08 mov 0x8(%esi),%esi
c01ccb3b: 8b 91 14 01 00 00 mov 0x114(%ecx),%edx
} while (ip->i_mnext != mp->m_inodes);
[*ksymoops disassembly matches here*]
ip->i_mnext is NULL, which is never supposed to happen. The next question
is why?
FWIW, this happened just after rebooting using the XFS 1.01/RedHat boot CD
and running xfs_repair on the filesystem, which hopefully rules out an
inconsistent filesystem/filesystem errors.
I don't think fs corruption would have much to do with this one; it is a
purely in-memory circular list. So far as I can see, it is always
manipulated under the correct locking. I have a box running a debug kernel
sitting in a loop doing the test which Adrian says makes this happen for
him. It has been going for a few hours, so far no problems.
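For illustration only, here is a minimal sketch of the kind of consistency
walk one could run over such a circular list. It is not XFS code: struct
node, its next/prev fields (next plays the role of i_mnext), the function
name ring_check and the max_nodes bound are all hypothetical names invented
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode on the mount list; not the XFS struct. */
struct node {
    struct node *next;  /* plays the role of i_mnext */
    struct node *prev;  /* back link, assumed here so links can be cross-checked */
};

/*
 * Walk the ring starting at head.
 * Returns the number of nodes visited once the walk gets back to head.
 * Returns -1 if a NULL link or a mismatched back link is found, or if
 * more than max_nodes steps are taken (a loop that never reaches head).
 */
long ring_check(const struct node *head, long max_nodes)
{
    assert(head != NULL);

    const struct node *n = head;
    long count = 0;

    do {
        if (n->next == NULL)
            return -1;          /* broken ring: the case seen in the oops */
        if (n->next->prev != n)
            return -1;          /* forward and back links disagree */
        n = n->next;
        count++;
        if (count > max_nodes)
            return -1;          /* never came back to head */
    } while (n != head);

    return count;
}
```

Called on an intact three-node ring with a generous max_nodes, this returns
3; if any next pointer in the ring is NULL it returns -1 instead of
dereferencing it.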
Well, I've been doing the same, and after 68 iterations of his script I got
this pair of messages, repeating every three seconds or so (no Oops or
anything else):
ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
hdc: lost interrupt
Looks like a kernel problem or bad hardware?
Would you be willing to turn on kdb? It only really makes sense if you are
able to set up a serial console. There is a debugger command which will
walk the complete list of inodes in the filesystem.
The serial console won't happen, but I think it's no longer necessary
either. This is probably not an XFS bug, right?
Well, in-memory corruption of xfs data structures should not be triggerable
by losing an interrupt, so I would like to track it down some more. Forget
kdb if you cannot do the console - we were talking a lot of output here. I
may ask you to run some sanity check code in the sync path - you said your
oops was repeatable, correct?
Steve
p.s. can you send me the script? I could look back in the xfs maillist, but
I am feeling lazy. I am currently using something I wrote based on the
brief description in this thread.