Hmm, thanks for the info. Both dumps come from lines where ext2 is
accessing the b_data field of buffers it just got back from the cache
for the first time. It looks like the b_data field is NULL. This is a
legitimate state for b_data, but not for ext2 metadata buffers. So it
may be that a mix of ext2 and xfs running in parallel is part of the
problem here, but xfs does not mess with b_data directly, and does not
use the same buffer head based caching of metadata that ext2 does.

More to come on this one.
Steve
>
> > This is a little scary, the xfs and ext2 filesystems are on different
> > devices, otherwise I would question the raid driver, or are they? Your
> > device numbers are a little different, where are the partitions you used
> > for the tests?
>
> Both the sda and sdb devices are separate RAID volumes off the same
> controller - sda is a RAID1 (dual 9GB drives) and sdb is a RAID5 (six 73GB
> drives).
>
> Start mounting filesystem: sd(8,17)
> Ending clean XFS mount for filesystem: sd(8,17)
> Start mounting filesystem: sd(8,18)
> Ending clean XFS mount for filesystem: sd(8,18)
>
> Those device numbers are strange? They correlate to /dev/sdb1 and
> /dev/sdb2. I was using:
> TEST_DEV="/dev/sdb1"
> TEST_DIR="/tst1"
> SCRATCH_DEV="/dev/sdb2"
> SCRATCH_MNT="/tst2"
>
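For reference, the sd(8,17) and sd(8,18) pairs in the mount messages are
(major, minor) device numbers. A minimal sketch of the mapping, assuming the
standard Linux layout where SCSI disks use major 8 with 16 minors per disk
(the function name sd_name is made up for illustration):

```python
SCSI_DISK_MAJOR = 8    # first SCSI disk major number on Linux
MINORS_PER_DISK = 16   # minor 0 of each group is the whole disk, 1-15 are partitions


def sd_name(major, minor):
    """Map a (major, minor) pair on the first SCSI disk major to its /dev name."""
    if major != SCSI_DISK_MAJOR or not 0 <= minor < 256:
        raise ValueError("only major 8 with minors 0-255 is handled here")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "/dev/sd" + chr(ord("a") + disk)
    # Partition 0 means the whole disk, so no numeric suffix is appended.
    return name + (str(part) if part else "")


if __name__ == "__main__":
    for dev in [(8, 17), (8, 18)]:
        print(dev, "->", sd_name(*dev))
```

Under that assumption, minor 17 is disk 1 (sdb), partition 1, and minor 18 is
the second partition on the same disk, which matches the TEST_DEV and
SCRATCH_DEV settings quoted above.
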
> > I have not seen anything like this before, it smells a little of a buffer
> > head getting stamped on. Would it be possible to add disassemblies of the
> > ext2 functions you have hit corruption in? i.e. just run gdb on vmlinux
> > and disassemble them, otherwise it is hard to figure out which memory
> > reference was at fault.
>
> Sure, I am rerunning and will put up disassemblies as I catch them.
> Unfortunately I 'cvs update'd and recompiled without saving the old
> vmlinux. I have two new bt's up now, combined with disassemblies of the
> ext2 functions that were running when the oopses occurred, at
> http://www.csee.uq.edu.au/~chrisp/xfs/disasm-20010523. I hope I've done
> the right thing!!
>
> I'm compiling six new kernels on the machine at the moment, with a few
> combinations of highmem on/highmem off, Pentium II vs Pentium III
> selected, SMP vs Non-SMP, to see if changing any of these makes a
> difference. I think I can possibly disable the RAID controller too,
> converting it back into a standard aic7xxx - will look into that hopefully
> tomorrow too.
>
> Chris