Seth Mos wrote:
>
> At 10:29 29-6-2001 +0200, Simon Matter wrote:
> >Seth Mos wrote:
> >
> >I'm not complaining.
>
> You're not. The coffee here must have been a bit on the strong side.
>
> > > Maybe a bit harsh but the md author might just be listening on the
> > > linux-kernel list.
> >
> >Until today, it seemed to be XFS related.
>
> Oh. I thought you noticed it earlier.
I meant XFS/SoftRAID related. I read that IBM JFS does not work on
SoftRAID, so I thought maybe there is a similar problem with XFS.
>
> > > The people here understand XFS all too well, but they don't know the
> > > complete kernel inside and out (could be wrong though). Another
> > > problem is that they unfortunately don't really have the time to fix
> > > all sorts of kernel bugs.
> > >
> >
> >You're right. But on this list we have all those people using big
> >disks and raid volumes. So if the problem was somehow XFS/SoftRAID
> >related, where else could I ask?
>
> True, but a lot of them are using hardware RAID, either IDE, SCSI or
> fibre based.
I know, unfortunately; otherwise this error would have been found earlier...
>
> > > If you can produce a testcase in which you can generate corruption
> > > on the fs no matter what the fs is, that would be helpful. Are you
> > > just seeing file names being garbled, or are the files themselves
> > > also corrupt? What does xfs_repair mention when you try to check it?
> > > Does it even report anything on that matter at all, or does it
> > > decide to core dump because it's checking swiss cheese?
> >
> >It's the filenames and the files themselves. The whole block device
> >seems to be corrupted.
> >It's not XFS, not SoftRAID.
> >It's something in the IDE subsystem.
>
> What IDE controller was it? A Promise, I believe? I unfortunately don't
> have experience with those controllers except for a Promise Ultra66
> controller. You don't happen to have another IDE controller to test it
> with, do you? :)
I did, with the onboard controller of a DELL Precision220 WS. It's using
the Intel i820 chipset. I was trying RAID1 there, but at the time I just
blamed the Intel chipset.
>
> Do you also see a certain pattern in the fs corruption or is it just
> /dev/random?
I didn't investigate further, but it looks like /dev/random.
>
> > > Can you check out the CVS tree and build a kernel with that to
> > > simulate it? 2.4.5+ makes a big difference relative to 2.4.3. There
> > > have been some raid fixes recently. And 2.4.6 is approaching at a
> > > rapid pace.
> > >
> > > I'm placing my bet on the next version being 2.4.6.
> > >
> > > If you build a new kernel with the CVS tree (currently at 2.4.6-pre6)
> > > and can test if you see corruption again, that would be helpful. Then
> > > we at least know what issues remain for the 1.0.1 installer. Although
> > > shipping a 2.4.5 in 1.0.1 might not be possible.
> >
> >Just tried rawhide 2.4.5-20010613 and it's exactly the same.
>
> Crap, so much for my theory. Oh well.
>
Yes, but this time it's the Linux kernel! At least the RedHat-tuned one.
I found a way to reproduce it now. It's not the SoftRAID code, as I can
get corruption even without RAID, just heavy load on all four disks.
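
In case anyone wants to try the same thing: below is a minimal sketch
of the kind of load test I mean, one writer process per disk, each
writing a known pattern and reading it back. The /mnt/diskN mount
points and the sizes are just placeholders for my setup. Note that the
verify pass may be served from the page cache; to really check what
hit the platters, unmount and remount (or reboot) between the write
and the read-back.

/* parallel write/verify load across four disks (sketch) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>

#define NDISKS  4
#define BLOCK   (64 * 1024)   /* 64 KB chunks */
#define NBLOCKS 4096          /* 256 MB per disk */

static void stress(const char *path)
{
    unsigned char buf[BLOCK], check[BLOCK];
    long i;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror(path); exit(1); }

    /* write a deterministic pattern: block i is filled with i % 256 */
    for (i = 0; i < NBLOCKS; i++) {
        memset(buf, (unsigned char)(i & 0xff), BLOCK);
        if (write(fd, buf, BLOCK) != BLOCK) { perror("write"); exit(1); }
    }
    fsync(fd);

    /* read everything back and compare against the expected pattern */
    lseek(fd, 0, SEEK_SET);
    for (i = 0; i < NBLOCKS; i++) {
        if (read(fd, check, BLOCK) != BLOCK) { perror("read"); exit(1); }
        memset(buf, (unsigned char)(i & 0xff), BLOCK);
        if (memcmp(buf, check, BLOCK) != 0)
            printf("%s: mismatch in block %ld\n", path, i);
    }
    close(fd);
}

int main(void)
{
    int d;

    /* one writer per disk, all hammering the bus at the same time */
    for (d = 0; d < NDISKS; d++) {
        if (fork() == 0) {
            char path[64];
            snprintf(path, sizeof(path), "/mnt/disk%d/testfile", d);
            stress(path);
            exit(0);
        }
    }
    for (d = 0; d < NDISKS; d++)
        wait(NULL);
    return 0;
}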
Well, here I found something...
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=44327
Simon