
Re: Corruption on ext3 with XFS kernel



Juri Haberland wrote:
> 
> Simon Matter wrote:
> > Hi,
> >
> > I have seven IBM deathstar 60G disks here which have been replaced. I
> > want to use four of them in my home server so I wanted to do some tests
> > to make sure they don't break in the first week. I have connected four
> > drives to a Promise Ultra100TX2 IDE controller and put it into my test
> > server. The box is running kernel-2.4.18-18SGI_XFS_1.2pre3. I have
> > created a softraid5 on the disks creating a 180G device. I then created
> > an ext3 fs with default settings. Mounted the fs on /mnt/md9, mounted my
> > real server's data on /mnt/nfs. Then I used cp -a /mnt/nfs
> > /mnt/md9/nfs[n] five times creating five identical copies of the nfs
> > mounted data (/mnt/md9/nfs1, /mnt/md9/nfs2...) with a size of 28G in
> > 16500 files in each copy (143G used / 81%).
> > Then I twice ran five diffs at the same time and I was very
> > surprised by what came out:
> 
> [SNIP]
> 
> > Now, it looks like a problem with ext3. My question is, could it have
> > _something_ to do with the XFS patch in this kernel? Did anybody do
> > similar tests? Unfortunately this box has XFS root so I can't just
> > switch to a vanilla or original RedHat kernel and every test takes me
> > ~1 day. Is there a way to find out what's going wrong here?
> 
> You can try to apply the three bugfix-patches for 2.4.20/ext3 at
> http://www.zip.com.au/~akpm/linux/ext3/ (see "Updates for the 2.4.20
> kernel").

RedHat has an updated kernel too, but its fix is for a problem which
only affects ext3 filesystems mounted with data=journal; see
http://rhn.redhat.com/errata/RHBA-2002-292.html

I have now run the same test with 2.4.9-34SGI_XFS_1.1 and the error
doesn't occur. My next step is building kernel-2.4.18-19SGI_XFS_1.2pre5,
which is the newest errata kernel with XFS 1.2pre5 added.
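
For anyone who wants to repeat the compare step without reading through
diff output, it could be scripted roughly as below. This is only a sketch
and not what was actually run here (that was plain cp -a and diff). The
names file_digest and compare_trees are made up for illustration, and it
assumes only regular files need checking (symlinks and special files are
skipped).

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(reference, copy):
    """Return sorted relative paths of regular files under `reference`
    that are missing from `copy` or whose contents differ there."""
    bad = []
    for root, _dirs, files in os.walk(reference):
        for name in files:
            src = os.path.join(root, name)
            # Assumption: only regular files matter; skip symlinks etc.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, reference)
            dst = os.path.join(copy, rel)
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                bad.append(rel)
    return sorted(bad)
```

Calling compare_trees("/mnt/nfs", "/mnt/md9/nfs1") and so on for each
copy would list the files that differ from the source. An empty list
means the copy matched.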

Simon

> 
> Regards,
> Juri