Received: from oss.sgi.com (localhost [127.0.0.1])
    by oss.sgi.com (8.12.3/8.12.3) with ESMTP id g5R78unC003296
    for ; Thu, 27 Jun 2002 00:08:56 -0700
Received: (from majordomo@localhost)
    by oss.sgi.com (8.12.3/8.12.3/Submit) id g5R78u99003295
    for linux-xfs-outgoing; Thu, 27 Jun 2002 00:08:56 -0700
X-Authentication-Warning: oss.sgi.com: majordomo set sender to owner-linux-xfs@oss.sgi.com using -f
Received: from smtpzilla1.xs4all.nl (smtpzilla1.xs4all.nl [194.109.127.137])
    by oss.sgi.com (8.12.3/8.12.3) with SMTP id g5R78lnC003267
    for ; Thu, 27 Jun 2002 00:08:48 -0700
Received: from auto-nb1.xs4all.nl (213-84-127-28.adsl.xs4all.nl [213.84.127.28])
    by smtpzilla1.xs4all.nl (8.12.0/8.12.0) with ESMTP id g5R7CHTC012710;
    Thu, 27 Jun 2002 09:12:17 +0200 (CEST)
Message-Id: <4.3.2.7.2.20020627090504.03c4f4a0@pop.xs4all.nl>
X-Sender: knuffie@pop.xs4all.nl
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Date: Thu, 27 Jun 2002 09:12:03 +0200
To: Libor Vanek , linux-xfs@oss.sgi.com
From: Seth Mos
Subject: Re: XFS corruption!
In-Reply-To: <3D1AAB70.4060400@conet.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-Spam-Status: No, hits=-3.9 required=5.0 tests=IN_REP_TO,PLING version=2.20
X-Spam-Level:
Sender: owner-linux-xfs@oss.sgi.com
Precedence: bulk

At 08:06 27-6-2002 +0200, Libor Vanek wrote:
>Hi,
>we are selling Linux file servers and we wanted to use XFS. Our internal
>tests passed OK but when we installed first server at customer and
>migrated data an error occurred (usually after copying 60-100 GB). In
>/var/log/messages we saw these messages:

One of the developers had better comment on those messages.

>We tried migrating 160 GB of data using "cp -a" (over NFS), scp and rsync
>from old server using RH7.0 (ext2) - all resulted in this.
>The system is running software RAID5 (10x60GB), 1 GHz Celeron, 128 MB RAM,
>standard RH7.3 with SGI XFS modified installation CD.
>When we rebooted system everything seems OK (nothing lost) but after
>copying few more MB the same error occurs.
>We have built up 2 VERY same machines from same system image and both
>behave the very same so I think that it's some software failure.

It sounds like it. Did you build this filesystem with any special mkfs
options? What IDE controllers are you using?
Did you use the 2.4.18 kernel that came on the installer disk, or is this a
self-compiled version or even a CVS checkout?

>I have stress tested system with doing lot of "dd if=/dev/md0 of=/raid/tmp
>bs=10MB count=100" and recursive directories (about 50 levels deep) and
>nothing similar occurred. Only when copying data over network from the old
>system.

Weird. I frequently have to copy large amounts of data over the network and
it works fine, so I suspect that something in your filesystem is not right
and is causing it to fail again as soon as you try to copy to it again.

Can you check/repair the filesystem and see if it appears again?

Cheers

--
Seth
It might just be your lucky day, if you only knew.