Message-ID: <3B3B6A45.3252B37C@ch.sauter-bc.com>
Date: Thu, 28 Jun 2001 19:32:53 +0200
From: Simon Matter
Organization: Sauter AG, Basel
To: linux-xfs
Subject: XFS corruption on SoftRAID5

I don't know what else to try. I'm getting XFS filesystem corruption, which I can see when using ls or du: filenames are corrupted, and so is the file content. I ran the first test on a DELL Precision Workstation with 4 IDE drives, but I have since changed hardware, so let me explain exactly what I'm doing.

I have set up a DELL PowerEdge 1400 server (PIII 800 / 256MB / ServerWorks CNB20LE) with two U160 SCSI disks on the first onboard controller (AIC-7899). /, /boot and 2x1GB swap are on SoftRAID1 on those two disks. Up to this point there is no problem at all, using kernel PR1-PR3.
Then I installed one Promise Ultra100TX2 IDE controller and connected 4 IBM 60GB drives to it. I created one RAID5 array on those 4 drives. I then ran 'sysctl -w dev.raid.speed_limit_min=10000' to make the RAID resync faster; otherwise it takes days to sync. While it was still syncing, I created an XFS filesystem on it and mounted it on /home. Then I copied some GB of data from 2 NFS servers (still during the sync). This went slowly because of the high-priority syncing, but apart from that there was no problem at all. Later, after the sync had finished and after some reboots, I ran ls -R /home and found that the filenames were corrupt.

I know that what I did is torture for the system, but it should be able to handle such situations. Can somebody tell me what could cause the problem? Could it be the combination of RAID5 / XFS / syncing / heavy load? Unfortunately there is absolutely nothing to find in the kernel logs.

My next steps before giving up:

- I have installed a second Promise controller to make sure every IDE disk has its own channel. (Don't blame Promise, I had exactly the same problem with the i820 IDE of the DELL Precision 220.) Test is running right now...
- Configure the 4 IDE disks as RAID10 and test again. I will lose 60GB, but at least we will then know that SoftRAID5 with IDE with XFS with ... with ... is DANGEROUS(tm).
- Try ext2 on the RAID5 :-(

Thanks in advance for any help

Simon
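
For a report like the one above, it can help to measure how much of the copied data is damaged instead of spotting it by eye in ls output. The following is only a minimal sketch, not something from the original setup: it assumes the NFS source and the copy on the RAID5 are both mounted locally, and the function names (file_digest, tree_digests, compare_trees) are made up for illustration. It hashes every regular file in both trees and reports files that are missing, unexpected (for example, a garbled filename), or different in content.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks and special files
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(source_root, copy_root):
    """Return (missing, unexpected, changed) relative paths, each sorted.

    missing:    present in the source but not in the copy
    unexpected: present in the copy but not in the source
    changed:    present in both, but with different content
    """
    src = tree_digests(source_root)
    dst = tree_digests(copy_root)
    missing = sorted(set(src) - set(dst))
    unexpected = sorted(set(dst) - set(src))
    changed = sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p])
    return missing, unexpected, changed
```

Calling compare_trees() with the NFS mount point as the first argument and /home as the second would give three lists. A corrupted filename would typically show up as one "missing" entry plus one "unexpected" entry, while corrupted content shows up under "changed". Running it once right after the copy and again after the resync and reboots would show at which stage the damage appears.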