Message-Id: <200106282148.f5SLmfw24451@jen.americas.sgi.com>
X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4
To: Simon Matter
cc: linux-xfs
Subject: Re: XFS corruption on SoftRAID5
In-Reply-To: Message from Simon Matter of "Thu, 28 Jun 2001 19:32:53 +0200." <3B3B6A45.3252B37C@ch.sauter-bc.com>
Date: Thu, 28 Jun 2001 16:48:41 -0500
From: Steve Lord
Sender: owner-linux-xfs@oss.sgi.com
Precedence: bulk

> I don't know what to try anymore...

First rule of bug reporting, which version of the kernel are you using?
Oh, and what type of NFS servers? Details, details please.

Steve

>
> I'm getting XFS filesystem corruption, I can see this when using ls or
> du. Filenames are corrupted and also their content. I was doing the
> first test on a DELL Precision Workstation with 4 IDE drives but I have
> changed now. So let me explain what exactly I'm doing.
>
> I have set up a DELL PowerEdge 1400 Server, PIII800 / 256MB /
> ServerWorks CNB20LE. Two U160-SCSI disks on the first onboard controller
> (AIC-7899). /, /boot and 2x1GB swap are on SoftRAID1 on those two disks.
> Up to here, no problem at all, using kernel PR1-PR3. Then I installed
> one Promise Ultra100TX2 IDE controller, connecting 4 IBM 60GB drives. I
> created 1 RAID5 on those 4 drives. I then do a 'sysctl -w
> dev.raid.speed_limit_min=10000' to resync the raid faster, otherwise it
> takes days to sync. Then while syncing, I create an XFS filesystem on it
> and mount it on /home. Now I copy some GB of data from 2 NFS servers
> (while it is still syncing). This is going slow because of high priority
> syncing, but besides that, no problem at all. Later, after the sync
> has finished and after some reboots, I just made an ls -R /home and
> found out that the filenames were corrupt. I know that what I did is
> torture for the system, but it should be able to handle such situations.
> Can somebody tell me what could cause the problem? Could it be the
> combination of RAID5 / XFS / syncing / heavy load? Unfortunately there
> is absolutely nothing to find in the kernel logs.
>
> My next steps before giving up:
> - I have installed a second Promise controller to make sure every IDE
> disk has its own channel. (Don't blame Promise, I had exactly the same
> prob with the i820 IDE of the DELL Precision 220). Test is running right
> now...
>
> - Configure the 4 IDE disks as RAID10 and test again. I will lose
> 60GB, but at least we then know that SoftRAID5 with IDE with XFS with
> ... with ... is DANGEROUS(tm).
>
> - Try with ext2 on the RAID5 :-(
>
> Thanks in advance for any help
>
> Simon
>