Received: (from majordomo@localhost) by oss.sgi.com (8.11.3/8.11.3) id f34N47203797 for linux-xfs-outgoing; Wed, 4 Apr 2001 16:04:07 -0700
Received: from dr-zaius.ximian.com (dr-zaius.ximian.com [141.154.95.23]) by oss.sgi.com (8.11.3/8.11.3) with SMTP id f34N46M03794 for ; Wed, 4 Apr 2001 16:04:07 -0700
Received: (qmail 7160 invoked by uid 1021); 4 Apr 2001 23:03:46 -0000
Date: Wed, 4 Apr 2001 19:03:46 -0400
From: Michael MacDonald
To: linux-xfs@oss.sgi.com
Subject: reproducible wedge/corruption
Message-ID: <20010404190346.D6693@dr-zaius.ximian.com>
Reply-To: mjmac@ximian.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.12i
X-Sender: mjmac@ximian.com
Sender: owner-linux-xfs@oss.sgi.com
Precedence: bulk

Hi all. I've been lurking for a while, trying to get a feel for how XFS is working for people. It seems to be working well overall, which is a little disappointing for me because I'm not having much luck with it. Perhaps I'm not living right, or something.

Anyhow, I have two machines with similar hardware on which I can pretty reliably cause the system to lock solid. I've tried different kernels (linux-xfs-beta vs. linux-xfs; 2.4.1 -> 2.4.3), but the problem doesn't go away. The kernel is configured with the bare minimum to run an IDE-based server.

To try to reproduce the error that users were running into, I wrote a simple script to pound on the fs. It hasn't made it past the first or second iteration on an XFS partition, because the machine will hang. Can't log in at the console or anything. I tried changing the logbsize parameter as described in the FAQ, and got better performance until the machine hung like it did before. I gave up after several days of tweaking and reformatted ext2. I would wonder if I have something misconfigured except for the fact that the script runs through all 30 iterations on an ext2 filesystem without a hang.

I'll post the script if requested, but basically all it does is:

    create 10000 small files in one directory
    make 10000 symlinks in another directory to the files in the first
    remove the second directory
    remove the first directory

It seems to be hanging mostly during the removal stage, although I have seen it hang during file creation. On two of the hangs, I have seen some nasty file corruption.

The specifics of the system are as follows:

    Dell PowerEdge 350
    Celeron 600
    128MB RAM
    Intel 440BX chipset

I've got another 350 with

    PIII 700
    512MB RAM

They both seem to exhibit similar behavior wrt XFS, although I haven't pounded on the second one quite as much.

If anyone has any ideas, or would like more details, please let me know. I would really like to get XFS on these things, as they've got a lot of disk between the two of them.

--
Michael MacDonald
Systems Monkey
mjmac@ximian.com
Ximian, Inc.
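
[Editor's sketch] The original script was not posted, so the following is only a rough reconstruction of the four steps listed in the message, repeated for the 30 iterations mentioned. The function name `pound`, the directory names `files` and `links`, the file naming scheme, and the 64-byte file size are all assumptions made for illustration; the message only says "small files".

```python
import os
import shutil
import sys
import tempfile


def pound(base, count=10000, iterations=30):
    """Run the create/symlink/remove cycle under `base`.

    Returns the number of iterations that completed.
    """
    files_dir = os.path.join(base, "files")   # hypothetical name
    links_dir = os.path.join(base, "links")   # hypothetical name
    done = 0
    for _ in range(iterations):
        os.mkdir(files_dir)
        os.mkdir(links_dir)

        # Step 1: create `count` small files in one directory.
        for i in range(count):
            path = os.path.join(files_dir, f"f{i:05d}")
            with open(path, "w") as fh:
                fh.write("x" * 64)  # assumed size; the message gives none

        # Step 2: make `count` symlinks in another directory,
        # each pointing at a file in the first.
        for i in range(count):
            name = f"f{i:05d}"
            os.symlink(os.path.join(files_dir, name),
                       os.path.join(links_dir, name))

        # Step 3: remove the second (symlink) directory.
        shutil.rmtree(links_dir)

        # Step 4: remove the first (file) directory.
        shutil.rmtree(files_dir)

        done += 1
    return done


if __name__ == "__main__":
    # Pass a directory on the filesystem under test; falls back to a
    # temporary directory if none is given.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    print("completed iterations:", pound(target))
```

Pointing the script at a directory on the filesystem under test (for example a mount point of the XFS partition) and then at an ext2 one would give the same kind of comparison the message describes.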