On Tue, Nov 29, 2005 at 01:36:11AM +0100, Andi Kleen wrote:
>
> I just found a new exciting way to break XFS. Or rather the
> version that's in 2.6.13. But it might be an interesting try anyway.
So I just ran this on a handy Altix I had lying about between other
testing ;)
It's currently not running 2.6.13, but I'm going to run this again
against 2.6.14 just to make sure it's not a regression. I suspect that
the problem is that you're generating a highly fragmented file
which requires high-order memory allocations to hold the extent list.
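To get a feel for the scale involved, here's a tiny userspace sketch (not
the actual XFS code; the ~16 bytes per in-core extent record is an
assumption based on the size of xfs_bmbt_rec_t) of how quickly a single
contiguous extent-list buffer outgrows one 4k page as the file fragments:

#include <stdio.h>

/* Assumed size of one in-core extent record (two 64-bit words, as in
 * xfs_bmbt_rec_t) - purely illustrative. */
#define EXTENT_REC_SIZE 16UL

int main(void)
{
    unsigned long counts[] = { 256, 1024, 10000, 100000 };
    unsigned long i;

    for (i = 0; i < sizeof(counts) / sizeof(counts[0]); i++) {
        unsigned long bytes = counts[i] * EXTENT_REC_SIZE;

        /* If the whole list has to be one contiguous buffer, anything
         * over a single page means a multi-page (high-order) allocation. */
        printf("%8lu extents -> %8lu bytes (%4lu x 4k pages)\n",
               counts[i], bytes, (bytes + 4095) / 4096);
    }
    return 0;
}

Past a few hundred extents the list no longer fits in one 4k page, and it
only gets worse from there.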
> You likely need a 64bit system for this.
Check.
> I created a large holey file on XFS with
>
> # (the funny number is about the maximum that ext2 supports)
> dd if=/dev/zero of=LARGE bs=1 count=4096 seek=$[8*1024*1024*1024*1024-2*4096]
> losetup /dev/loop0 LARGE
> mkfs.ext2 /dev/loop0
>
> now wait until it has written a few thousand of its inode tables
> and then press ctrl-c. mkfs.ext2 will close the loop device, which
> causes a sync. And then it will hang for a very long time
> until loop starts spewing out IO errors and then it deadlocks completely.
> The mkfs process is busy waiting for its sync, loop0 does:
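Just to spell out what that dd invocation produces, here's a minimal C
equivalent (same LARGE file name and seek offset as above; this is only
an illustration, not something I actually ran): it writes a single 4k
block of zeroes just short of the 8TiB mark, so everything before it is
a hole.

#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Same offset as the dd command: 8TiB minus two 4k blocks. */
    off_t offset = (off_t)8 * 1024 * 1024 * 1024 * 1024 - 2 * 4096;
    char block[4096] = { 0 };
    int fd = open("LARGE", O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* One 4k write at the far end; all the space below it stays unallocated. */
    if (pwrite(fd, block, sizeof(block), offset) != (ssize_t)sizeof(block)) {
        perror("pwrite");
        return 1;
    }
    close(fd);
    return 0;
}

mkfs.ext2 then scribbles block group metadata all over that hole via the
loop device, which is where the fragmentation comes from.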
First I tried aborting the mkfs but I didn't see any hangs, so I let
mkfs.ext2 run to completion - it dirtied all of memory (~23GiB) and
then it spent most of the time writing to disk at 300-400MB/s:
budgie:~ # vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
 7  0 416096  27760 17509664 5153888   45    6    66 397116 5725 11054  0 45 46  9
 3  0 416096  28096 17299648 5363776   51   10    98 393118 5638  6693  1 46 43 10
10  0 416096  27984 16831344 5831968   26    0    38 440788 5627 11416  0 50 42  7
 2  0 416096  21712 17438816 5241232   22   19   246 362036 5737  8968  1 38 51 10
 7  0 416096  27312 18031552 4631792   67    0   193 347848 5574  7725  0 37 51 11
It also kept more than a million pages under writeback for the entire
run. As the number of inode tables written out increased, the writeout
rate slowed a bit. When mkfs completed, a sync was done and everything
was fine.
But I can now guess why a smaller machine might hang on this test - you're
creating a file with a massive number of extents:
budgie:/usr/local/aspen/loadgen # xfs_bmap -v /mnt/dgc/stripe/LARGE |wc -l
217708
budgie:/usr/local/aspen/loadgen #
Which is what makes me think you're hitting problems with high-order
memory allocations at a substantially lower number of extents than
this. You're testing on a machine with 4k pages, right?
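To put rough numbers on that last point - a back-of-the-envelope sketch
only, again assuming ~16 bytes per in-core extent record, a single
contiguous allocation for the whole list, and that this Altix is running
16k pages (none of which I'm asserting is exactly what the 2.6.13 code
does):

#include <stdio.h>

#define EXTENT_REC_SIZE 16UL   /* assumed bytes per in-core extent record */

/* Smallest power-of-two number of pages (the allocation "order") that
 * covers 'bytes' for a given page size. */
static unsigned int alloc_order(unsigned long bytes, unsigned long page_size)
{
    unsigned long pages = (bytes + page_size - 1) / page_size;
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

int main(void)
{
    unsigned long nextents = 217708;    /* roughly the count from xfs_bmap above */
    unsigned long bytes = nextents * EXTENT_REC_SIZE;

    printf("extent list: %lu bytes\n", bytes);
    printf("order on  4k pages: %u\n", alloc_order(bytes, 4096));
    printf("order on 16k pages: %u\n", alloc_order(bytes, 16384));
    return 0;
}

A 4k page holds a quarter of the extent records a 16k page does, so a
4k-page box has to go to multi-page allocations at a quarter of the
extent count this machine needs.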
Cheers,
Dave.
--
Dave Chinner
R&D Software Engineer
SGI Australian Software Group