XFS performance oddity
Dave Chinner
david at fromorbit.com
Tue Nov 23 21:15:24 CST 2010
On Wed, Nov 24, 2010 at 11:50:03AM +1100, Nick Piggin wrote:
> On Wed, Nov 24, 2010 at 07:58:04AM +1100, Dave Chinner wrote:
> > On Tue, Nov 23, 2010 at 11:24:49PM +1100, Nick Piggin wrote:
> > > Hi,
> > >
> > > Running parallel fs_mark (0 size inodes, fsync on close) on a ramdisk
> > > ends up with XFS in funny patterns.
> > >
> > > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> > >  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
> > > 24 1 6576 166396 252 393676 132 140 16900 80666 21308 104333 1 84 14 1
> > > 21 0 6712 433856 256 387080 100 224 9152 53487 13677 53732 0 55 45 0
> > > 2 0 7068 463496 248 389100 0 364 2940 17896 4485 26122 0 33 65 2
> > > 1 0 7068 464340 248 388928 0 0 0 0 66 207 0 0 100 0
> > > 0 0 7068 464340 248 388928 0 0 0 0 79 200 0 0 100 0
> > > 0 0 7068 464544 248 388928 0 0 0 0 65 199 0 0 100 0
> > > 1 0 7068 464748 248 388928 0 0 0 0 79 201 0 0 100 0
> > > 0 0 7068 465064 248 388928 0 0 0 0 66 202 0 0 100 0
> > > 0 0 7068 465312 248 388928 0 0 0 0 80 200 0 0 100 0
> > > 0 0 7068 465500 248 388928 0 0 0 0 65 199 0 0 100 0
> > > 0 0 7068 465500 248 388928 0 0 0 0 80 202 0 0 100 0
> > > 1 0 7068 465500 248 388928 0 0 0 0 66 203 0 0 100 0
> > > 0 0 7068 465500 248 388928 0 0 0 0 79 200 0 0 100 0
> > > 23 0 7068 460332 248 388800 0 0 1416 8896 1981 7142 0 1 99 0
> > > 6 0 6968 360248 248 403736 56 0 15568 95171 19438 110825 1 79 21 0
> > > 23 0 6904 248736 248 419704 392 0 17412 118270 20208 111396 1 82 17 0
> > > 9 0 6884 266116 248 435904 128 0 14956 79756 18554 118020 1 76 23 0
> > > 0 0 6848 219640 248 445760 212 0 9932 51572 12622 76491 0 60 40 0
> > >
> > > Got a dump of sleeping tasks. Any ideas?
> >
> > It is stuck waiting for log space to be freed up. Generally this is
> > caused by log IO completion not occurring or an unflushable object
> > preventing the tail from being moved forward. What:
>
> Yeah it's strange, it seems like it hits some timeout or gets kicked
> along by background writeback or something. Missed wakeup somewhere?
No idea yet.
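
If you want to confirm that it really is log space being waited on,
watching the log and AIL push counters while the stall is in progress
should show whether the tail is being moved at all. A rough sketch,
assuming the xfsstats counters are built into your kernel:

# sample the log and AIL-pushing counters once a second; if they stop
# moving during the stall, nothing is pushing the log tail forward
watch -n 1 'egrep "^(log|push_ail)" /proc/fs/xfs/stat'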
> > - is the output of mkfs.xfs?
>
> meta-data=/dev/ram0              isize=256    agcount=16, agsize=65536 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=1048576, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal log           bsize=4096   blocks=16384, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
Ok, small log, small AGs.
> > - are your mount options?
>
> mount -o delaylog,logbsize=262144,nobarrier /dev/ram0 mnt
>
> > - is the fs_mark command line?
>
> ../fs_mark -S1 -k -n 1000 -L 100 -s 0 -d scratch/0 -d scratch/1 -d
> scratch/2 -d scratch/3 -d scratch/4 -d scratch/5 -d scratch/6 -d
> scratch/7 -d scratch/8 -d scratch/9 -d scratch/10 -d scratch/11 -d
> scratch/12 -d scratch/13 -d scratch/14 -d scratch/15 -d scratch/16 -d
> scratch/17 -d scratch/18 -d scratch/19 -d scratch/20 -d scratch/21 -d
> scratch/22 -d scratch/23
> for f in scratch/* ; do rm -rf $f & done ; wait
Ok, so you are effectively doing a concurrent synchronous create of
2.4M zero byte files (24 directories x 100 iterations x 1000 files
per iteration).
BTW, how many CPU cores does your machine have? If it's more than 8,
then you're probably getting a fair bit of serialisation on the
per-AG structures. I normally use agcount = num_cpus * 2 for
scalability testing when running one load thread per CPU.
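
As a rough sketch of what I mean, something along these lines
(getconf used here just to pick up the online CPU count):

# two allocation groups per CPU so parallel creates don't all
# serialise on the same per-AG locks
mkfs.xfs -f -d agcount=$((2 * $(getconf _NPROCESSORS_ONLN))) /dev/ram0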
> Ran it again, and yes it has locked up for a long long time, it seems
> to be in the rm phase, but I think I've seen a similar stall (although not
> so long) in the fs_mark phase too.
Ok, I've just reproduced a couple of short hangs (a few seconds)
during the rm phase so I should be able to track it down.
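
If you catch one of the long stalls before I do, a dump of the
blocked task stacks at that point would be handy. A minimal way to
grab one, assuming sysrq is enabled on your kernel:

# dump every task stuck in uninterruptible sleep to the kernel log,
# then pull the traces out of the ring buffer
echo w > /proc/sysrq-trigger
dmesg | tail -n 200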
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com