about the xfs performance
Songbo Wang
hack.coo at gmail.com
Tue Apr 12 09:07:45 CDT 2016
Hi Dave,
Thank you for your reply. I did some tests today, described as follows:
I deleted the existing test file and redid the test: fio -ioengine=libaio
-bs=4k -direct=1 -thread -rw=randwrite -size=50G -filename=/mnt/test
-name="EBS 4KB randwrite test" -iodepth=64 -runtime=60
The iops result was about 19k± (per second). I kept running fio against this
test file until it was completely filled, then ran the same test case again;
this time the result was about 210k± (per second). (The results I mentioned
yesterday were incomplete: I had reused the same test file several times, and
the results degraded because the file had not yet been filled completely.)
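(To reproduce the "fully filled" state without repeating the randwrite job,
a sequential prefill pass should work just as well; this command is only an
illustration, not what I actually ran:
fio -ioengine=libaio -bs=1M -direct=1 -thread -rw=write -size=50G
-filename=/mnt/test -name="prefill"
In my case the file was filled simply by rerunning the randwrite job above
until it reached the full 50G.)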
I tried remaking the filesystem with the following command to increase the
internal log size, the inode size, and the allocation group count:
mkfs.xfs /dev/hioa2 -f -n size=64k -i size=2048,align=1 -d agcount=2045 -l
size=512m
but it did not improve the result.
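(For reference, the geometry that actually took effect can be double-checked
with xfs_info on the mount point, for example:
xfs_info /mnt        # prints agcount, inode size and the internal log size
The /mnt path here is just the mount point from my setup above.)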
Do you have any suggestions for dealing with this problem?
I really appreciate your feedback.
songbo
2016-04-12 7:10 GMT+08:00 Dave Chinner <david at fromorbit.com>:
> On Mon, Apr 11, 2016 at 10:14:06PM +0800, Songbo Wang wrote:
> > Hi xfsers:
> >
> > I got some troubles on the performance of xfs.
> > The environment is ,
> > xfs version is 3.2.1,
> > centos 7.1,
> > kernel version:3.10.0-229.el7.x86_64.
> > pcie-ssd card,
> > mkfs: mkfs.xfs /dev/hioa2 -f -n size=64k -i size=512 -d agcount=40 -l size=1024m.
> > mount: mount /dev/hioa2 /mnt/ -t xfs -o rw,noexec,nodev,noatime,nodiratime,nobarrier,discard,inode64,logbsize=256k,delaylog
> > I use the following command to test iops: fio -ioengine=libaio -bs=4k
> > -direct=1 -thread -rw=randwrite -size=50G -filename=/mnt/test -name="EBS
> > 4KB randwrite test" -iodepth=64 -runtime=60
> > The results are normal at the beginning, at about 210k±, but a few
> > seconds later they drop to 19k±.
>
> Looks like the workload runs out of log space due to all the
> allocation transactions being logged, which then causes new
> transactions to start tail pushing the log to flush dirty metadata.
> This is needed to make more space in the log for incoming dio
> writes that require allocation transactions. This will block IO
> submission until there is space available in the log.
>
> Let's face it, all that test does is create a massively fragmented
> 50GB file, so you're going to have a lot of metadata to log. Do the
> maths - if it runs at 200kiops for a few seconds, it's created a
> million extents.
>
> And it's doing random insert on the extent btree, so
> it's repeatedly dirtying the entire extent btree. This will trigger
> journal commits quite frequently as this is a large amount of
> metadata that is being dirtied. e.g. at 500 extent records per 4k
> block, a million extents will require 2000 leaf blocks to store them
> all. That's 80MB of metadata per million extents that this workload
> is generating and repeatedly dirtying.
>
> Then there's also other metadata, like the free space btrees, that
> is also being repeatedly dirtied, etc, so it would not be unexpected
> to see a workload like this on high IOPS devices allocating 100MB of
> metadata every few seconds and the amount being journalled steadily
> increasing until the file is fully populated.
>
> > I did a second test,
> > umount the /dev/hioa2,
> > fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite
> > -filename=/dev/hioa2 -name="EBS 8KB randwrite test" -iodepth=64
> -runtime=60
> > The results were normal; the iops stayed at about 210k± the whole time.
>
> That's not an equivalent test - it's being run direct to the block
> device, not to a file on the filesystem on the block device, and so
> you won't see artifacts that are a result of creating worst case
> file fragmentation....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david at fromorbit.com
>