
Re: hole punching performance

To: "Bradley C. Kuszmaul" <kuszmaul@xxxxxxxxx>
Subject: Re: hole punching performance
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 3 Jan 2013 16:51:01 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <CAKSyJXf5bs4wfM4k-o+1p6zOLyH46U4eorFDS8Zzsb5W7uvwPg@xxxxxxxxxxxxxx>
References: <CAKSyJXf66H2U-BF-aYnSr2fF24_6LJw6swOx1RhUc_3Eqayaiw@xxxxxxxxxxxxxx> <20130102232706.GD3120@dastard> <CAKSyJXf5bs4wfM4k-o+1p6zOLyH46U4eorFDS8Zzsb5W7uvwPg@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Jan 02, 2013 at 08:45:22PM -0500, Bradley C. Kuszmaul wrote:
> Thanks for the help.  I got results similar to yours.  However, the
> hole punching is much faster if you create the file with fallocate
> than if you actually write some data into it.
>  fallocate and then hole-punch is about 1us per hole punch.
>  write and then hole-punch is about 90us per hole punch.

No surprise - after a write the hole punch has a lot more to do.
I modified the test program to not use O_TRUNC, then ran:

$ /usr/sbin/xfs_io -f -c "truncate 0" -c "pwrite -b 1m 0 20g" /mnt/scratch/blah
wrote 21474836480/21474836480 bytes at offset 0
20.000 GiB, 20480 ops; 0:00:30.00 (675.049 MiB/sec and 675.0491 ops/sec)
$ sync
$ time ./a.out

real    0m1.664s
user    0m0.000s
sys     0m1.656s

Why? perf top shows the reason pretty quickly:

 12.80%  [kernel]  [k] free_hot_cold_page
 10.62%  [kernel]  [k] block_invalidatepage
 10.62%  [kernel]  [k] _raw_spin_unlock_irq
  8.35%  [kernel]  [k] kmem_cache_free
  6.07%  [kernel]  [k] _raw_spin_unlock_irqrestore
  3.65%  [kernel]  [k] put_page
  3.51%  [kernel]  [k] __wake_up_bit
  3.27%  [kernel]  [k] find_get_pages
  2.84%  [kernel]  [k] get_pageblock_flags_group
  2.66%  [kernel]  [k] cancel_dirty_page
  2.09%  [kernel]  [k] truncate_inode_pages_range

After a write, the hole punch also has to punch holes in the page
cache, i.e. invalidate and free the cached pages covering each
range. So, let's rule that out by dropping the page cache
separately, and see just what the extent manipulation overhead is:

$ rm -f /mnt/scratch/blah
$ /usr/sbin/xfs_io -f -c "truncate 0" -c "pwrite -b 1m 0 20g" /mnt/scratch/blah
wrote 21474836480/21474836480 bytes at offset 0
20.000 GiB, 20480 ops; 0:00:27.00 (749.381 MiB/sec and 749.3807 ops/sec)
$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ time ./a.out

real    0m0.347s
user    0m0.000s
sys     0m0.332s

Which is the same as what the fallocate/punch method gives....
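
The test program (./a.out) is not included in this message. As a rough sketch only, a punch-timing loop of the kind being measured might look like the following on Linux, assuming fallocate(2) with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE. The function names fill_file and punch_holes, and whatever hole size and stride a caller passes, are illustrative assumptions and are not taken from the original program.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write `bytes` bytes of non-zero data from offset 0,
 * roughly what the xfs_io pwrite step does. Returns 0 on success, -1 on error. */
int fill_file(int fd, off_t bytes)
{
    char buf[65536];
    memset(buf, 0xab, sizeof buf);
    for (off_t off = 0; off < bytes; ) {
        size_t want = (bytes - off) < (off_t)sizeof buf
                          ? (size_t)(bytes - off) : sizeof buf;
        ssize_t n = pwrite(fd, buf, want, off);
        if (n <= 0)
            return -1;
        off += n;
    }
    return 0;
}

/* Hypothetical helper: punch `count` holes of `hole_len` bytes, one every
 * `stride` bytes starting at offset 0, keeping the file size unchanged.
 * Returns the average wall-clock seconds per punch, or -1.0 if any
 * fallocate() call fails (errno is left as set by fallocate). */
double punch_holes(int fd, off_t hole_len, off_t stride, long count)
{
    struct timespec t0, t1;

    assert(hole_len > 0 && stride >= hole_len && count > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < count; i++) {
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      (off_t)i * stride, hole_len) != 0)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return ((double)(t1.tv_sec - t0.tv_sec) +
            (double)(t1.tv_nsec - t0.tv_nsec) / 1e9) / (double)count;
}
```

A caller would open the file without O_TRUNC, then call something like punch_holes(fd, 4096, 8192, 1000). Running it once right after fill_file and once after dropping the page cache should show the kind of difference measured above, though the actual numbers depend on the filesystem and hardware.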

> But 90us is likely to be plenty fast, so it's looking good.  ( I'll
> try to track down why my other program was slow.)

If you open the file O_SYNC or O_DSYNC, then you'll still get
synchronous behaviour....
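
As an illustration of the point above, here is a minimal sketch of opening the file with O_DSYNC or O_SYNC (and without O_TRUNC) before punching. The helper names open_sync_no_trunc and punch_sync are hypothetical, and the use of O_CREAT with mode 0644 is an assumption made only so the sketch works on a file that does not exist yet.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read/write with synchronous semantics.
 * data_only != 0 selects O_DSYNC, otherwise O_SYNC. O_TRUNC is deliberately
 * not passed, so existing data and extents are left in place. */
int open_sync_no_trunc(const char *path, int data_only)
{
    assert(path != NULL);
    return open(path, O_RDWR | O_CREAT | (data_only ? O_DSYNC : O_SYNC), 0644);
}

/* Hypothetical helper: punch one hole through an O_DSYNC descriptor.
 * Returns 0 on success, -1 on error (errno set by open or fallocate). */
int punch_sync(const char *path, off_t offset, off_t len)
{
    int fd = open_sync_no_trunc(path, 1);
    if (fd < 0)
        return -1;
    int rc = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                       offset, len);
    close(fd);
    return rc;
}
```

Timing punch_sync against the same punch on a descriptor opened without these flags is one way to see how much the synchronous behaviour costs on a given setup.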


Dave Chinner
