On Wed, Nov 13, 2013 at 05:40:59PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Large filesystems or high AG count filesystems generally have more
> inherent parallelism in the backing storage. We should make use of
> this by default to speed up repair times. Make xfs_repair use an
> "auto-stride" configuration on filesystems with enough AGs to be
> considered "multidisk" configurations.
>
> The difference in elapsed time to repair a 100TB filesystem with
> 50 million inodes in it with all metadata in flash is:
>
>              Time    IOPS    BW       CPU     RAM
> vanilla:     2719s   2900    55MB/s   25%     0.95GB
> patched:      908s   varied  varied   varied  2.33GB
>
> With the patched kernel, there were IO peaks of over 1.3GB/s during
> AG scanning. Some phases now run at noticeably different speeds:
> - phase 3 ran at ~180% CPU, 18,000 IOPS and 130MB/s,
> - phase 4 ran at ~280% CPU, 12,000 IOPS and 100MB/s
> - the other phases were similar to the vanilla repair.
>
> Memory usage is increased because of the increased buffer cache
> size as a result of concurrent AG scanning using it.
Looks good as long as you stick to your promise to clean up the magic
numbers later.
Reviewed-by: Christoph Hellwig <hch@xxxxxx>