On Thu, Jul 02, 2015 at 08:47:53AM -0400, Jan Tulak wrote:
>
>
> ----- Original Message -----
> > From: "Brian Foster" <bfoster@xxxxxxxxxx>
> > To: "Jan ÅulÃk" <jtulak@xxxxxxxxxx>
> > Cc: "Dave Chinner" <dchinner@xxxxxxxxxx>, xfs@xxxxxxxxxxx
> > Sent: Thursday, June 25, 2015 9:37:48 PM
> > Subject: Re: [PATCH 01/17] xfsprogs: use common code for multi-disk
> > detection
> >
> > On Fri, Jun 19, 2015 at 01:01:50PM +0200, Jan ÅulÃk wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > Both xfs_repair and mkfs.xfs need to agree on what is a "multidisk:
> > > configuration - mkfs for determining the AG count of the filesystem,
> > > repair for determining how to automatically parallelise it's
> > > execution. This requires a bunch of common defines that both mkfs
> > > and reapir need to share.
> > >
> > > In fact, most of the defines in xfs_mkfs.h could be shared with
> > > other programs (i.e. all the defaults mkfs uses) and so it is
> > > simplest to move xfs_mkfs.h to the shared include directory and add
> > > the new defines to it directly.
> > >
> > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > > Signed-off-by: Jan ÅulÃk <jtulak@xxxxxxxxxx>
> > > ---
> > > include/Makefile | 8 ++++-
> > > include/xfs_mkfs.h | 98
> > > +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > mkfs/Makefile | 2 +-
> > > mkfs/xfs_mkfs.c | 56 +++++++++++++++---------------
> > > mkfs/xfs_mkfs.h | 89 ------------------------------------------------
> > > repair/xfs_repair.c | 45 ++++++++++++++++++++++--
> > > 6 files changed, 178 insertions(+), 120 deletions(-)
> > > create mode 100644 include/xfs_mkfs.h
> > > delete mode 100644 mkfs/xfs_mkfs.h
> > >
> > > diff --git a/include/Makefile b/include/Makefile
> > > index 70e43a0..3269ec3 100644
> > > --- a/include/Makefile
> > > +++ b/include/Makefile
> > > @@ -26,9 +26,15 @@ QAHFILES = libxfs.h libxlog.h \
> > > xfs_inode.h \
> > > xfs_log_recover.h \
> > > xfs_metadump.h \
> > > + xfs_mkfs.h \
> > > xfs_mount.h \
> > > + xfs_quota_defs.h \
> > > + xfs_sb.h \
> > > + xfs_shared.h \
> > > xfs_trace.h \
> > > - xfs_trans.h
> > > + xfs_trans.h \
> > > + xfs_trans_resv.h \
> > > + xfs_trans_space.h
> > >
> > > HFILES = handle.h jdm.h xqm.h xfs.h
> > > HFILES += $(PKG_PLATFORM).h
> > > diff --git a/include/xfs_mkfs.h b/include/xfs_mkfs.h
> > > new file mode 100644
> > > index 0000000..3388f6d
> > > --- /dev/null
> > > +++ b/include/xfs_mkfs.h
> > > @@ -0,0 +1,98 @@
> > > +/*
> > > + * Copyright (c) 2000-2001,2004-2005 Silicon Graphics, Inc.
> > > + * All Rights Reserved.
> > > + *
> > > + * This program is free software; you can redistribute it and/or
> > > + * modify it under the terms of the GNU General Public License as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope that it would be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License
> > > + * along with this program; if not, write the Free Software Foundation,
> > > + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > > + */
> > > +#ifndef __XFS_MKFS_H__
> > > +#define __XFS_MKFS_H__
> > > +
> > > +#define XFS_DFL_SB_VERSION_BITS \
> > > + (XFS_SB_VERSION_NLINKBIT | \
> > > + XFS_SB_VERSION_EXTFLGBIT | \
> > > + XFS_SB_VERSION_DIRV2BIT)
> > > +
> > > +#define XFS_SB_VERSION_MKFS(crc,ia,dia,log2,attr1,sflag,ci,more) (\
> > > + ((crc)||(ia)||(dia)||(log2)||(attr1)||(sflag)||(ci)||(more)) ? \
> > > + (((crc) ? XFS_SB_VERSION_5 : XFS_SB_VERSION_4) | \
> > > + ((ia) ? XFS_SB_VERSION_ALIGNBIT : 0) | \
> > > + ((dia) ? XFS_SB_VERSION_DALIGNBIT : 0) | \
> > > + ((log2) ? XFS_SB_VERSION_LOGV2BIT : 0) | \
> > > + ((attr1) ? XFS_SB_VERSION_ATTRBIT : 0) | \
> > > + ((sflag) ? XFS_SB_VERSION_SECTORBIT : 0) | \
> > > + ((ci) ? XFS_SB_VERSION_BORGBIT : 0) | \
> > > + ((more) ? XFS_SB_VERSION_MOREBITSBIT : 0) | \
> > > + XFS_DFL_SB_VERSION_BITS | \
> > > + 0 ) : XFS_SB_VERSION_1 )
> > > +
> > > +#define XFS_SB_VERSION2_MKFS(crc, lazycount, attr2, projid32bit, parent,
> > > \
> > > + ftype) (\
> > > + ((lazycount) ? XFS_SB_VERSION2_LAZYSBCOUNTBIT : 0) | \
> > > + ((attr2) ? XFS_SB_VERSION2_ATTR2BIT : 0) | \
> > > + ((projid32bit) ? XFS_SB_VERSION2_PROJID32BIT : 0) | \
> > > + ((parent) ? XFS_SB_VERSION2_PARENTBIT : 0) | \
> > > + ((crc) ? XFS_SB_VERSION2_CRCBIT : 0) | \
> > > + ((ftype) ? XFS_SB_VERSION2_FTYPE : 0) | \
> > > + 0 )
> > > +
> > > +#define XFS_DFL_BLOCKSIZE_LOG 12 /* 4096 byte blocks */
> > > +#define XFS_DINODE_DFL_LOG 8 /* 256 byte inodes */
> > > +#define XFS_DINODE_DFL_CRC_LOG 9 /* 512 byte inodes for
> > > CRCs */
> > > +#define XFS_MIN_DATA_BLOCKS 100
> > > +#define XFS_MIN_INODE_PERBLOCK 2 /* min inodes per block
> > > */
> > > +#define XFS_DFL_IMAXIMUM_PCT 25 /* max % of space for
> > > inodes */
> > > +#define XFS_IFLAG_ALIGN 1 /* -i align defaults on
> > > */
> > > +#define XFS_MIN_REC_DIRSIZE 12 /* 4096 byte dirblocks
> > > (V2) */
> > > +#define XFS_DFL_DIR_VERSION 2 /* default directory
> > > version */
> > > +#define XFS_DFL_LOG_SIZE 1000 /* default log size,
> > > blocks */
> > > +#define XFS_DFL_LOG_FACTOR 5 /* default log size,
> > > factor */
> > > + /* with max trans reservation */
> > > +#define XFS_MAX_INODE_SIG_BITS 32 /* most significant
> > > bits in an
> > > + * inode number that we'll
> > > + * accept w/o warnings
> > > + */
> > > +
> > > +#define XFS_AG_BYTES(bblog) ((long long)BBSIZE << (bblog))
> > > +#define XFS_AG_MIN_BYTES ((XFS_AG_BYTES(15))) /* 16 MB */
> > > +#define XFS_AG_MIN_BLOCKS(blog) ((XFS_AG_BYTES(15)) >> (blog))
> > > +#define XFS_AG_MAX_BLOCKS(blog) ((XFS_AG_BYTES(31) - 1) >> (blog))
> > > +
> > > +#define XFS_MAX_AGNUMBER ((xfs_agnumber_t)(NULLAGNUMBER - 1))
> > > +
> > > +/*
> > > + * These values define what we consider a "multi-disk" filesystem. That
> > > is, a
> > > + * filesystem that is likely to be made up of multiple devices, and hence
> > > have
> > > + * some level of parallelism avoid to it at the IO level.
> > > + */
> > > +#define XFS_MULTIDISK_AGLOG 5 /* 32 AGs */
> > > +#define XFS_NOMULTIDISK_AGLOG 2 /* 4 AGs */
> > > +#define XFS_MULTIDISK_AGCOUNT (1 << XFS_MULTIDISK_AGLOG)
> > > +
> > > +
> > > +/* xfs_mkfs.c */
> > > +extern int isdigits (char *str);
> > > +extern long long cvtnum (unsigned int blocksize,
> > > + unsigned int sectorsize, char *s);
> > > +
> > > +/* proto.c */
> > > +extern char *setup_proto (char *fname);
> > > +extern void parse_proto (xfs_mount_t *mp, struct fsxattr *fsx, char
> > > **pp);
> > > +extern void res_failed (int err);
> > > +
> > > +/* maxtrres.c */
> > > +extern int max_trans_res (int crcs_enabled, int dirversion,
> > > + int sectorlog, int blocklog, int inodelog, int dirblocklog,
> > > + int logversion, int log_sunit);
> > > +
> > > +#endif /* __XFS_MKFS_H__ */
> > > diff --git a/mkfs/Makefile b/mkfs/Makefile
> > > index fd1f615..82326e0 100644
> > > --- a/mkfs/Makefile
> > > +++ b/mkfs/Makefile
> > > @@ -8,7 +8,7 @@ include $(TOPDIR)/include/builddefs
> > > LTCOMMAND = mkfs.xfs
> > > FSTYP = fstyp
> > >
> > > -HFILES = xfs_mkfs.h
> > > +HFILES =
> > > CFILES = maxtrres.c proto.c xfs_mkfs.c
> > >
> > > ifeq ($(ENABLE_BLKID),yes)
> > > diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> > > index 83f7749..d0de90d 100644
> > > --- a/mkfs/xfs_mkfs.c
> > > +++ b/mkfs/xfs_mkfs.c
> > > @@ -24,7 +24,7 @@
> > > #include <disk/fstyp.h>
> > > #include <disk/volume.h>
> > > #endif
> > > -#include "xfs_mkfs.h"
> > > +#include <xfs/xfs_mkfs.h>
> > >
> > > /*
> > > * Device topology information.
> > > @@ -688,43 +688,45 @@ calc_default_ag_geometry(
> > > }
> > >
> > > /*
> > > - * For the remainder we choose an AG size based on the
> > > - * number of data blocks available, trying to keep the
> > > - * number of AGs relatively small (especially compared
> > > - * to the original algorithm). AG count is calculated
> > > - * based on the preferred AG size, not vice-versa - the
> > > - * count can be increased by growfs, so prefer to use
> > > - * smaller counts at mkfs time.
> > > - *
> > > - * For a single underlying storage device between 128MB
> > > - * and 4TB in size, just use 4 AGs, otherwise scale up
> > > - * smoothly between min/max AG sizes.
> > > + * For a single underlying storage device between 128MB and 4TB in size
> > > + * just use 4 AGs and scale up smoothly between min/max AG sizes.
> > > */
> > > -
> > > - if (!multidisk && dblocks >= MEGABYTES(128, blocklog)) {
> > > + if (!multidisk) {
> > > if (dblocks >= TERABYTES(4, blocklog)) {
> > > blocks = XFS_AG_MAX_BLOCKS(blocklog);
> > > goto done;
> > > + } else if (dblocks >= MEGABYTES(128, blocklog)) {
> > > + shift = XFS_NOMULTIDISK_AGLOG;
> > > + goto calc_blocks;
> > > }
> > > - shift = 2;
> > > - } else if (dblocks > GIGABYTES(512, blocklog))
> > > - shift = 5;
> > > - else if (dblocks > GIGABYTES(8, blocklog))
> > > - shift = 4;
> > > - else if (dblocks >= MEGABYTES(128, blocklog))
> > > - shift = 3;
> > > - else if (dblocks >= MEGABYTES(64, blocklog))
> > > - shift = 2;
> > > - else if (dblocks >= MEGABYTES(32, blocklog))
> > > - shift = 1;
> > > - else
> > > - shift = 0;
> > > + }
> > > +
> > > + /*
> > > + * For the multidisk configs we choose an AG count based on the number
> > > + * of data blocks available, trying to keep the number of AGs higher
> > > + * than the single disk configurations. This makes the assumption that
> > > + * larger filesystems have more parallelism available to them.
> > > + */
> > > + shift = XFS_MULTIDISK_AGLOG;
> > > + if (dblocks < GIGABYTES(512, blocklog))
> > > + shift--;
> > > + if (dblocks < GIGABYTES(8, blocklog))
> > > + shift--;
> > > + if (dblocks < MEGABYTES(128, blocklog))
> > > + shift--;
> > > + if (dblocks < MEGABYTES(64, blocklog))
> > > + shift--;
> > > + if (dblocks < MEGABYTES(32, blocklog))
> > > + shift--;
> > > +
> >
> > Intended change in behavior of the defaults for fs' that match these
> > size thresholds (the 512g and 8g ones anyways)? For example, in the old
> > code a 512GB fs gets a shift of 4 while the same fs gets shift = 5 in
> > the new code.
> >
> > Brian
>
> When I look on the code, where did you got the 4 vs 5? In the old code, for
> 512GB and bigger is assigned shift=5 directly. In the new one, shift is set
> to XFS_MULTIDISK_AGLOG which is 5, and then, if the disk is smaller than
> 512GB, it decrements the value. But unless I'm missing something, the
> multidisk configuration is not changing anything, there is just a different
> syntax.
>
I was referring to the multidisk case. The old code looks like this:
if (!multidisk && dblocks >= MEGABYTES(128, blocklog)) {
...
} else if (dblocks > GIGABYTES(512, blocklog))
shift = 5;
else if (dblocks > GIGABYTES(8, blocklog))
shift = 4;
... which means if multidisk && dblocks == 512GB, then shift is set to
4. With the new code, we set XFS_MULTIDISK_AGLOG as you noted and then
execute:
if (dblocks < GIGABYTES(512, blocklog))
shift--;
...
... which will not decrement shift if dblocks == 512GB (i.e., shift is
5).
If you're still not convinced, create an exact sized 512GB file, mkfs it
(with the su/sw options set for multidisk) with and without this change
and observe agcount. :)
Brian
> Cheers,
> Jan
> --
> Jan Tulak
> jtulak@xxxxxxxxxx
|