Here's v3 of sparse inode chunk support for XFS. The primary change in
this version is how inodes are aligned when sparse inode support is
enabled. Inode chunks are currently aligned to cluster size
to support single I/O per inode cluster. This means that the minimum
block range between two non-adjacent inode chunks is the cluster size.
Cluster size is also the granularity of sparse allocation. Therefore, it
is possible to allocate a cluster size chunk that cannot be converted to
an inode record due to overlap on both sides (ambiguous metadata). The
only recourse in this situation is to undo the allocation and likely
return ENOSPC. Given the added complexity of that and the already
complicated inode allocation path, an approach that avoids this
potential condition in the first place is preferred.

To address this situation, inode alignment is increased from cluster
size to chunk size (by mkfs) when sparse inode chunks are enabled. This
guarantees that the minimum block range between two non-adjacent inode
chunks is at least big enough for one full chunk. This greatly
simplifies sparse inode record management. Allocations occur at cluster
size granularity and are shifted into inode records that align to chunk
size. In other words, for any particular sparse allocation, a well-known
record startino is determined by aligning the agbno of the allocation to
the chunk size.
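
As a rough illustration of the startino calculation described above, here
is a minimal userspace sketch. It is not code from this series: the type
names and helpers (agblock_t, agino_t, chunk_align_agbno, record_startino,
record_offset) are hypothetical, "aligning" is assumed to mean rounding
down, and AG-relative inode numbers are assumed to be block number times
inodes per block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's AG block and AG inode types. */
typedef uint32_t agblock_t;
typedef uint32_t agino_t;

/*
 * Round an AG block number down to the chunk-aligned block that starts
 * the range containing it. chunk_blocks is the inode alignment in
 * blocks (assumed to be one full chunk when sparse inodes are enabled).
 */
static inline agblock_t chunk_align_agbno(agblock_t agbno,
					  agblock_t chunk_blocks)
{
	assert(chunk_blocks > 0);
	return agbno - (agbno % chunk_blocks);
}

/*
 * AG-relative inode number of the first inode of the record that
 * covers a sparse allocation at agbno. Assumption: inode numbers are
 * simply block number * inodes per block.
 */
static inline agino_t record_startino(agblock_t agbno,
				      agblock_t chunk_blocks,
				      uint32_t inodes_per_block)
{
	return chunk_align_agbno(agbno, chunk_blocks) * inodes_per_block;
}

/* Inode offset of the sparse allocation within its record. */
static inline uint32_t record_offset(agblock_t agbno,
				     agblock_t chunk_blocks,
				     uint32_t inodes_per_block)
{
	return (agbno % chunk_blocks) * inodes_per_block;
}
```

With an assumed geometry of 8 inodes per block and 8 blocks per chunk,
cluster-sized allocations at agbno 18 and agbno 22 both map to the record
at startino 128, at inode offsets 16 and 48 respectively. An allocation at
agbno 24 starts a different record at startino 192.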

The increased inode chunk alignment does limit the ability to allocate
full inode chunks on a significantly populated fs, but what is lost in
that regard is regained by the ability to allocate sparse records in any
AG that can satisfy the minimum free space requirement[1].

Other changes in this version include marking the feature as
experimental, a block allocation agbno range limit to avoid invalid
inode records at AG boundaries, DEBUG mode allocation logic to improve
test coverage, etc. This series survives xfstests regression runs on
basic v5 configurations as well as some longer term debug mode fsstress
testing. Thoughts, reviews, flames appreciated!

[1] - This can also be mitigated by future work to consider allocation
of inode chunks in batches rather than one at a time.

- Rebase to latest for-next (bulkstat rework, data structure shuffling,
- Fix issparse helper logic.
- Update inode alignment model w/ spinodes enabled. All inode records
are chunk size aligned, sparse allocations cluster size aligned (both
enforced on mount).
- Reworked sparse inode record merge logic to coincide w/ new alignment
- Mark feature as experimental (warn on mount).
- Include and use block allocation agbno range limit to prevent
allocation of invalid inode records.
- Add some DEBUG bits to improve sparse alloc. test coverage.
- Use a manually set feature bit instead of dynamic based on the
existence of sparse inode chunks.
- Add sb/mp fields for sparse alloc. granularity (use instead of cluster
- Undo xfs_inobt_insert() loop removal to avoid breakage of larger page
- Rename sparse record overlap helper and do XFS_LOOKUP_LE search.
- Use byte of pad space in inobt record for inode count field.
- Convert bitmap mgmt to use generic bitmap code.
- Rename XFS_INODES_PER_SPCHUNK to XFS_INODES_PER_HOLEMASK_BIT.
- Add fs geometry bit for sparse inodes.
- Rebase to latest for-next (bulkstat refactor).

Brian Foster (18):
  xfs: add sparse inode chunk alignment superblock field
  xfs: use sparse chunk alignment for min. inode allocation requirement
  xfs: sparse inode chunks feature helpers and mount requirements
  xfs: introduce inode record hole mask for sparse inode chunks
  xfs: create macros/helpers for dealing with sparse inode chunks
  xfs: pass inode count through ordered icreate log item
  xfs: handle sparse inode chunks in icreate log recovery
  xfs: helpers to convert holemask to/from generic bitmap
  xfs: support min/max agbno args in block allocator
  xfs: allocate sparse inode chunks on full chunk allocation failure
  xfs: randomly do sparse inode allocations in DEBUG mode
  xfs: filter out sparse regions from individual inode allocation
  xfs: update free inode record logic to support sparse inode records
  xfs: only free allocated regions of inode chunks
  xfs: skip unallocated regions of inode chunks in xfs_ifree_cluster()
  xfs: use actual inode count for sparse records in bulkstat/inumbers
  xfs: add fs geometry bit for sparse inode chunks
  xfs: enable sparse inode chunks for v5 superblocks

 fs/xfs/libxfs/xfs_alloc.c        |  42 ++-
 fs/xfs/libxfs/xfs_alloc.h        |   2 +
 fs/xfs/libxfs/xfs_format.h       |  33 +-
 fs/xfs/libxfs/xfs_fs.h           |   1 +
 fs/xfs/libxfs/xfs_ialloc.c       | 651 ++++++++++++++++++++++++++++++++++++---
 fs/xfs/libxfs/xfs_ialloc.h       |  17 +-
 fs/xfs/libxfs/xfs_ialloc_btree.c |   4 +-
 fs/xfs/libxfs/xfs_sb.c           |  31 +-
 fs/xfs/xfs_fsops.c               |   4 +-
 fs/xfs/xfs_inode.c               |  28 +-
 fs/xfs/xfs_itable.c              |  14 +-
 fs/xfs/xfs_log_recover.c         |  23 +-
 fs/xfs/xfs_mount.c               |  16 +
 fs/xfs/xfs_mount.h               |   2 +
 fs/xfs/xfs_trace.h               |  47 +++
 15 files changed, 836 insertions(+), 79 deletions(-)