On Thu, Nov 10, 2011 at 02:35:18PM -0600, Alex Elder wrote:
> From: Kevan Rehm <kfr@xxxxxxx>
> This patch adds a new "resvsp" command to xfs_db. The command
> provides access to the xfsctl(3) XFS_IOC_RESVSP64 operation, which
> allocates space in an ordinary file. Blocks are allocated but not
> zeroed, and the file size does not change.
The commit message fails to answer the most important question of
all: what is the use case that requires adding offline preallocation
of extents to arbitrary inodes?
Conceptually, I don't think high level functionality like extent
allocation belongs in xfs_db:
- extent allocation is effectively an open-ended, multi-step
modification that can touch large amounts of metadata
across all AGs in the filesystem,
- userspace extent allocation code is rarely used and as
such nowhere near as well tested as the kernel code.
- userspace extent allocation does not support real time
device allocation at all:
#define xfs_bmap_rtalloc(a) (ENOSYS)
- modifications made by xfs_db are not transactional and
therefore not recoverable if the system dies during such
an operation. Such a failure means you have to run a
xfs_reapir pass on your filesystem to clean up the mess
that was left behind....
- we already have tools to do space reservation online
(i.e. xfs_io -c "resvsp off len" testfile) in a safe
manner that also supports realtime device allocation. Why
duplicate this functionality in a different tool?
> +static void
> + dbprintf(_(
> +"The resvsp function is essentially an implementation of the xfsctl(3)\n"
> +"file operation XFS_IOC_RESVSP64 which allocates space in an ordinary\n"
Probably no need to mention the ioctl here - preallocation is a well
enough known function now that simply calling it "an extent
preallocation operation" is sufficient to explain what it is for.
> Blocks are allocated but not zeroed, and the file size does not\n"
> +"change. The -o option is the starting offset for the allocation (default
> +"and the -l option gives the length of the allocation in bytes (default to\n"
> +"end of file). Both offset and length values will be rounded to a
> +"block boundary. 'inode#' is the inode number of the file in which to\n"
> +"perform the allocation. If none is specified, the current inode is used.\n"
> +"If the -w option is specified, the allocated extents will not be flagged
> +"unwritten. Use this option with care, as someone with read permission\n"
> +"to the file can then read whatever is written in those blocks.\n"
I'm on record as being totally against making it easy for -anyone-
to preallocate _written_ extents via -any method- because it's just
a whopping great big security hole. "Use care" is not a good enough
protection against the potential of exposing large amount of stale
data to any user that can read the file, even though only root can
do that preallocation.