.TH xfs 5
.SH NAME
xfs \- layout of the XFS filesystem
.SH DESCRIPTION
An XFS filesystem can reside on a regular disk partition or on a
logical volume.
An XFS filesystem has up to three parts:
a data section, a log section, and a realtime section.
Using the default
.IR mkfs.xfs (8)
options, the realtime section is absent, and
the log area is contained within the data section.
The log section can be either separate from the data section
or contained within it.
The filesystem sections are divided into a certain number of
.IR blocks ,
whose size is specified at
.IR mkfs.xfs
time with the
.B \-b
option.
.PP
The data section contains all the filesystem metadata
(inodes, directories, indirect blocks)
as well as the user file data for ordinary (non-realtime) files
and the log area if the log is
.I internal
to the data section.
The data section is divided into a number of
\f2allocation groups\f1.
The number and size of the allocation groups are chosen by
.I mkfs.xfs
so that there is normally a small number of equal-sized groups.
The number of allocation groups controls the amount of parallelism
available in file and block allocation.
It should be increased from
the default if there is sufficient memory and a lot of allocation
activity.
The number of allocation groups should not be set very high,
since this can cause large amounts of CPU time to be used by
the filesystem, especially when the filesystem is nearly full.
More allocation groups are added (of the original size) when
.IR xfs_growfs (8)
is run.
.PP
The log section (or area, if it is internal to the data section)
is used to store changes to filesystem metadata while the
filesystem is running until those changes are made to the data
section.
It is written sequentially during normal operation and read only
during mount.
When mounting a filesystem after a crash, the log
is read to complete operations that were
in progress at the time of the crash.
.PP
The realtime section is used to store the data of realtime files.
These files had an attribute bit set through
.IR ioctl (2)
after file creation, before any data was written to the file.
The realtime section is divided into a number of
.I extents
of fixed size (specified at
.I mkfs.xfs
time).
Each file in the realtime section has an extent size that
is a multiple of the realtime section extent size.
.PP
Each allocation group contains several data structures.
The first sector contains the superblock.
For allocation groups after the first,
the superblock is just a copy and is not updated after
.IR mkfs.xfs .
The next three sectors contain information for block and inode
allocation within the allocation group.
Also contained within each allocation group are data structures
to locate free blocks and inodes;
these are located through the header structures.
.PP
Each XFS filesystem is labeled with a Universal Unique
Identifier (UUID).
The UUID is stored in every allocation group header and
is used to help distinguish one XFS filesystem from another,
therefore you should avoid using
.I dd
or other block-by-block copying programs to copy XFS filesystems.
If two XFS filesystems on the same machine have the same UUID,
.IR xfsdump (8)
may become confused when doing incremental and resumed dumps.
.I xfsdump
and
.I xfsrestore
are recommended for making copies of XFS filesystems.
.SH OPERATIONS
Some functionality specific to the XFS filesystem is accessible to
applications through the Linux
.IR ioctl (2)
interface.
These operations can be divided into two sections - operations
that operate on individual files, and operations that operate on
the filesystem itself.
Care should be taken when issuing these XFS
.I ioctl
calls to ensure the target file descriptor does indeed represent a
file from an XFS filesystem.
The
.IR fstatfs (2)
system call can be used to determine whether or not an arbitrary
file descriptor belongs to an XFS filesystem.
.SS FILE OPERATIONS
In order to effect an operation on an individual file, the descriptor
argument passed to
.I ioctl
identifies the file being operated on.
The third argument described below refers to the third argument of the
.I ioctl
system call (which is traditionally a
.B "char *"
or
.BR "void *" ).
All of the data structures and macros mentioned below are defined in the
<xfs/xfs_fs.h> header file.
.PP
.nf
.B XFS_IOC_FREESP
.B XFS_IOC_FREESP64
.B XFS_IOC_ALLOCSP
.fi
.PD 0
.TP
.B XFS_IOC_ALLOCSP64
Alter storage space associated with a section of the ordinary
file fildes. The section is specified by a variable of type
.BR xfs_flock64_t ,
pointed to by the third argument.
The data type
.B xfs_flock64_t
contains the following members:
.B l_whence
is 0, 1, or 2 to indicate that the relative offset
.B l_start
will be measured from the start of the file, the current position, or
the end of the file, respectively.
.B l_start
is the offset from the position specified in
.BR l_whence .
.B l_len
is the size of the section. An
.B l_len
value of zero frees up to the end of the file; in this case, the end of
file (i.e., file size) is set to the beginning of the section freed.
Any data previously written into this section is no longer accessible.
If the section specified is beyond the current end of file, the file
is grown and filled with zeroes.
The
.B l_len
field is currently ignored,
and should be set to zero.
.B XFS_IOC_FREESP
and
.BR XFS_IOC_FREESP64
are identical, as are the
.B XFS_IOC_ALLOCSP
and
.BR XFS_IOC_ALLOCSP64
operations.
.TP
.B XFS_IOC_FSSETDM
Set the di_dmevmask and di_dmstate fields in an XFS on-disk inode.
The only legitimate values for these fields are those
previously returned in the
.B bs_dmevmask
and
.B bs_dmstate
fields of the bulkstat structure.
The data referred to by the third argument is a
.BR "struct fsdmidata" .
This structure's members are
.B fsd_dmevmask
and
.BR fsd_dmstate .
The di_dmevmask
field is set to the value in
.BR fsd_dmevmask .
The di_dmstate field is set to the value in
.BR fsd_dmstate .
This command is restricted to root or to processes with device
management capabilities.
Its sole purpose is to allow backup and restore programs to restore the
aforementioned critical on-disk inode fields.
.TP
.B XFS_IOC_DIOINFO
Get information required to perform direct I/O on the specified file
descriptor.
Direct I/O is performed directly to and from a user's data buffer.
Since the kernel's buffer cache is no longer between the two, the
user's data buffer must conform to the same type of constraints as
required for accessing a raw disk partition.
The third argument points to a variable of type
.BR "struct dioattr" ,
which contains the following members:
.B d_mem
is the memory alignment requirement of the user's data buffer.
.B d_miniosz
specifies block size, minimum I/O request size, and I/O alignment.
The size of all I/O requests must be a multiple of this amount and
the value of the seek pointer at the time of the I/O request must
also be an integer multiple of this amount.
.B d_maxiosz
is the maximum I/O request size which can be performed on the file
descriptor.
If an I/O request does not meet these constraints, the
.IR read (2)
or
.IR write (2)
will fail with EINVAL.
All I/O requests are kept consistent with any data brought into
the cache with an access through a non-direct I/O file descriptor.
.TP
.B XFS_IOC_FSGETXATTR
Get extended attributes associated with files in XFS file systems.
The third argument points to a variable of type
.BR "struct fsxattr" ,
whose fields include:
.B fsx_xflags
(extended flag bits),
.B fsx_extsize
(nominal extent size in file system blocks),
.B fsx_nextents
(number of data extents in the file),
.B fsx_uuid
(file unique id).
Currently the only meaningful bits for the
.B fsx_xflags
field are bit 0 (value 1), which if set means the file is a realtime file,
and bit 1 (value 2), which if set means the file has preallocated space.
A
.B fsx_extsize
value returned indicates that a preferred extent size was previously
set on the file, a
.B fsx_extsize
of zero indicates that the defaults for that filesystem will be used.
.TP
.B XFS_IOC_FSGETXATTRA
Identical to
.B XFS_IOC_FSGETXATTR
except that the
.B fsx_nextents
field contains the number of attribute extents in the file.
.TP
.B XFS_IOC_FSSETXATTR
Set extended attributes associated with files in XFS file systems.
The third argument points to a variable of type
.BR "struct fsxattr" ,
but only the following fields are used in this call:
.B fsx_xflags
and
.BR fsx_extsize .
The
.B fsx_xflags
realtime file bit, and the file's extent size, may be changed only
when the file is empty.
.TP
.B XFS_IOC_GETBMAP
Get the block map for a segment of a file in an XFS file system.
The third argument points to an arry of variables of type
.BR "struct getbmap" .
All sizes and offsets in the structure are in units of 512 bytes.
The structure fields include:
.B bmv_offset
(file offset of segment),
.B bmv_block
(starting block of segment),
.B bmv_length
(length of segment),
.B bmv_count
(number of array entries, including the first), and
.B bmv_entries
(number of entries filled in).
The first structure in the array is a header, and the remaining
structures in the array contain block map information on return.
The header controls iterative calls to the
.B XFS_IOC_GETBMAP
command.
The caller fills in the
.B bmv_offset
and
.B bmv_length
fields of the header to indicate the area of interest in the file,
and fills in the
.B bmv_count
field to indicate the length of the array.
If the
.B bmv_length
value is set to -1 then the length of the interesting area is the rest
of the file.
On return from a call, the header is updated so that the command can be
reused to obtain more information, without re-initializing the structures.
Also on return, the
.B bmv_entries
field of the header is set to the number of array entries actually filled in.
The non-header structures will be filled in with
.BR bmv_offset ,
.BR bmv_block ,
and
.BR bmv_length .
If a region of the file has no blocks (is a hole in the file) then the
.B bmv_block
field is set to -1.
.TP
.B XFS_IOC_GETBMAPA
Identical to
.B XFS_IOC_GETBMAP
except that information about the attribute fork of the file is returned.
.PP
.nf
.B XFS_IOC_RESVSP
.fi
.TP
.B XFS_IOC_RESVSP64
This command is used to allocate space to a file.
A range of bytes is specified using a pointer to a variable of type
.B xfs_flock64_t
in the third argument.
The blocks are allocated, but not zeroed, and the file size does not change.
If the XFS filesystem is configured to flag unwritten file extents,
performance will be negatively affected when writing to preallocated space,
since extra filesystem transactions are required to convert extent flags on
the range of the file written.
If
.IR xfs_info (8)
reports unwritten=1, then the filesystem was made to flag unwritten extents.
.PP
.nf
.B XFS_IOC_UNRESVSP
.fi
.TP
.B XFS_IOC_UNRESVSP64
This command is used to free space from a file.
A range of bytes is specified using a pointer to a variable of type
.B xfs_flock64_t
in the third argument.
Partial filesystem blocks are zeroed, and whole filesystem blocks are
removed from the file. The file size does not change.
.\" .TP
.\" .B XFS_IOC_GETBIOSIZE
.\" This command gets information about the preferred buffered I/O
.\" size used by the system when performing buffered I/O (e.g.
.\" standard Unix non-direct I/O) to and from the file.
.\" The information is passed back in a structure of type
.\" .B "struct biosize"
.\" pointed to by the third argument.
.\" biosize lengths are expressed in log base 2.
.\" That is if the value is 14, then the true size is 2^14 (2 raised to
.\" the 14th power).
.\" The
.\" .B biosz_read
.\" field will contain the current value used by the system when reading
.\" from the file.
.\" Except at the end-of-file, the system will read from the file in
.\" multiples of this length.
.\" The
.\" .B biosz_write
.\" field will contain the current value used by the system when writing
.\" to the file.
.\" Except at the end-of-file, the system will write to the file in
.\" multiples of this length.
.\" The
.\" .B dfl_biosz_read
.\" and
.\" .B dfl_biosz_write
.\" will be set to the system default values for the opened file.
.\" The
.\" .B biosz_flags
.\" field will be set to 1 if the current read or write value has been
.\" explicitly set.
.\"
.\" .TP
.\" .B XFS_IOC_SETBIOSIZE
.\" This command sets information about the preferred buffered I/O size
.\" used by the system when performing buffered I/O (e.g. standard Unix
.\" non-direct I/O) to and from the file.
.\" The information is passed in a structure of type
.\" .B "struct biosize"
.\" pointed to by the third argument.
.\" Using smaller preferred I/O sizes can result in performance
.\" improvements if the file is typically accessed using small
.\" synchronous I/Os or if only a small amount of the file is accessed
.\" using small random I/Os, resulting in little or no use of the
.\" additional data read in near the random I/Os.
.\"
.\" To explicitly set the preferred I/O sizes, the
.\" .B biosz_flags
.\" field should be set to zero and the
.\" .B biosz_read
.\" and
.\" .B biosz_write
.\" fields should be set to the log base 2 of the desired read and
.\" write lengths, respectively (e.g. 13 for 8K bytes, 14 for 16K
.\" bytes, 15 for 32K bytes, etc.). Valid values are 13-16
.\" inclusive for machines with a 4K byte pagesize and 14-16 for
.\" machines with a 16K byte pagesize. The specified read and
.\" write values must also result in lengths that are greater than
.\" or equal to the filesystem block size.
.\" The
.\" .B dfl_biosz_read
.\" and
.\" .B dfl_biosz_write
.\" fields are ignored.
.\"
.\" If biosizes have already been explicitly set due to a prior use
.\" of
.\" .BR XFS_IOC_SETBIOSIZE ,
.\" and the requested sizes are larger than the
.\" existing sizes, the
.\" .I ioctl
.\" call will return successfully and the
.\" system will use the smaller of the two sizes. However, if
.\" .B biosz_flags
.\" is set to 1, the system will use the new values
.\" regardless of whether the new sizes are larger or smaller than the old.
.\"
.\" To reset the biosize values to the defaults for the filesystem
.\" that the file resides in, the
.\" .B biosz_flags
.\" field should be set to 2.
.\" The remainder of the fields will be ignored in that case.
.\"
.\" Changes made by
.\" .B XFS_IOC_SETBIOSIZE
.\" are transient.
.\" The sizes are reset to the default values once the reference count on the
.\" file drops to zero (e.g. all open file descriptors to that file
.\" have been closed).
.\" See
.\" .I mount(8)
.\" for details on how to set the
.\" default biosize values for a filesystem.
.PP
.nf
.B XFS_IOC_PATH_TO_HANDLE
.B XFS_IOC_PATH_TO_FSHANDLE
.B XFS_IOC_FD_TO_HANDLE
.B XFS_IOC_OPEN_BY_HANDLE
.B XFS_IOC_READLINK_BY_HANDLE
.B XFS_IOC_ATTR_LIST_BY_HANDLE
.B XFS_IOC_ATTR_MULTI_BY_HANDLE
.fi
.PD 0
.TP
.B XFS_IOC_FSSETDM_BY_HANDLE
These are all interfaces that are used to implement various
.I libhandle
functions (see
.IR fd_to_handle (3)).
They are all subject to change and should not be called directly
by applications.
.SS FILESYSTEM OPERATIONS
In order to effect one of the following operations, the file descriptor
argument passed to
.I ioctl
can be any open file in the XFS filesystem in question.
.PP
.TP
.B XFS_IOC_FSINUMBERS
This interface is used to extract a list of valid inode numbers from an
XFS filesystem.
It is intended to be called iteratively, to obtain the entire set of inodes.
The information is passed in and out via a structure of type
.B xfs_fsop_bulkreq_t
pointed to by the third argument.
.B lastip
is a pointer to a variable containing the last inode number returned,
initially it should be zero.
.B icount
is the size of the array of structures specified by
.BR ubuffer .
.B ubuffer
is the address of an array of structures, of type
.BR xfs_inogrp_t .
This structure has the following elements:
.B xi_startino
(starting inode number),
.B xi_alloccount
(count of bits set in xi_allocmask), and
.B xi_allocmask
(mask of allocated inodes in this group).
The bitmask is 64 bits long, and the least significant bit corresponds to inode
.B xi_startino.
Each bit is set if the corresponding inode is in use.
.B ocount
is a pointer to a count of returned values, filled in by the call.
An output
.B ocount
value of zero means that the inode table has been exhausted.
.TP
.B XFS_IOC_FSBULKSTAT
This interface is used to extract inode information (stat
information) "in bulk" from a filesystem. It is intended to
be called iteratively, to obtain information about the entire
set of inodes in a filesystem.
The information is passed in and out via a structure of type
.B xfs_fsop_bulkreq_t
pointed to by the third argument.
.B lastip
is a pointer to a variable containing the last inode number returned,
initially it should be zero.
.B icount
indicates the size of the array of structures specified by
.B ubuffer.
.B ubuffer
is the address of an array of structures of type
.BR xfs_bstat_t .
Many of the elements in the structure are the same as for the stat
structure.
The structure has the following elements:
.B bs_ino
(inode number),
.B bs_mode
(type and mode),
.B bs_nlink
(number of links),
.B bs_uid
(user id),
.B bs_gid
(group id),
.B bs_rdev
(device value),
.B bs_blksize
(block size of the filesystem),
.B bs_size
(file size in bytes),
.B bs_atime
(access time),
.B bs_mtime
(modify time),
.B bs_ctime
(inode change time),
.B bs_blocks
(number of blocks used by the file),
.B bs_xflags
(extended flags),
.B bs_extsize
(extent size),
.B bs_extents
(number of extents),
.B bs_gen
(generation count),
.B bs_projid
(project id),
.B bs_dmevmask
(DMIG event mask),
.B bs_dmstate
(DMIG state information), and
.B bs_aextents
(attribute extent count).
.B ocount
is a pointer to a count of returned values, filled in by the call.
An output
.B ocount
value of zero means that the inode table has been exhausted.
.TP
.B XFS_IOC_FSBULKSTAT_SINGLE
This interface is a variant of the
.B XFS_IOC_FSBULKSTAT
interface, used to obtain information about a single inode.
for an open file in the filesystem of interest.
The same structure is used to pass information in and out of
the kernel, except no output count parameter is used (should
be initialized to zero).
An error is returned if the inode number is invalid.
.PP
.nf
.B XFS_IOC_THAW
.B XFS_IOC_FREEZE
.B XFS_IOC_GET_RESBLKS
.B XFS_IOC_SET_RESBLKS
.B XFS_IOC_FSGROWFSDATA
.B XFS_IOC_FSGROWFSLOG
.B XFS_IOC_FSGROWFSRT
.fi
.TP
.B XFS_IOC_FSCOUNTS
These interfaces are used to implement various filesystem internal
operations on XFS filesystems.
For
.B XFS_IOC_FSGEOMETRY
(get filesystem mkfs time information), the output structure is of type
.BR xfs_fsop_geom_t .
For
.B XFS_FS_COUNTS
(get filesystem dynamic global information), the output structure is of type
.BR xfs_fsop_counts_t .
The remainder of these operations will not be described further
as they are not of general use to applications.
.SH MOUNT OPTIONS
Refer to the
.IR mount (8)
manual entry for descriptions of the individual XFS mount options.
.SH SEE ALSO
ioctl(2),
fstatfs(2),
mount(8),
mkfs.xfs(8),
xfs_info(8),
xfs_admin(8),
xfsdump(8),
xfsrestore(8).