[XFS updates] XFS development tree branch, for-linus, updated. v3.9-rc1-46-gcab09a8
xfs at oss.sgi.com
Wed May 1 20:14:39 CDT 2013
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".
The branch, for-linus has been updated
discards e001873853d87674dd5b3cfa2851885023616695 (commit)
discards 3325beed46d8d14d873e94d89ea57ee900dec942 (commit)
discards 83cdadd8b0559c93728d065d23ca3485fa567e54 (commit)
cab09a8 xfs: fix da node magic number mismatches
946217b xfs: Remote attr validation fixes and optimisations
123887e xfs: Teach dquot recovery about CONFIG_XFS_QUOTA
e721f50 xfs: implement extended feature masks
04a1e6c xfs: add CRC checks to the superblock
61fe135 xfs: buffer type overruns blf_flags field
d75afeb xfs: add buffer types to directory and attribute buffers
d2e448d xfs: add CRC protection to remote attributes
95920cd xfs: split remote attribute code out
517c222 xfs: add CRCs to attr leaf blocks
f5ea110 xfs: add CRCs to dir2/da node blocks
6b2647a xfs: shortform directory offsets change for dir3 format
24df33b xfs: add CRC checking to dir2 leaf blocks
33363fe xfs: add CRC checking to dir2 data blocks
cbc8adf xfs: add CRC checking to dir2 free blocks
f5f3d9b xfs: add CRC checks to block format directory blocks
f948dd7 xfs: add CRC checks to remote symlinks
19de735 xfs: split out symlink code into it's own file.
93848a9 xfs: add version 3 inode format with CRCs
3fe58f3 xfs: add CRC checks for quota blocks
983d09f xfs: add CRC checks to the AGI
77c95bb xfs: add CRC checks to the AGFL
4e0e604 xfs: add CRC checks to the AGF
ee1a47a xfs: add support for large btree blocks
a205064 xfs: increase hexdump output in xfs_corruption_error
7fe3258 xfs: Update xfs_log_commit_cil() comments
d4fd0e9 xfs: Remove the obsolete XLOG_CIL_HARD_SPACE_LIMIT() macros
666d644 xfs: don't free EFIs before the EFDs are committed
3d6e036 xfs: Add ratelimited printk for different alert levels
ff9a28f xfs: Fix WARN_ON(delalloc) in xfs_vm_releasepage()
19cb7e3 xfs: xfs_iomap_prealloc_size() tracepoint
76a4202 xfs: add quota-driven speculative preallocation throttling
b136645 xfs: xfs_dquot prealloc throttling watermarks and low free space
4b6eae2e xfs: pass xfs_dquot to xfs_qm_adjust_dqlimits() instead of xfs_disk_dquot_t
c9bdbdc xfs: push rounddown_pow_of_two() to after prealloc throttle
3c58b5f xfs: reorganize xfs_iomap_prealloc_size to remove indentation
56cea2d xfs: take inode version into account in XFS_LITINO
c163f9a xfs: ensure we capture IO errors correctly
d8ddfe8 xfs: Remove obsoleted m_inode_shrink from xfs_mount structure
9e5987a xfs: rearrange some code in xfs_bmap for better locality
ecb3403 xfs: rename random32() to prandom_u32()
d5929de xfs: don't verify buffers after IO errors
e8108ce xfs: fix xfs_iomap_eof_prealloc_initial_size type
e114b5f xfs: increase prealloc size to double that of the previous extent
e78c420 xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
from e001873853d87674dd5b3cfa2851885023616695 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit cab09a81fbefcb21db5213a84461d421946f6eb8
Author: Dave Chinner <dchinner at redhat.com>
Date: Tue Apr 30 21:39:36 2013 +1000
xfs: fix da node magic number mismatches
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 946217ba28637d7a08e03e93ef40586ce621f557
Author: Dave Chinner <dchinner at redhat.com>
Date: Tue Apr 30 21:39:35 2013 +1000
xfs: Remote attr validation fixes and optimisations
- optimise the calculation for the number of blocks in a remote
xattr.
- check attribute length against MAX_XATTR_SIZE, not MAXPATHLEN
- whitespace fixes
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 123887e8433e58ebbcc4c91491d8b8cde31d6d79
Author: Dave Chinner <dchinner at redhat.com>
Date: Tue Apr 30 21:39:33 2013 +1000
xfs: Teach dquot recovery about CONFIG_XFS_QUOTA
Fix a build error when CONFIG_XFS_QUOTA=n:
fs/built-in.o: In function `xlog_recovery_validate_buf_type':
/home/dave/src/build/x86-64/xfsdev/fs/xfs/xfs_log_recover.c:1948: undefined
reference to `xfs_dquot_buf_ops'
Reported-by: Michael L. Semon <mlsemon35 at gmail.com>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit e721f504cf46a0c84741ba2137d7a052d79436db
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:32 2013 +1100
xfs: implement extended feature masks
The version 5 superblock has extended feature masks for compatible,
incompatible and read-only compatible feature sets. Implement the
masking and mount-time checking for these feature masks.
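A minimal sketch of the mount-time check described above, assuming the usual ext4-style semantics (unknown compatible bits are ignored, unknown read-only compatible bits permit only a read-only mount, unknown incompatible bits refuse the mount). The struct, the KNOWN_* masks and the function name are hypothetical and are not taken from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature fields as they might appear in a v5 superblock. */
struct sb_features_sketch {
    uint32_t compat;     /* safe to ignore if unknown */
    uint32_t ro_compat;  /* unknown bits: read-only mount only */
    uint32_t incompat;   /* unknown bits: refuse to mount */
};

/* Made-up masks of the bits this "kernel" understands. */
#define KNOWN_RO_COMPAT  0x00000001u
#define KNOWN_INCOMPAT   0x00000003u

/* Returns true if the mount may proceed in the requested mode. */
static bool features_allow_mount(const struct sb_features_sketch *sb,
                                 bool readonly)
{
    /* Any unknown incompatible feature means we cannot interpret the fs. */
    if (sb->incompat & ~KNOWN_INCOMPAT)
        return false;

    /* Unknown ro-compat features are only safe if we never write. */
    if ((sb->ro_compat & ~KNOWN_RO_COMPAT) && !readonly)
        return false;

    /* Unknown compat bits are ignored by definition. */
    return true;
}
```

The compat field is deliberately never tested: by definition an older kernel can ignore those bits.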
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 04a1e6c5b222b089c6960dfc5352002002a4355f
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:31 2013 +1100
xfs: add CRC checks to the superblock
With the addition of CRCs, there is such a wide and varied change to
the on disk format that it makes sense to bump the superblock
version number rather than try to use feature bits for all the new
functionality.
This commit introduces all the new superblock fields needed for all
the new functionality: feature masks similar to ext4, separate
project quota inodes, a LSN field for recovery and the CRC field.
This commit does not bump the superblock version number, however.
That will be done as a separate commit at the end of the series
after all the new functionality is present so we switch it all on in
one commit. This means that we can slowly introduce the changes
without them being active and hence maintain bisectability of the
tree.
This patch is based on a patch originally written by myself back
from SGI days, which was subsequently modified by Christoph Hellwig.
There is relatively little of that patch remaining, but the history
of the patch still should be acknowledged here.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 61fe135c1dde112f483bba01d645debd881b5428
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:30 2013 +1100
xfs: buffer type overruns blf_flags field
The buffer type passed to log recovery in the buffer log item
overruns the blf_flags field. I had assumed that flags field was a
32 bit value, and it turns out it is an unsigned short. Therefore
having 19 flags doesn't really work.
Convert the buffer type field to numeric value, and use the top 5
bits of the flags field for it. We currently have 17 types of
buffers, so using 5 bits gives us plenty of room for expansion in
future....
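As an illustration of the encoding described above, here is a sketch that packs a numeric type into the top 5 bits of a 16-bit flags word. The macro and function names are hypothetical; only the field width (unsigned short) and the 5-bit type come from the commit message:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits 0-10 are ordinary flags, bits 11-15 the type. */
#define BLF_TYPE_BITS   5
#define BLF_TYPE_SHIFT  (16 - BLF_TYPE_BITS)
#define BLF_TYPE_MASK   ((uint16_t)(((1u << BLF_TYPE_BITS) - 1u) << BLF_TYPE_SHIFT))

/* Store a numeric buffer type, leaving the low flag bits untouched. */
static inline uint16_t blf_set_type(uint16_t flags, unsigned int type)
{
    assert(type < (1u << BLF_TYPE_BITS));   /* 5 bits: values 0..31 */
    return (uint16_t)((flags & ~BLF_TYPE_MASK) | (type << BLF_TYPE_SHIFT));
}

/* Recover the numeric buffer type from the flags word. */
static inline unsigned int blf_get_type(uint16_t flags)
{
    return (unsigned int)(flags & BLF_TYPE_MASK) >> BLF_TYPE_SHIFT;
}
```

Five bits give 32 distinct values, which is where the room for expansion beyond 17 types comes from.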
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit d75afeb3d302019527331520a2632b6614425b40
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:29 2013 +1100
xfs: add buffer types to directory and attribute buffers
Add buffer types to the buffer log items so that log recovery can
validate the buffers and calculate CRCs correctly after the buffers
are recovered.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit d2e448d5fdebdcda93ed171339a3d864f65c227e
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:28 2013 +1100
xfs: add CRC protection to remote attributes
There are two ways of doing this - the first is to add a CRC to the
remote attribute entry in the attribute block. The second is to
treat them similar to the remote symlink, where each fragment has
its own header and identifies fragment location in the attribute.
The problem with the CRC in the remote attr entry is that we cannot
identify the owner of the metadata from the metadata blocks
themselves, or where the blocks fit into the remote attribute. The
down side to this approach is that we never know when the attribute
has been read from disk or not and so we have to verify it every
time it is read, and we must calculate it during the create
transaction and log it. We do not log CRCs for any other metadata,
and so this creates a unique set of coherency problems that, in
general, are best avoided.
Adding an identifying header to each allocated block allows us to
identify each fragment and where in the attribute it is located. It
enables us to rebuild the remote attribute from just the raw blocks
containing the attribute. It also allows us to do per-block CRC
verification at IO time rather than during the transaction context
that creates it or every time it is read into a user buffer. Hence
it avoids all the problems that an external, logged CRC has, and
provides all the benefits of self identifying metadata.
The only complexity is that we have to add a header per fragment,
and we don't know how many fragments will be needed prior to
allocations. If we take the symlink example, the header is 56 bytes
and hence for a 4k block size filesystem, in the worst case 16
headers requires 1 extra block for the 64k attribute data. For 512
byte filesystems the worst case is an extra block for every 9
fragments (i.e. 16 extra blocks in the worst case). This will be
very rare and so it's not really a major concern.
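The worst-case arithmetic above can be checked with a short sketch. It assumes one 56 byte header per block (every fragment is a single block) and a 64k attribute; the function names are made up for illustration:

```c
#include <assert.h>

#define HDR_BYTES 56u   /* header size quoted from the symlink example */

/* Blocks needed to hold attr_bytes when each block loses hdr_bytes. */
static unsigned int blocks_needed(unsigned int attr_bytes,
                                  unsigned int block_bytes,
                                  unsigned int hdr_bytes)
{
    unsigned int payload;

    assert(block_bytes > hdr_bytes);
    payload = block_bytes - hdr_bytes;
    return (attr_bytes + payload - 1) / payload;   /* round up */
}

/* Extra blocks caused purely by the per-block headers. */
static unsigned int extra_blocks(unsigned int attr_bytes,
                                 unsigned int block_bytes)
{
    return blocks_needed(attr_bytes, block_bytes, HDR_BYTES) -
           blocks_needed(attr_bytes, block_bytes, 0);
}
```

With these assumptions, extra_blocks(65536, 4096) is 1 and extra_blocks(65536, 512) is 16, matching the figures in the text.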
Because allocation is done in two steps - the first finds a hole
large enough in the attribute file, the second does the allocation -
we only need to find a hole big enough for a worst case allocation.
We only need to allocate enough extra blocks for the number of headers
required by the fragments, and we can calculate that as we go....
Hence it really only makes sense to use the same model as for
symlinks - it doesn't add that much complexity, does not require an
attribute tree format change, and does not require logging
calculated CRC values.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 95920cd6ce1c9cd8d3a0f639a674aa26c974ed57
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:27 2013 +1100
xfs: split remote attribute code out
Adding CRC support to remote attributes adds a significant amount of
remote attribute specific code. Split the existing remote attribute
code out into its own file so that all the relevant remote
attribute code is in a single, easy to find place.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 517c22207b045993a6529e1f8684095adaae9cf3
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 24 18:58:55 2013 +1000
xfs: add CRCs to attr leaf blocks
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit f5ea110044fa858925a880b4fa9f551bfa2dfc38
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 24 18:58:02 2013 +1000
xfs: add CRCs to dir2/da node blocks
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 6b2647a12a00bdad431ac1e9049c5e8579aa7869
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:24 2013 +1100
xfs: shortform directory offsets change for dir3 format
Because the header size for the CRC enabled directory blocks is
larger, the offset of the first entry into a directory block is
different to the dir2 format. The shortform directory stores the
dirent's offset so that it doesn't change when moving from shortform
to block form and back again, and hence it needs to take into
account the different header sizes to maintain the correct offsets.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 24df33b45ecf5ca413ef1530e0aca5506d9be2cc
Author: Dave Chinner <dchinner at redhat.com>
Date: Fri Apr 12 07:30:21 2013 +1000
xfs: add CRC checking to dir2 leaf blocks
This addition follows the same pattern as the dir2 block CRCs.
Seeing as both LEAF1 and LEAFN types need to be changed at the same
time, this is a pretty large amount of change. Leaf block headers
need to be abstracted away from the on-disk structures (struct
xfs_dir3_icleaf_hdr), as do the base leaf entry locations.
This header abstraction allows the in-core header and leaf entry
location to be passed around instead of the leaf block itself. This
saves a lot of converting individual variables from on-disk format
to host format where they are used, so there's a good chance that
the compiler will be able to produce much more optimal code as it's
not having to byteswap variables all over the place.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 33363feed1614def83d0a6870051f0a7828cd61b
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:22 2013 +1100
xfs: add CRC checking to dir2 data blocks
This addition follows the same pattern as the dir2 block CRCs.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit cbc8adf89724b961c08b823d8bfb6dadbfa8733d
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:21 2013 +1100
xfs: add CRC checking to dir2 free blocks
This addition follows the same pattern as the dir2 block CRCs, but
with a few differences. The main difference is that the free block
header is different between the v2 and v3 formats, so an "in-core"
free block header has been added and _todisk/_from_disk functions
used to abstract the differences in structure format from the code.
This is similar to the on-disk superblock versus the in-core
superblock setup. The in-core structure is populated when the buffer
is read from disk; all the in-memory checks and modifications are
done on the in-core version of the structure, which is written back
to the buffer before the buffer is logged.
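A sketch of the in-core header pattern described above. The two on-disk layouts, the field names and the function names are hypothetical simplifications (the real free block headers carry more fields); the only properties relied on are that the formats differ and that on-disk fields are big-endian:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical on-disk layouts; every field is a big-endian byte array. */
struct free_hdr_v2 { uint8_t magic[4]; uint8_t nvalid[4]; };
struct free_hdr_v3 { uint8_t magic[4]; uint8_t crc[4]; uint8_t nvalid[4]; };

/* Host-endian view that the rest of the code works on. */
struct free_hdr_incore { uint32_t magic; uint32_t nvalid; };

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Populate the in-core header once, when the buffer is read. */
static void free_hdr_from_disk(struct free_hdr_incore *to, const void *buf,
                               bool is_v3)
{
    if (is_v3) {
        const struct free_hdr_v3 *h = buf;
        to->magic = get_be32(h->magic);
        to->nvalid = get_be32(h->nvalid);
    } else {
        const struct free_hdr_v2 *h = buf;
        to->magic = get_be32(h->magic);
        to->nvalid = get_be32(h->nvalid);
    }
}

/* Write the in-core header back before the buffer is logged. */
static void free_hdr_to_disk(void *buf, const struct free_hdr_incore *from,
                             bool is_v3)
{
    if (is_v3) {
        struct free_hdr_v3 *h = buf;
        put_be32(h->magic, from->magic);
        put_be32(h->nvalid, from->nvalid);   /* crc is left untouched */
    } else {
        struct free_hdr_v2 *h = buf;
        put_be32(h->magic, from->magic);
        put_be32(h->nvalid, from->nvalid);
    }
}
```

Callers only ever see struct free_hdr_incore, so the format difference is confined to these two functions.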
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit f5f3d9b0161633e8943520e83df634ad540b3b7f
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:20 2013 +1100
xfs: add CRC checks to block format directory blocks
Now that directory buffers are made from a single struct xfs_buf, we
can add CRC calculation and checking callbacks. While there, add all
the fields to the on disk structures for future functionality such
as d_type support, uuids, block numbers, owner inode, etc.
To distinguish between the different on disk formats, change the
magic numbers for the new format directory blocks.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit f948dd76dde021c050c7c35720dc066a8b9a5e35
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:19 2013 +1100
xfs: add CRC checks to remote symlinks
Add a header to the remote symlink block, containing location and
owner information, as well as CRCs and LSN fields. This requires
verifiers to be added to the remote symlink buffers for CRC enabled
filesystems.
This also fixes a bug reading multiple block symlinks, where the second
block overwrites the first block when copying out the link name.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 19de7351a8eb82dc99745e60e8f43474831d99c7
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:18 2013 +1100
xfs: split out symlink code into it's own file.
The symlink code is about to get more complicated when CRCs are
added for remote symlink blocks. The symlink management code is
mostly self contained, so move it to its own files so that all the
new code and the existing symlink code will not be intermingled
with other unrelated code.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 93848a999cf9b9e4f4f77dba843a48c393f33c59
Author: Christoph Hellwig <hch at lst.de>
Date: Wed Apr 3 16:11:17 2013 +1100
xfs: add version 3 inode format with CRCs
Add a new inode version with a larger core. The primary objective is
to allow for a crc of the inode, and location information (uuid and ino)
to verify it was written in the right place. We also extend it by:
a creation time (for Samba);
a changecount (for NFSv4);
a flush sequence (in LSN format for recovery);
an additional inode flags field; and
some additional padding.
These additional fields are not implemented yet, but already laid
out in the structure.
[dchinner at redhat.com] Added LSN and flags field, some factoring and rework to
capture all the necessary information in the crc calculation.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 3fe58f30b4fc3f8a9084b035a02bc0c67bee8d00
Author: Christoph Hellwig <hch at lst.de>
Date: Wed Apr 3 16:11:16 2013 +1100
xfs: add CRC checks for quota blocks
Use the reserved space in struct xfs_dqblk to store a UUID and a crc
for the quota blocks.
[dchinner at redhat.com] Add a LSN field and update for current verifier
infrastructure.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 983d09ffe396ed5d5339a1b9ff994dd0b0f2069f
Author: Dave Chinner <dgc at sgi.com>
Date: Wed Apr 3 16:11:15 2013 +1100
xfs: add CRC checks to the AGI
The same set of changes made to the AGF need to be made to the AGI.
This patch has a similar history to the AGF, hence a similar
sign-off chain.
Signed-off-by: Dave Chinner <dgc at sgi.com>
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dgc at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 77c95bba013089fa868217283eb6d98a05913e53
Author: Christoph Hellwig <hch at lst.de>
Date: Wed Apr 3 16:11:14 2013 +1100
xfs: add CRC checks to the AGFL
Add CRC checks, location information and a magic number to the AGFL.
Previously the AGFL was just a block containing nothing but the
free block pointers. The new AGFL has a real header with the usual
boilerplate instead, so that we can verify it's not corrupted and
written into the right place.
[dchinner at redhat.com] Added LSN field, reworked significantly to fit
into new verifier structure and growfs structure, enabled full
verifier functionality now there is a header to verify and we can
guarantee an initialised AGFL.
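To illustrate what a header with "the usual boilerplate" lets a verifier check, here is a sketch that validates a magic number, the filesystem UUID and the expected location. The struct layout, the magic value and all names are invented for this example and do not reflect the real AGFL format:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_AGFL_MAGIC 0x534B4554u   /* made-up value, not the real magic */

/* Hypothetical self-describing header placed at the start of the block. */
struct agfl_hdr_sketch {
    uint32_t magic;     /* identifies the block type */
    uint32_t seqno;     /* which AG this block belongs to */
    uint8_t  uuid[16];  /* which filesystem this block belongs to */
};

/* Returns true only if the block is the right type, fs and location. */
static bool agfl_hdr_verify(const struct agfl_hdr_sketch *hdr,
                            const uint8_t fs_uuid[16],
                            uint32_t expected_agno)
{
    if (hdr->magic != SKETCH_AGFL_MAGIC)
        return false;               /* not this kind of block at all */
    if (memcmp(hdr->uuid, fs_uuid, 16) != 0)
        return false;               /* belongs to another filesystem */
    return hdr->seqno == expected_agno;   /* written to the right place? */
}
```

A CRC check would normally run alongside these tests; it is omitted here to keep the sketch short.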
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 4e0e6040c4052aff15a494ac05778f4086d24c33
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:13 2013 +1100
xfs: add CRC checks to the AGF
The AGF already has some self identifying fields (e.g. the sequence
number) so we only need to add the uuid to it to identify the
filesystem it belongs to. The location is fixed based on the
sequence number, so there's no need to add a block number, either.
Hence the only additional fields are the CRC and LSN fields. These
are unlogged, so place some space between the end of the logged
fields and them so that future expansion of the AGF for logged
fields can be placed adjacent to the existing logged fields and
hence not complicate the field-derived range based logging we
currently have.
Based originally on a patch from myself, modified further by
Christoph Hellwig and then modified again to fit into the
verifier structure with additional fields by myself. The multiple
signed-off-by tags indicate the age and history of this patch.
Signed-off-by: Dave Chinner <dgc at sgi.com>
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit ee1a47ab0e77600fcbdf1c87d461bd8f3f63150d
Author: Christoph Hellwig <hch at lst.de>
Date: Sun Apr 21 14:53:46 2013 -0500
xfs: add support for large btree blocks
Add support for larger btree blocks that contain a CRC32C checksum,
a filesystem uuid and block number for detecting filesystem
consistency and out of place writes.
[dchinner at redhat.com] Also include an owner field to allow reverse
mappings to be implemented for improved repairability and a LSN
field so that log recovery can easily determine the last
modification that made it to disk for each buffer.
[dchinner at redhat.com] Add buffer log format flags to indicate the
type of buffer to recovery so that we don't have to do blind magic
number tests to determine what the buffer is.
[dchinner at redhat.com] Modified to fit into the verifier structure.
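For readers unfamiliar with CRC32C, the following is a small bitwise sketch of the checksum, applied to a block whose embedded CRC field is treated as zero during the calculation. Treating the field as zero is an assumption made for this example, and the function names are hypothetical; the kernel uses its own optimised crc32c implementation rather than this loop:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_update(uint32_t crc, const uint8_t *p, size_t len)
{
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Checksum a metadata block, feeding zeroes in place of the 4 byte CRC
 * field at crc_off so the stored CRC does not influence its own value.
 */
static uint32_t block_cksum(const uint8_t *blk, size_t len, size_t crc_off)
{
    static const uint8_t zero[4] = {0, 0, 0, 0};
    uint32_t crc = 0xFFFFFFFFu;

    crc = crc32c_update(crc, blk, crc_off);
    crc = crc32c_update(crc, zero, sizeof(zero));
    crc = crc32c_update(crc, blk + crc_off + 4, len - crc_off - 4);
    return (uint32_t)~crc;
}
```

Because the CRC field is excluded, the same routine can be used both to fill in the field before writing and to verify it after reading.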
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit a2050646f655a90400cbb66c3866d2e0137eee0c
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 16:11:11 2013 +1100
xfs: increase hexdump output in xfs_corruption_error
Currently xfs_corruption_error() dumps the first 16 bytes of the
buffer that is passed to it when a corruption occurs. This is not
large enough to see the entire state of the header of the block that
was determined to be corrupt. Increase the output to 64 bytes to
capture the majority of all headers in all types of metadata blocks.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 7fe3258c50de383037102129c57df5cb66ab2000
Author: Jeff Liu <jeff.liu at oracle.com>
Date: Thu Apr 4 16:07:14 2013 +0800
xfs: Update xfs_log_commit_cil() comments
The xfs_log_commit_iclog() function was removed by commit 93b8a585:
xfs: remove the deprecated nodelaylog option
Since Linux 3.3 only delayed logging is supported, so we call
xfs_log_commit_cil() from xfs_trans_commit() only. Remove the
comments that are now useless.
Signed-off-by: Jie Liu <jeff.liu at oracle.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit d4fd0e92fbcfdba7bb2c789504a957ab8f835c38
Author: Jeff Liu <jeff.liu at oracle.com>
Date: Thu Apr 4 12:10:42 2013 +0800
xfs: Remove the obsolete XLOG_CIL_HARD_SPACE_LIMIT() macros
There are no more users of this macro, so it's time to kill it dead.
Signed-off-by: Jie Liu <jeff.liu at oracle.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 666d644cd72a9ec58b353209ff191d7430f3b357
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Apr 3 14:09:21 2013 +1100
xfs: don't free EFIs before the EFDs are committed
Filesystems are occasionally being shut down with this error:
xfs_trans_ail_delete_bulk: attempting to delete a log item that is
not in the AIL.
It was diagnosed to be related to the EFI/EFD commit order when the
EFI and EFD are in different checkpoints and the EFD is committed
before the EFI here:
http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
The real problem is that a single bit cannot fully describe the
states that the EFI/EFD processing can be in. These completion
states are:
EFI                   EFI in AIL    EFD          Result
committed/unpinned    Yes           committed    OK
committed/pinned      No            committed    Shutdown
uncommitted           No            committed    Shutdown
Note that the "result" field is what should happen, not what does
happen. The current logic is broken and handles the first two cases
correctly by luck. That is, the code will free the EFI if the
XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
inverted logic "works" because if both EFI and EFD are committed,
then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
bit, and the second frees the EFI item. Hence as long as
xfs_efi_item_committed() has been called, everything appears to be
fine.
It is the third case where the logic fails - where
xfs_efd_item_committed() is called before xfs_efi_item_committed(),
and that results in the EFI being freed before it has been
committed. That is the bug that triggered the shutdown, and hence
keeping track of whether the EFI has been committed or not is
insufficient to correctly order the EFI/EFD operations w.r.t. the
AIL.
What we really want is this: the EFI is always placed into the
AIL before the last reference goes away. The only way to guarantee
that is that the EFI is not freed until after it has been unpinned
*and* the EFD has been committed. That is, restructure the logic so
that the only case that can occur is the first case.
This can be done easily by replacing the XFS_EFI_COMMITTED with an
EFI reference count. The EFI is initialised with its own count, and
that is not released until it is unpinned. However, there is a
complication to this method - the high level EFI/EFD code in
xfs_bmap_finish() does not hold direct references to the EFI
structure, and runs a transaction commit between the EFI and EFD
processing. Hence the EFI can be freed even before the EFD is
created using such a method.
Further, log recovery uses the AIL for tracking EFI/EFDs that need
to be recovered, but it uses the AIL *differently* to the EFI
transaction commit. Hence log recovery never pins or unpins EFIs, so
we can't drop the EFI reference count indirectly to free the EFI.
However, this doesn't prevent us from using a reference count here.
There is a 1:1 relationship between EFIs and EFDs, so when we
initialise the EFI we can take a reference count for the EFD as
well. This solves the xfs_bmap_finish() issue - the EFI will never
be freed until the EFD is processed. In terms of log recovery,
during the committing of the EFD we can look for the
XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
thereby ensuring everything works correctly there as well.
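The reference counting scheme described above can be sketched as follows. This is a single-threaded illustration with made-up names; the real code would need an atomic counter and the log recovery special case, both of which are left out here:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an EFI log item. */
struct efi_sketch {
    int  refcount;
    bool freed;
};

/* One reference for the EFI itself, one taken on behalf of its EFD. */
static void efi_init(struct efi_sketch *efi)
{
    efi->refcount = 2;
    efi->freed = false;
}

/* Called once when the EFI is unpinned and once when the EFD commits. */
static void efi_release(struct efi_sketch *efi)
{
    assert(efi->refcount > 0);
    if (--efi->refcount == 0)
        efi->freed = true;   /* stands in for removing from the AIL and freeing */
}
```

Whichever of the two events happens first, the EFI survives until the second one, so the ordering problem in the third row of the table cannot free the item early.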
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 3d6e036193bfa67a8a1cc1908fe910c7a014d183
Author: Rich Johnston <rjohnston at sgi.com>
Date: Wed Mar 27 09:26:49 2013 -0500
xfs: Add ratelimited printk for different alert levels
A ratelimited printk is useful for XFS messages that need to be printed
but that can occur at such a high rate that printing every instance would
overflow the kernel ring buffer.
Signed-off-by: Raghavendra D Prabhu <rprabhu at wnohang.net>
Reviewed-by: Rich Johnston <rjohnston at sgi.com>
Reviewed-by: Dave Chinner <dchinner at redhat.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit ff9a28f6c25d18a635abcab1f49db68108203dfb
Author: Jan Kara <jack at suse.cz>
Date: Thu Mar 14 14:30:54 2013 +0100
xfs: Fix WARN_ON(delalloc) in xfs_vm_releasepage()
When a dirty page is truncated from a file but reclaim gets to it before
truncate_inode_pages(), we hit WARN_ON(delalloc) in
xfs_vm_releasepage(). This is because reclaim tries to write the page,
xfs_vm_writepage() just bails out (leaving page clean) and thus reclaim
thinks it can continue and calls xfs_vm_releasepage() on page with dirty
buffers.
Fix the issue by redirtying the page in xfs_vm_writepage(). This makes
reclaim stop reclaiming the page and also logically it keeps page in a
more consistent state where page with dirty buffers has PageDirty set.
Signed-off-by: Jan Kara <jack at suse.cz>
Reviewed-by: Carlos Maiolino <cmaiolino at redhat.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 19cb7e3854c9afe2ee968cbdd92293ec09e43bf3
Author: Brian Foster <bfoster at redhat.com>
Date: Mon Mar 18 10:51:48 2013 -0400
xfs: xfs_iomap_prealloc_size() tracepoint
Add a tracepoint to provide some feedback on preallocation size
calculation.
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 76a4202a388690e664668c4f668ee12d709100b3
Author: Brian Foster <bfoster at redhat.com>
Date: Mon Mar 18 10:51:47 2013 -0400
xfs: add quota-driven speculative preallocation throttling
Introduce the need_throttle() and calc_throttle() functions to
independently check whether throttling is required for a particular
dquot and if so, calculate the associated throttling metrics based
on the state of the quota. We use the same general algorithm to
calculate the throttle shift as for global free space with the
exception of using three stages rather than five.
Update xfs_iomap_prealloc_size() to use the smallest available
prealloc size based on each of the constraints and apply the
maximum shift to obtain the throttled preallocation size.
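A rough sketch of how the two steps above could fit together. The struct, the function names and the shift amounts (2 per stage crossed) are assumptions made for illustration and are not quoted from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-dquot throttling state, all values in blocks. */
struct dquot_sketch {
    uint64_t reserved;      /* blocks currently reserved against the quota */
    uint64_t lo_wmark;      /* throttling starts here */
    uint64_t hi_wmark;      /* preallocation is disabled here (0 = no limit) */
    uint64_t low_space[3];  /* free space thresholds for the three stages */
};

/* Is this allocation close enough to the limit to need throttling? */
static bool need_throttle(const struct dquot_sketch *dq, uint64_t alloc_blocks)
{
    if (dq->hi_wmark == 0)
        return false;
    return dq->reserved + alloc_blocks >= dq->lo_wmark;
}

/* Report remaining quota space and return the throttle shift for it. */
static unsigned int calc_throttle(const struct dquot_sketch *dq,
                                  uint64_t *freesp)
{
    unsigned int shift = 0;

    if (dq->reserved >= dq->hi_wmark) {
        *freesp = 0;            /* over the high watermark: no prealloc */
        return 0;
    }
    *freesp = dq->hi_wmark - dq->reserved;
    for (int i = 0; i < 3; i++)
        if (*freesp < dq->low_space[i])
            shift += 2;         /* assumed step per stage crossed */
    return shift;
}

/* Smallest size from all constraints, then the largest shift. */
static uint64_t throttled_prealloc(uint64_t alloc_blocks,
                                   unsigned int global_shift,
                                   const struct dquot_sketch *dq)
{
    unsigned int shift = global_shift;

    if (need_throttle(dq, alloc_blocks)) {
        uint64_t freesp;
        unsigned int qshift = calc_throttle(dq, &freesp);

        if (freesp < alloc_blocks)
            alloc_blocks = freesp;
        if (qshift > shift)
            shift = qshift;
    }
    return alloc_blocks >> shift;
}
```

With several dquots (user, group, project) the same min/max step would simply be repeated for each one.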
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit b136645116e5471cf0b037a1759dc83620236631
Author: Brian Foster <bfoster at redhat.com>
Date: Mon Mar 18 10:51:46 2013 -0400
xfs: xfs_dquot prealloc throttling watermarks and low free space
Enable tracking of high and low watermarks for preallocation
throttling of files under quota restrictions. These values are
calculated when the quota limit is read from disk or modified and
cached for later use by the throttling algorithm.
The high watermark specifies when preallocation is disabled, the
low watermark specifies when throttling is enabled and the low free
space data structure contains precalculated low free space limits
to serve as input to determine the level of throttling required.
Note that the low free space data structure is based on the
existing global low free space data structure with the exception of
using three stages (5%, 3% and 1%) rather than five to reduce the
impact of xfs_dquot memory overhead.
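A sketch of how the cached values might be derived whenever the limit changes. The three stage percentages come from the text above; the struct, the function name and the idea of expressing the low watermark as a percentage parameter are assumptions for illustration:

```c
#include <stdint.h>

/* Hypothetical cached throttling values, all in blocks. */
struct prealloc_marks {
    uint64_t hi_wmark;      /* preallocation disabled at or above this */
    uint64_t lo_wmark;      /* throttling enabled at or above this */
    uint64_t low_space[3];  /* free space limits for the 5%, 3%, 1% stages */
};

/* Recompute the cached values from a quota limit (in blocks). */
static void prealloc_marks_update(struct prealloc_marks *m, uint64_t limit,
                                  unsigned int lo_pct)
{
    static const unsigned int stage_pct[3] = {5, 3, 1};

    m->hi_wmark = limit;
    /* Divide first so large limits cannot overflow the multiplication. */
    m->lo_wmark = limit / 100 * lo_pct;
    for (int i = 0; i < 3; i++)
        m->low_space[i] = limit / 100 * stage_pct[i];
}
```

Doing this once per limit change keeps the per-allocation path down to a handful of comparisons.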
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 4b6eae2e6ac8a6671839ccaea1c2e3dd5684f5df
Author: Brian Foster <bfoster at redhat.com>
Date: Mon Mar 18 10:51:45 2013 -0400
xfs: pass xfs_dquot to xfs_qm_adjust_dqlimits() instead of xfs_disk_dquot_t
Modify xfs_qm_adjust_dqlimits() to take the xfs_dquot as a
parameter instead of just the xfs_disk_dquot_t so we can update
in-memory fields if necessary.
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit c9bdbdc0741d90908f492415c890b630f43f17f8
Author: Brian Foster <bfoster at redhat.com>
Date: Mon Mar 18 10:51:44 2013 -0400
xfs: push rounddown_pow_of_two() to after prealloc throttle
The round down occurs towards the beginning of the function. Push
it down after throttling has occurred. This is to support adding
further transformations to 'alloc_blocks' that might not preserve
power-of-two alignment (and thus could lead to rounding down
multiple times).
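The ordering argument can be seen in a small sketch. The cap parameter stands in for any later transformation that does not preserve power-of-two alignment; it and the function names are hypothetical, and the helper is a local stand-in for the kernel's rounddown_pow_of_two():

```c
#include <stdint.h>

/* Largest power of two that is <= n (0 for n == 0). */
static uint64_t rounddown_pow2_sketch(uint64_t n)
{
    uint64_t p = 1;

    if (n == 0)
        return 0;
    while (p <= n / 2)
        p *= 2;
    return p;
}

/* Apply every transformation first, then round down exactly once. */
static uint64_t prealloc_blocks_sketch(uint64_t alloc_blocks,
                                       unsigned int shift, uint64_t cap)
{
    if (alloc_blocks > cap)
        alloc_blocks = cap;     /* may break power-of-two alignment */
    alloc_blocks >>= shift;     /* throttle */
    return rounddown_pow2_sketch(alloc_blocks);
}
```

Rounding at the start instead would have to be repeated after the cap, which is the double rounding the commit message wants to avoid.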
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 3c58b5f809eda8ae8d891b7a87d0a235ab0f9bf5
Author: Brian Foster <bfoster at redhat.com>
Date: Mon Mar 18 10:51:43 2013 -0400
xfs: reorganize xfs_iomap_prealloc_size to remove indentation
The majority of xfs_iomap_prealloc_size() executes within the
check for lack of default I/O size. Reverse the logic to remove the
extra indentation.
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 56cea2d088811b8cf7d2893e29bcf369a912de69
Author: Christoph Hellwig <hch at lst.de>
Date: Tue Mar 12 23:30:36 2013 +1100
xfs: take inode version into account in XFS_LITINO
Add a version argument to XFS_LITINO so that it can return different values
depending on the inode version. This is required for the upcoming v3 inodes
with a larger fixed layout dinode.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Ben Myers <bpm at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
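A minimal sketch of a version-aware literal area calculation, in the spirit of the XFS_LITINO change above. The SKETCH_ names are hypothetical, and the fixed dinode sizes used here (100 bytes before v3, 176 bytes for v3) are illustrative assumptions rather than values taken from the commit message.

```c
#include <stddef.h>

/* Assumed fixed on-disk inode header sizes, for illustration only. */
#define SKETCH_DINODE_SIZE_V2	100
#define SKETCH_DINODE_SIZE_V3	176

static size_t sketch_dinode_size(int version)
{
	return version == 3 ? SKETCH_DINODE_SIZE_V3 : SKETCH_DINODE_SIZE_V2;
}

/* Bytes left in the inode for the data and attribute forks. */
#define SKETCH_LITINO(inodesize, version) \
	((int)((inodesize) - sketch_dinode_size(version)))
```

With the version passed in, callers get a smaller literal area for the larger v3 header without needing a separate macro.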
commit c163f9a1760229a95d04e37b332de7d5c1c225cd
Author: Dave Chinner <dchinner at redhat.com>
Date: Tue Mar 12 23:30:34 2013 +1100
xfs: ensure we capture IO errors correctly
Failed buffer readahead can leave the buffer in the cache marked
with an error. Most callers that then issue a subsequent read on the
buffer do not zero the b_error field out, and so we may incorrectly
detect an error during IO completion due to the stale error value
left on the buffer.
Avoid this problem by zeroing the error before IO submission. This
ensures that the only IO errors that are detected are those captured
from bio submission or completion.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
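The stale error problem described above can be sketched like this. The structure and function names are hypothetical stand-ins; only the b_error field name comes from the commit message.

```c
/* Hypothetical stand-in for a cached buffer. */
struct sketch_buf {
	int b_error;	/* last IO error recorded on the buffer */
};

/* Illustrative IO backends: one succeeds, one fails with -5 (-EIO). */
static int sketch_io_ok(void)   { return 0; }
static int sketch_io_fail(void) { return -5; }

static int sketch_submit_read(struct sketch_buf *bp, int (*do_io)(void))
{
	int error;

	/* Drop any stale error left behind by an earlier failed readahead. */
	bp->b_error = 0;

	error = do_io();
	if (error)
		bp->b_error = error;	/* only errors from this IO survive */
	return bp->b_error;
}
```

Clearing the field at submission time means callers do not each have to remember to reset it before reusing a cached buffer.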
commit d8ddfe81c7e4fe41b8ec342cc288d58aecdf7c47
Author: Jeff Liu <jeff.liu at oracle.com>
Date: Mon Mar 11 14:31:02 2013 +0800
xfs: Remove obsoleted m_inode_shrink from xfs_mount structure
The old m_inode_shrink appears to be obsolete now that inode reclaim is performed per AG via
m_reclaim_workqueue, so this patch removes it from the xfs_mount structure.
Signed-off-by: Jie Liu <jeff.liu at oracle.com>
Cc: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit 9e5987a7792194ec338f53643237150c0db5f5e0
Author: Dave Chinner <dchinner at redhat.com>
Date: Mon Feb 25 12:31:26 2013 +1100
xfs: rearrange some code in xfs_bmap for better locality
xfs_bmap.c is a big file, and some of the related code is spread all
throughout the file, requiring function prototypes for static
functions and jumping all through the file to follow a single call
path. Rearrange the code so that:
a) related functionality is grouped together; and
b) functions are grouped in call dependency order
While the diffstat is large, there are no code changes in the patch;
it is just moving the functionality around and removing the function
prototypes at the top of the file. The resulting layout of the code
is as follows (top of file to bottom):
- miscellaneous helper functions
- extent tree block counting routines
- debug/sanity checking code
- bmap free list manipulation functions
- inode fork format manipulation functions
- internal/external extent tree search functions
- extent tree manipulation functions used during allocation
- functions used during extent read/allocate/removal
operations (i.e. xfs_bmapi_write, xfs_bmapi_read,
xfs_bunmapi and xfs_getbmap)
This means that following logic paths through the bmapi code is much
simpler - most of the code relevant to a specific operation is now
clustered together rather than spread all over the file....
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit ecb3403de1efb56f78d9093376aec0a8af76b316
Author: Akinobu Mita <akinobu.mita at gmail.com>
Date: Mon Mar 4 21:58:20 2013 +0900
xfs: rename random32() to prandom_u32()
Use the preferred function name, which implies the use of a pseudo-random
number generator.
Signed-off-by: Akinobu Mita <akinobu.mita at gmail.com>
Acked-by: <bpm at sgi.com>
Cc: Ben Myers <bpm at sgi.com>
Cc: Alex Elder <elder at kernel.org>
Cc: xfs at oss.sgi.com
Signed-off-by: Ben Myers <bpm at sgi.com>
commit d5929de8337fef46f3e307914ed0f3cb845e66c1
Author: Dave Chinner <dchinner at redhat.com>
Date: Wed Feb 27 13:25:54 2013 +1100
xfs: don't verify buffers after IO errors
When we read a buffer, we might get an error from the underlying
block device and not the real data. Hence if we get an IO error, we
shouldn't run the verifier but instead just pass the IO error
straight through.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Mark Tinguely <tinguely at sgi.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
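A minimal sketch of the read completion logic described above, using hypothetical names: the verifier only runs when the IO itself succeeded, so an IO error is passed straight through instead of being masked by a verification failure on garbage data.

```c
/* Hypothetical stand-in for a buffer with an attached read verifier. */
struct sketch_vbuf {
	int b_error;		/* IO error, 0 on success */
	int verify_calls;	/* counts verifier runs, for illustration */
	void (*verify_read)(struct sketch_vbuf *bp);
};

/* Illustrative verifier that just records that it ran. */
static void sketch_count_verify(struct sketch_vbuf *bp)
{
	bp->verify_calls++;
}

static void sketch_read_done(struct sketch_vbuf *bp)
{
	/* Contents are not trustworthy after an IO error: skip verification. */
	if (!bp->b_error && bp->verify_read)
		bp->verify_read(bp);
}
```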
commit e8108cedb1c5d1dc359690d18ca997e97a0061d2
Author: Mark Tinguely <tinguely at sgi.com>
Date: Sun Feb 24 13:04:37 2013 -0600
xfs: fix xfs_iomap_eof_prealloc_initial_size type
Fix the return type of xfs_iomap_eof_prealloc_initial_size() to
xfs_fsblock_t to reflect the fact that the return value may be an
unsigned 64-bit value if XFS_BIG_BLKNOS is defined.
Signed-off-by: Mark Tinguely <tinguely at sgi.com>
Reviewed-by: Dave Chinner <dchinner at redhat.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
commit e114b5fce6befb8fa345d7cf1a4de8ce5a211910
Author: Brian Foster <bfoster at redhat.com>
Date: Tue Feb 19 10:24:41 2013 -0500
xfs: increase prealloc size to double that of the previous extent
The updated speculative preallocation algorithm for handling sparse
files can become less effective in situations with a high number of
concurrent, sequential writers. The number of writers and amount of
available RAM affect the writeback bandwidth slicing algorithm,
which in turn affects the block allocation pattern of XFS. For
example, running 32 sequential writers on a system with 32GB RAM,
preallocs become fixed at a value of around 128MB (instead of
steadily increasing to the 8GB maximum as sequential writes
proceed).
Update the speculative prealloc heuristic to base the size of the
next prealloc on double the size of the preceding extent. This
preserves the original aggressive speculative preallocation
behavior and continues to accommodate sparse files at a slight cost
of increasing the size of preallocated data regions following holes
of sparse files.
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Dave Chinner <dchinner at redhat.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
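The doubling heuristic above can be sketched as a small helper. The function name and the explicit max_blocks parameter are hypothetical; the real code works from the preceding extent in the inode's extent list, which this sketch does not model.

```c
#include <stdint.h>

/*
 * Size the next speculative preallocation at double the preceding
 * extent, capped at a maximum (both in filesystem blocks).
 */
static uint64_t sketch_next_prealloc(uint64_t prev_extent_blocks,
				     uint64_t max_blocks)
{
	/* Compare against half the cap so the doubling cannot overflow. */
	if (prev_extent_blocks > max_blocks / 2)
		return max_blocks;
	return prev_extent_blocks * 2;
}
```

Because each size depends only on the previous extent, a sequential writer keeps growing toward the maximum regardless of how writeback slices bandwidth between concurrent writers.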
commit e78c420bfc2608bb5f9a0b9165b1071c1e31166a
Author: Brian Foster <bfoster at redhat.com>
Date: Fri Feb 22 13:32:56 2013 -0500
xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
If freesp == 0, we could end up in an infinite loop while squashing
the preallocation. Break the loop when we've killed the prealloc
entirely.
Signed-off-by: Brian Foster <bfoster at redhat.com>
Reviewed-by: Dave Chinner <dchinner at redhat.com>
Signed-off-by: Ben Myers <bpm at sgi.com>
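To see why freesp == 0 is a problem, consider this sketch of a squashing loop. The function name and the shift of four bits per iteration are assumptions for illustration. Without the alloc_blocks check, the condition "0 >= 0" stays true forever once the preallocation has been shifted down to zero.

```c
#include <stdint.h>

static uint64_t sketch_squash_prealloc(uint64_t alloc_blocks, uint64_t freesp)
{
	/*
	 * Shrink the preallocation until it fits in the free space.
	 * Testing alloc_blocks first terminates the loop once the
	 * prealloc has been killed entirely, even when freesp is 0.
	 */
	while (alloc_blocks && alloc_blocks >= freesp)
		alloc_blocks >>= 4;
	return alloc_blocks;
}
```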
-----------------------------------------------------------------------
Summary of changes:
fs/xfs/Makefile | 6 +-
fs/xfs/xfs_ag.h | 56 +-
fs/xfs/xfs_alloc.c | 201 +-
fs/xfs/xfs_alloc_btree.c | 105 +-
fs/xfs/xfs_alloc_btree.h | 12 +-
fs/xfs/xfs_aops.c | 12 +-
fs/xfs/xfs_attr.c | 454 +--
fs/xfs/xfs_attr.h | 1 -
fs/xfs/xfs_attr_leaf.c | 1781 +++++----
fs/xfs/xfs_attr_leaf.h | 122 +-
fs/xfs/xfs_attr_remote.c | 541 +++
fs/xfs/xfs_attr_remote.h | 46 +
fs/xfs/xfs_bmap.c | 9214 ++++++++++++++++++++++-----------------------
fs/xfs/xfs_bmap_btree.c | 110 +-
fs/xfs/xfs_bmap_btree.h | 19 +-
fs/xfs/xfs_btree.c | 256 +-
fs/xfs/xfs_btree.h | 64 +-
fs/xfs/xfs_buf.c | 4 +-
fs/xfs/xfs_buf_item.h | 64 +-
fs/xfs/xfs_da_btree.c | 1501 +++++---
fs/xfs/xfs_da_btree.h | 130 +-
fs/xfs/xfs_dinode.h | 43 +-
fs/xfs/xfs_dir2_block.c | 179 +-
fs/xfs/xfs_dir2_data.c | 266 +-
fs/xfs/xfs_dir2_format.h | 278 +-
fs/xfs/xfs_dir2_leaf.c | 898 +++--
fs/xfs/xfs_dir2_node.c | 1007 +++--
fs/xfs/xfs_dir2_priv.h | 50 +-
fs/xfs/xfs_dir2_sf.c | 12 +-
fs/xfs/xfs_dquot.c | 160 +-
fs/xfs/xfs_dquot.h | 16 +-
fs/xfs/xfs_error.c | 4 +-
fs/xfs/xfs_extfree_item.c | 27 +-
fs/xfs/xfs_extfree_item.h | 14 +-
fs/xfs/xfs_file.c | 2 +-
fs/xfs/xfs_fsops.c | 34 +-
fs/xfs/xfs_ialloc.c | 109 +-
fs/xfs/xfs_ialloc_btree.c | 87 +-
fs/xfs/xfs_ialloc_btree.h | 9 +-
fs/xfs/xfs_inode.c | 212 +-
fs/xfs/xfs_inode.h | 31 +-
fs/xfs/xfs_inode_item.c | 2 +-
fs/xfs/xfs_iomap.c | 163 +-
fs/xfs/xfs_linux.h | 1 +
fs/xfs/xfs_log.c | 2 +-
fs/xfs/xfs_log_cil.c | 4 -
fs/xfs/xfs_log_priv.h | 1 -
fs/xfs/xfs_log_recover.c | 246 +-
fs/xfs/xfs_message.h | 26 +
fs/xfs/xfs_mount.c | 146 +-
fs/xfs/xfs_mount.h | 2 +-
fs/xfs/xfs_qm.c | 25 +-
fs/xfs/xfs_qm.h | 4 +-
fs/xfs/xfs_qm_syscalls.c | 9 +-
fs/xfs/xfs_quota.h | 11 +-
fs/xfs/xfs_sb.h | 166 +-
fs/xfs/xfs_symlink.c | 730 ++++
fs/xfs/xfs_symlink.h | 66 +
fs/xfs/xfs_trace.c | 2 +-
fs/xfs/xfs_trace.h | 24 +
fs/xfs/xfs_trans_buf.c | 63 +-
fs/xfs/xfs_trans_dquot.c | 10 +-
fs/xfs/xfs_vnodeops.c | 478 +--
63 files changed, 12022 insertions(+), 8296 deletions(-)
create mode 100644 fs/xfs/xfs_attr_remote.c
create mode 100644 fs/xfs/xfs_attr_remote.h
create mode 100644 fs/xfs/xfs_symlink.c
create mode 100644 fs/xfs/xfs_symlink.h
hooks/post-receive
--
XFS development tree