This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".
The branch, master has been updated
f936972 xfs: improve xfs_isilocked
070ecdc xfs: skip writeback from reclaim context
5b257b4 xfs: fix race in inode cluster freeing failing to stale inodes
from fb3b504adeee942e55393396fea8fdf406acf037 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit f9369729496a0f4c607a4cc1ea4dfeddbbfc505a
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Thu Jun 3 16:22:29 2010 +1000
xfs: improve xfs_isilocked
Use rwsem_is_locked to make the assertions for shared locks work.
Signed-off-by: Christoph Hellwig <hch@xxxxxx>
Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
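As an illustration only, the idea can be sketched in a small userspace
model. This is not the kernel code: the struct, the flag values and the
model_* names are all hypothetical, and model_is_locked() merely stands in
for rwsem_is_locked(). It assumes an exclusive-only query is answered by a
writer flag, while any query that permits shared mode is satisfied by the
lock being held at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, used by this model only. */
#define MODEL_ILOCK_EXCL   0x1u
#define MODEL_ILOCK_SHARED 0x2u

/* Toy reader/writer lock state: one writer or any number of readers. */
struct model_rwlock {
    bool writer;          /* held exclusively */
    unsigned int readers; /* number of shared holders */
};

/* Stand-in for rwsem_is_locked(): true if held in any mode. */
bool model_is_locked(const struct model_rwlock *l)
{
    assert(l != NULL);
    return l->writer || l->readers > 0;
}

/*
 * Model of the lock-state query. An exclusive-only query needs the writer
 * flag. A query that allows shared mode passes if the lock is held at all,
 * which is what makes shared-lock assertions checkable.
 */
bool model_isilocked(const struct model_rwlock *l, unsigned int flags)
{
    assert(l != NULL);
    if (!(flags & (MODEL_ILOCK_EXCL | MODEL_ILOCK_SHARED)))
        return false;
    if (!(flags & MODEL_ILOCK_SHARED))
        return l->writer;
    return model_is_locked(l);
}
```

Note that "held at all" cannot tell which task holds the lock, so in this
model a shared query also passes when some other holder has it.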
commit 070ecdca54dde9577d2697088e74e45568f48efb
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Thu Jun 3 16:22:29 2010 +1000
xfs: skip writeback from reclaim context
Allowing writeback from reclaim context causes massive problems with stack
overflows as we can call into the writeback code which tends to be a heavy
stack user both in the generic code and XFS from random contexts that
perform memory allocations.
Follow the example of btrfs (and in slightly different form ext4) and refuse
to write out data from reclaim context. This issue should really be handled
by the VM so that we can tune better for this case, but until we get it
sorted out there we have to hack around this in each filesystem with a
complex writeback path.
Signed-off-by: Christoph Hellwig <hch@xxxxxx>
Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
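A minimal userspace sketch of the behaviour described above, for
illustration only. The task and page structs, the MODEL_PF_MEMALLOC flag
and the function name are hypothetical; the model assumes reclaim context
is identified by a per-task flag and that refusing writeback means leaving
the page dirty for a later, safer context to write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task flag meaning "running in memory reclaim". */
#define MODEL_PF_MEMALLOC 0x1u

enum model_wb_result {
    MODEL_WB_WRITTEN,   /* page was written out */
    MODEL_WB_REDIRTIED  /* writeback refused, page left dirty */
};

struct model_task { unsigned int flags; };
struct model_page { bool dirty; };

/*
 * Model of a writepage entry point: refuse to start writeback when called
 * from reclaim context, so the deep writeback call chain is never entered
 * on top of an already deep allocation stack.
 */
enum model_wb_result model_writepage(const struct model_task *t,
                                     struct model_page *p)
{
    assert(t != NULL && p != NULL);
    if (t->flags & MODEL_PF_MEMALLOC) {
        p->dirty = true;            /* keep it for a later writer */
        return MODEL_WB_REDIRTIED;
    }
    p->dirty = false;               /* pretend the IO completed */
    return MODEL_WB_WRITTEN;
}
```

In this model a regular writeback caller still cleans the page; only the
reclaim caller is turned away.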
commit 5b257b4a1f9239624c6b5e669763de04e482c2b3
Author: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu Jun 3 16:22:29 2010 +1000
xfs: fix race in inode cluster freeing failing to stale inodes
When an inode cluster is freed, it needs to mark all inodes in memory as
XFS_ISTALE before marking the buffer as stale. This is needed because the
inodes have a different life cycle to the buffer, and once the buffer is
torn down during transaction completion, we must ensure none of the inodes
get written back (which is what XFS_ISTALE does).
Unfortunately, xfs_ifree_cluster() has some bugs that lead to inodes not
being marked with XFS_ISTALE. This shows up when xfs_iflush() is called on
these inodes either during inode reclaim or tail pushing on the AIL. The
buffer is read back, but no longer contains inodes and so triggers assert
failures and shutdowns. This was reproducible with a run.dbench10
invocation from xfstests.
There are two main causes of xfs_ifree_cluster() failing. The first is
simple: it checks in-memory inodes it finds in the per-ag icache to see if
they are clean without holding the flush lock. If they are clean, it skips
them completely. However, if an inode is flushed delwri, it will appear
clean, but is not guaranteed to be written back until the flush lock has
been dropped. Hence we may have raced on the clean check and the inode may
actually be dirty. Hence always mark inodes found in memory stale before we
check properly if they are clean.
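The difference between the two orderings can be sketched with a small
single-threaded model. It is illustrative only: the struct fields and the
model_* functions are hypothetical, and the race is represented simply by
an inode that looks clean while a delwri flush still holds its flush lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory inode state for this model. */
struct model_inode {
    bool looks_clean;  /* result of the clean check made without the lock */
    bool flush_locked; /* a delwri flush still holds the flush lock */
    bool stale;        /* analogue of XFS_ISTALE */
};

/* Old ordering: trust the unlocked clean check and skip the inode. */
void model_scan_check_first(struct model_inode *inodes, size_t n)
{
    assert(inodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (inodes[i].looks_clean)
            continue;           /* racy: may still be under flush */
        inodes[i].stale = true;
    }
}

/* Ordering from the commit message: mark stale first, check afterwards. */
void model_scan_stale_first(struct model_inode *inodes, size_t n)
{
    assert(inodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        inodes[i].stale = true;
        /* the proper clean check under the flush lock is omitted here */
    }
}

/* Count inodes that could still be written back after the scan. */
size_t model_count_unstaled(const struct model_inode *inodes, size_t n)
{
    size_t missed = 0;

    assert(inodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!inodes[i].stale)
            missed++;
    return missed;
}
```

With one inode that looks clean but is still under flush, the first scan
leaves it unmarked while the second marks every inode it finds.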
The second is more complex, and makes the first problem easier to hit.
Basically the in-memory inode scan is done with full knowledge it can be
racing with inode flushing and AIL tail pushing, which means that inodes
that it can't get the flush lock on might not be attached to the buffer
after the in-memory inode scan due to IO completion occurring. This is
actually documented in the code as "needs better interlocking", i.e. this
is a zero-day bug.
Effectively, the in-memory scan must be done while the inode buffer is
locked and IO cannot be issued on it while we do the in-memory inode scan.
This ensures that inodes we couldn't get the flush lock on are guaranteed
to be attached to the cluster buffer, so we can then catch all in-memory
inodes and mark them stale.
Now that the inode cluster buffer is locked before the in-memory scan is
done, there is no need for the two-phase update of the in-memory inodes, so
simplify the code into two loops and remove the allocation of the temporary
buffer used to hold locked inodes across the phases.
Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
-----------------------------------------------------------------------
Summary of changes:
 fs/xfs/linux-2.6/xfs_aops.c |   15 +++++
 fs/xfs/xfs_iget.c           |   26 +++-----
 fs/xfs/xfs_inode.c          |  142 +++++++++++++++++++------------------------
 3 files changed, 87 insertions(+), 96 deletions(-)
hooks/post-receive
--
XFS development tree