On Mon, Sep 30, 2013 at 09:37:05AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> Removing an inode from the namespace involves removing the directory
> entry and dropping the link count on the inode. Removing the
> directory entry can result in locking an AGF (directory blocks were
> freed) and removing a link count can result in placing the inode on
> an unlinked list which results in locking an AGI.
> The big problem here is that we have an ordering constraint on AGF
> and AGI locking - inode allocation locks the AGI, then can allocate
> a new extent for new inodes, locking the AGF after the AGI.
> Similarly, freeing the inode removes the inode from the unlinked
> list, requiring that we lock the AGI first, and then freeing the
> inode can result in an inode chunk being freed and hence freeing
> disk space requiring that we lock an AGF.
> Hence the ordering that is imposed by other parts of the code is AGI
> before AGF. This means we cannot remove the directory entry before
> we drop the inode reference count and put it on the unlinked list as
> this results in a lock order of AGF then AGI, and this can deadlock
> against inode allocation and freeing. Therefore we must drop the
> link counts before we remove the directory entry.
> This is still safe from a transactional point of view - it is not
> until we get to xfs_bmap_finish() that we have the possibility of
> multiple transactions in this operation. Hence as long as we remove
> the directory entry and drop the link count in the first transaction
> of the remove operation, there are no transactional constraints on
> the ordering here.
> Change the ordering of the operations in the xfs_remove() function
> to align the ordering of AGI and AGF locking to match that of the
> rest of the code.
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Hmmm.. I'm not quite comfortable with this one yet... It'll probably be
better after some more review. Did you happen to have a test case to go with