To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 01/15] xfs: xfs_remove deadlocks due to inverted AGF vs AGI lock ordering
From: Ben Myers <bpm@xxxxxxx>
Date: Wed, 30 Oct 2013 17:39:04 -0500
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1383045118-31107-2-git-send-email-david@xxxxxxxxxxxxx>
References: <1383045118-31107-1-git-send-email-david@xxxxxxxxxxxxx> <1383045118-31107-2-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Oct 29, 2013 at 10:11:44PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Removing an inode from the namespace involves removing the directory
> entry and dropping the link count on the inode. Removing the
> directory entry can result in locking an AGF (directory blocks were
> freed) and removing a link count can result in placing the inode on
> an unlinked list which results in locking an AGI.
> 
> The big problem here is that we have an ordering constraint on AGF
> and AGI locking - inode allocation locks the AGI, then can allocate
> a new extent for new inodes, locking the AGF after the AGI.
> Similarly, freeing the inode removes the inode from the unlinked
> list, requiring that we lock the AGI first, and then freeing the
> inode can result in an inode chunk being freed and hence freeing
> disk space requiring that we lock an AGF.
> 
> Hence the ordering that is imposed by other parts of the code is AGI
> before AGF. This means we cannot remove the directory entry before
> we drop the inode reference count and put it on the unlinked list as
> this results in a lock order of AGF then AGI, and this can deadlock
> against inode allocation and freeing. Therefore we must drop the
> link counts before we remove the directory entry.
> 
> This is still safe from a transactional point of view - it is not
> until we get to xfs_bmap_finish() that we have the possibility of
> multiple transactions in this operation. Hence as long as we remove
> the directory entry and drop the link count in the first transaction
> of the remove operation, there are no transactional constraints on
> the ordering here.
> 
> Change the ordering of the operations in the xfs_remove() function
> to align the ordering of AGI and AGF locking to match that of the
> rest of the code.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

These two codepaths look plausible for the deadlock you described:

inode allocation locking:
xfs_create
  xfs_dir_ialloc
    xfs_ialloc
      xfs_dialloc
        xfs_ialloc_read_agi              * takes agi
        xfs_ialloc_ag_alloc
          xfs_alloc_vextent
            xfs_alloc_fix_freelist
              xfs_alloc_read_agf         * takes agf

vs. xfs_remove locking, before this patch:

xfs_remove
  xfs_dir_removename
    xfs_dir2_node_removename
      xfs_dir2_leafn_remove
        xfs_dir2_shrink_inode
          xfs_bunmapi
          . xfs_bmap_del_extent
          .   xfs_btree_delete
          .     xfs_btree_delrec
          .       .free_block
          .         xfs_bmbt_free_block
          .           xfs_bmap_add_free  * adds to free list, doesn't take agf
            xfs_bmap_extents_to_btree
              xfs_alloc_vextent          * takes agf
  xfs_droplink
    xfs_iunlink
      xfs_read_agi                       * takes agi

I expected to find an agf lock somewhere under .free_block, but I didn't.  It
does look like we'll take the agf if we have to convert between directory
formats in xfs_dir2_leafn_remove, though, and there look to be a few more
opportunities to take the agf in xfs_bunmapi.
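
As a toy illustration of the inversion, here is a small userspace sketch.  It
is not kernel code: the arrays and the helpers takes_before() and
can_deadlock() are made up for this example, and the lock sequences are just
the agi/agf order read off the two traces above, plus the reordered xfs_remove
that the patch description asks for.

```c
#include <assert.h>
#include <stddef.h>

/* The two AG header locks discussed in this thread. */
enum ag_lock { AGI, AGF };

/* Order in which each path takes the locks (hypothetical model). */
const enum ag_lock create_path[] = { AGI, AGF }; /* xfs_create                  */
const enum ag_lock remove_old[]  = { AGF, AGI }; /* removename, then droplink   */
const enum ag_lock remove_new[]  = { AGI, AGF }; /* droplink, then removename   */

/* Return 1 if seq takes 'first' at some point before it takes 'second'. */
int takes_before(const enum ag_lock *seq, size_t n,
                 enum ag_lock first, enum ag_lock second)
{
    size_t i;
    int seen_first = 0;

    assert(seq != NULL);
    for (i = 0; i < n; i++) {
        if (seq[i] == first)
            seen_first = 1;
        else if (seq[i] == second && seen_first)
            return 1;
    }
    return 0;
}

/* Return 1 if paths a and b take AGI and AGF in opposite orders (ABBA). */
int can_deadlock(const enum ag_lock *a, size_t na,
                 const enum ag_lock *b, size_t nb)
{
    return (takes_before(a, na, AGI, AGF) && takes_before(b, nb, AGF, AGI)) ||
           (takes_before(a, na, AGF, AGI) && takes_before(b, nb, AGI, AGF));
}
```

In this model can_deadlock(create_path, 2, remove_old, 2) returns 1, while
can_deadlock(create_path, 2, remove_new, 2) returns 0, which is the point of
dropping the link count before removing the directory entry.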

Looks good.

Reviewed-by: Ben Myers <bpm@xxxxxxx>
