xfs
[Top] [All Lists]

Re: [PATCH v3] xfs: free the efi AIL entry on log recovery failure

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH v3] xfs: free the efi AIL entry on log recovery failure
From: Mark Tinguely <tinguely@xxxxxxx>
Date: Mon, 23 Dec 2013 10:42:25 -0600
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20131211113126.GA6248@xxxxxxxxxxxxx>
References: <20131206212037.560711585@xxxxxxx> <20131208005224.696001432@xxxxxxx> <20131211113126.GA6248@xxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120122 Thunderbird/9.0
Sorry it took me so long to get to this.

I like the idea to keep the majority of EFI items off the AIL, but it does not simply the EFI recovery processing. xfs_recover_process_efi() will place the EFI entry on the AIL and recreate the hanging on unmount problem when there is an error in the extent free or transaction commit.

Since the patch removes the XFS_EFI_RECOVERED flag/code, it is harder to remove this entry. If you leave the XFS_EFI_RECOVERED flag/code in XFS, the patch to avoid a hang in xfs_ail_push_all_sync() will be the same as before.

As you commented, it does leak some memory, which can be easily fixed...

On 12/11/13 05:31, Christoph Hellwig wrote

Use a cancellation table, similar to how we handle buffers instead of
abusing the AIL during recovery.

Signed-off-by: Christoph Hellwig<hch@xxxxxx>
---

...

@@ -3059,82 +3065,50 @@ xlog_recover_efi_pass2(
        struct xlog_recover_item        *item,
        xfs_lsn_t                       lsn)
  {
-       int                     error;
-       xfs_mount_t             *mp = log->l_mp;
-       xfs_efi_log_item_t      *efip;
-       xfs_efi_log_format_t    *efi_formatp;
+       struct xfs_efi_cancel           *ecp;
+       int                             error;
+       xfs_mount_t                     *mp = log->l_mp;
+       xfs_efi_log_format_t            *efi_formatp;

typedef got leaked back in.

...

@@ -3618,27 +3592,26 @@ xlog_recover_process_data(
  }

  /*
- * Process an extent free intent item that was recovered from
- * the log.  We need to free the extents that it describes.
+ * Process an extent free intent item that was recovered from the log.
+ * We need to free the extents that it describes.
   */
  STATIC int
  xlog_recover_process_efi(
-       xfs_mount_t             *mp,
-       xfs_efi_log_item_t      *efip)
+       struct xlog             *log,
+       struct xfs_efi_cancel   *ecp)
  {
-       xfs_efd_log_item_t      *efdp;
-       xfs_trans_t             *tp;
-       int                     i;
-       int                     error = 0;
-       xfs_extent_t            *extp;
+       struct xfs_mount        *mp = log->l_mp;
+       struct xfs_efi_log_item *efip = ecp->ec_efip;
+       struct xfs_efd_log_item *efdp;
+       struct xfs_trans        *tp;
+       struct xfs_extent       *extp;
        xfs_fsblock_t           startblock_fsb;
-
-       ASSERT(!test_bit(XFS_EFI_RECOVERED,&efip->efi_flags));
+       int                     error = 0;
+       int                     i;

        /*
-        * First check the validity of the extents described by the
-        * EFI.  If any are bad, then assume that all are bad and
-        * just toss the EFI.
+        * First check the validity of the extents described by the EFI.
+        * If any are bad, then assume that all are bad and just toss the EFI.
         */
        for (i = 0; i<  efip->efi_format.efi_nextents; i++) {
                extp =&(efip->efi_format.efi_extents[i]);
@@ -3652,12 +3625,14 @@ xlog_recover_process_efi(
                         * This will pull the EFI from the AIL and
                         * free the memory associated with it.
                         */
-                       set_bit(XFS_EFI_RECOVERED,&efip->efi_flags);
                        xfs_efi_release(efip, efip->efi_format.efi_nextents);
                        return XFS_ERROR(EIO);
                }
        }

Soo at this point we are dealing with the EFIs that do not have matching EFD. The EFI are in the new l_efi_cancel and not on the AIL. I am correct in thinking that there is no reason to do a xfs_efi_release() above?

+       spin_lock(&log->l_ailp->xa_lock);
+       xfs_trans_ail_update(log->l_ailp,&efip->efi_item, ecp->ec_lsn);

we now added the EFI to the AIL. If anything fails in the xfs_free_extent()/xfs_trans_commit then EFI has to be removed which is now trickier without the XFS_EFI_RECOVERED flag.

I suppose we could:

 1) leave the XFS_EFI_RECOVERED counter/code.
 2) manually decrease the counter. *
 3) manually remove the entry from the AIL. *
* terrible hack that everyone will NACK.

- */
  STATIC int
  xlog_recover_process_efis(
        struct xlog     *log)
  {
...

-               spin_lock(&ailp->xa_lock);
+       list_for_each_entry_safe(ecp, next,&log->l_efi_cancel, ec_list) {
+               list_del(&ecp->ec_list);
+               error = xlog_recover_process_efi(log, ecp);
                if (error)
-                       goto out;
-               lip = xfs_trans_ail_cursor_next(ailp,&cur);
+                       break; /* XXX: will leak remaining EFIs.. */

...rather than leaking the rest of the efi cancel entries:

                /* comment here about removing entries on error */
                if (!error)
                        error = xlog_recover_process_efi(log, ecp);

Again, sorry for the delay. I will wait to see what you want to do with this patch before resubmitting the patch to free EFI from the AIL.

--Mark.



<Prev in Thread] Current Thread [Next in Thread>