
Re: [PATCH 2/2] xfs: fix efi item leak on forced shutdown

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH 2/2] xfs: fix efi item leak on forced shutdown
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 19 Jan 2011 10:33:46 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20110118124625.GB12516@xxxxxxxxxxxxx>
References: <1295010430-12495-1-git-send-email-david@xxxxxxxxxxxxx> <1295010430-12495-3-git-send-email-david@xxxxxxxxxxxxx> <20110118124625.GB12516@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Jan 18, 2011 at 07:46:25AM -0500, Christoph Hellwig wrote:
> On Sat, Jan 15, 2011 at 12:07:10AM +1100, Dave Chinner wrote:
> > The cause of the leak is that the "remove" parameter of IOP_UNPIN()
> > is never set when a CIL push is aborted. This means that the EFI
> > item is never freed if it was in the push being cancelled. The
> > problem is specific to delayed logging.
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_trans.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > index f80a067..e66ce5e 100644
> > --- a/fs/xfs/xfs_trans.c
> > +++ b/fs/xfs/xfs_trans.c
> > @@ -1472,6 +1472,16 @@ xfs_trans_committed_bulk(
> >             if (XFS_LSN_CMP(item_lsn, (xfs_lsn_t)-1) == 0)
> >                     continue;
> >  
> > +           /*
> > +            * if we are aborting the operation, no point in inserting the
> > +            * object into the AIL as we areee in a shutdown situation.
> 
> that's a few 'e' too much.
> 
> > +            */
> > +           if (aborted) {
> > +                   ASSERT(XFS_FORCED_SHUTDOWN(ailp->xa_mount));
> > +                   IOP_UNPIN(lip, aborted);
> > +                   continue;
> > +           }
> 
> Hmm, this is not symmetric with the non-delaylog path.
> xfs_trans_item_committed never sets the remove flag to IOP_UNPIN,
> even if the transaction commit was aborted.

Right, because the delaylog and non-delaylog paths are not symmetric
w.r.t. log write failures.

> It seems like the CIL code is missing an equivalent to
> xfs_trans_uncommit for the case that xfs_log_write or xfs_log_done
> fail.

There isn't an equivalent. In the delaylog case, we don't have a
transaction to "uncommit" when a log write failure occurs - we are
aborting the checkpoint of the CIL, not a transaction. As the items
have already gone through IOP_COMMITTING and IOP_UNLOCK, we have to
treat the failures like they came from the log IO completion
handler.
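
To make the delaylog abort case concrete, here is a rough, self-contained
model of it. This is only a sketch under stated assumptions: model_item,
model_unpin() and model_committed_bulk() are hypothetical stand-ins, not the
real xfs_log_item, IOP_UNPIN() or xfs_trans_committed_bulk(), and the
"patched" flag just selects between the behaviour before and after the
change under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a log item (not xfs_log_item). */
struct model_item {
	int  pin_count;	/* pins held on behalf of the CIL checkpoint */
	bool freed;	/* set once the item has been released */
	bool in_ail;	/* set if the item was inserted into the AIL */
};

/* Models IOP_UNPIN(lip, remove): drop a pin, free only when asked to. */
static void model_unpin(struct model_item *ip, bool do_remove)
{
	assert(ip->pin_count > 0);
	ip->pin_count--;
	if (do_remove && ip->pin_count == 0)
		ip->freed = true;
}

/*
 * Models the per-item loop of a bulk commit. Returns the number of items
 * that are still allocated after an aborted checkpoint, i.e. the leak.
 */
static size_t model_committed_bulk(struct model_item *items, size_t n,
				   bool aborted, bool patched)
{
	size_t leaked = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_item *ip = &items[i];

		if (aborted && patched) {
			/* Patched path: skip the AIL, unpin with remove set. */
			model_unpin(ip, true);
		} else {
			/* Unpatched or normal path: remove is never set. */
			ip->in_ail = true;
			model_unpin(ip, false);
		}
		if (aborted && !ip->freed)
			leaked++;
	}
	return leaked;
}
```

In this model an aborted checkpoint leaks every item unless the abort
state is passed through to the unpin call, which is the only place left
that can release the item once the transaction itself is gone.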

In the non-delaylog case, neither IOP_COMMITTING nor IOP_UNLOCK has
been called on the items when xfs_log_write() fails. They are still
linked into the xfs_trans structure, so they can be handled by
xfs_trans_uncommit(), which simply needs to walk the items in the
transaction, call IOP_UNPIN(lip, abort) and IOP_UNLOCK on each, and
free them.
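
As a rough illustration of that walk, the following toy model unpins,
unlocks and frees every item still linked to a transaction. It is a
sketch only: toy_item, toy_trans, toy_item_add() and toy_trans_uncommit()
are hypothetical names invented for the example and do not reflect the
real structures or the real xfs_trans_uncommit() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy types (not the real xfs_log_item / xfs_trans). */
struct toy_item {
	struct toy_item *next;
	int  pin_count;
	bool locked;
	bool aborted;	/* records the abort flag seen at unpin time */
};

struct toy_trans {
	struct toy_item *items;	/* items still linked to the transaction */
};

/* Link a new pinned, locked item into the transaction. */
static struct toy_item *toy_item_add(struct toy_trans *tp)
{
	struct toy_item *ip = calloc(1, sizeof(*ip));

	if (!ip)
		return NULL;
	ip->pin_count = 1;
	ip->locked = true;
	ip->next = tp->items;
	tp->items = ip;
	return ip;
}

/* Models IOP_UNPIN(lip, abort): drop a pin and note the abort state. */
static void toy_unpin(struct toy_item *ip, bool aborted)
{
	assert(ip->pin_count > 0);
	ip->pin_count--;
	ip->aborted = aborted;
}

/* Walk the transaction: unpin, unlock and free each item. */
static size_t toy_trans_uncommit(struct toy_trans *tp, bool aborted)
{
	size_t released = 0;

	while (tp->items) {
		struct toy_item *ip = tp->items;

		tp->items = ip->next;
		toy_unpin(ip, aborted);
		ip->locked = false;	/* models IOP_UNLOCK */
		free(ip);
		released++;
	}
	return released;
}
```

The walk is only possible because the items are still reachable from the
transaction; once they have been handed over to the CIL there is no such
list to walk, which is why the delaylog path has to do its cleanup at
unpin time instead.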

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
