On 03/12/14 18:06, Dave Chinner wrote:
On Wed, Mar 12, 2014 at 04:43:26PM -0500, Mark Tinguely wrote:
On 03/12/14 15:14, Mears, Morgan wrote:
I was unable to umount, even with -f; failed with EBUSY and couldn't unbusy
as the fs was unresponsive (and happens to contain the Oracle management
tools necessary to close all open descriptors). Accordingly I rebooted.
You are the second person in 2-3 weeks to hit this unmount issue.
Unmatched EFI in the AIL keeps the unmount from completing.
Jeff, are you still looking at this?
I'd say the answer is no. Last time you pointed out this problem I
asked you to provide patches to fix the problem, Mark. Can
you please provide patches to fix this, Mark?
Ah, you wanted me to fix the cil_push error issue that leaks ctx and
does not wake up the waiters. This is only a side issue to that.
We can easily patch the xfs_bmap_finish() and
xlog_recover_process_efis() code. I have that patch and have tested it. But
it covers neither a CIL push error nor the case where xfs_bmap_finish()
succeeds and the EFI is in the AIL but the EFD is discarded.
The most correct thing to do is clear the EFI from the AIL in the abort
paths of xfs_efd_item_committed() and xfs_efd_item_unlock(), but those
will be called many times and would be overkill.
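
To make the abort-path idea concrete, here is a toy user-space sketch. It is
not kernel code: every toy_* name is hypothetical, and the AIL is reduced to
a counter. It only models the rule that an aborted EFD must still release its
paired EFI, otherwise the AIL never drains and unmount cannot finish.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's EFI/EFD/AIL structures. */
struct toy_efi {
	bool in_ail;		/* true while this intent is tracked in the AIL */
};

struct toy_efd {
	struct toy_efi *efi;	/* the intent this "done" item pairs with */
};

struct toy_ail {
	size_t nitems;		/* items still tracked; unmount waits for 0 */
};

void toy_ail_insert(struct toy_ail *ail, struct toy_efi *efi)
{
	if (!efi->in_ail) {
		efi->in_ail = true;
		ail->nitems++;
	}
}

void toy_ail_remove(struct toy_ail *ail, struct toy_efi *efi)
{
	if (efi->in_ail) {
		efi->in_ail = false;
		ail->nitems--;
	}
}

/*
 * Hypothetical stand-in for the EFD commit/unlock handling.
 * release_on_abort == false models the leak: an aborted EFD is
 * dropped and its EFI stays in the AIL.
 */
void toy_efd_finish(struct toy_ail *ail, struct toy_efd *efd,
		    bool aborted, bool release_on_abort)
{
	if (!aborted || release_on_abort)
		toy_ail_remove(ail, efd->efi);
}

/* Unmount can only complete once nothing is left in the AIL. */
bool toy_unmount_can_complete(const struct toy_ail *ail)
{
	return ail->nitems == 0;
}
```

With release_on_abort false, an aborted EFD leaves the counter at 1 and
toy_unmount_can_complete() stays false, which is the hang in miniature.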
A less correct but easier approach would be to clear the EFIs from the AIL
once in xfs_unmountfs() after the last force of the log and before the
xfs_ail_push_all_sync(). Since the EFIs are removed very late, we
don't have to special case the removal in xfs_bmap_finish() and the
xlog_recover_process_efis(). This is why I was waiting to see what Jeff
wanted to do.
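
A rough sketch of the unmount-time alternative, again as a toy user-space
model rather than kernel code. The toy_* names, the fixed-size array and the
item types are all assumptions for illustration; the real AIL is not an
array. The point is only the shape: one pass that drops every remaining EFI
before the final push of whatever is left.

```c
#include <stddef.h>

/* Toy model only: not the kernel's log item or AIL definitions. */
enum toy_item_type { TOY_ITEM_EFI, TOY_ITEM_OTHER };

struct toy_log_item {
	enum toy_item_type type;
};

#define TOY_AIL_MAX 16

struct toy_ail_list {
	struct toy_log_item *items[TOY_AIL_MAX];
	size_t count;
};

/* Append an item; returns 0 on success, -1 if the toy list is full. */
int toy_ail_add(struct toy_ail_list *ail, struct toy_log_item *lip)
{
	if (ail->count >= TOY_AIL_MAX)
		return -1;
	ail->items[ail->count++] = lip;
	return 0;
}

/*
 * Single sweep: remove every EFI, keep all other items in their
 * original order. Returns the number of EFIs removed.
 */
size_t toy_ail_clear_efis(struct toy_ail_list *ail)
{
	size_t kept = 0, removed = 0;

	for (size_t i = 0; i < ail->count; i++) {
		if (ail->items[i]->type == TOY_ITEM_EFI)
			removed++;
		else
			ail->items[kept++] = ail->items[i];
	}
	ail->count = kept;
	return removed;
}
```

Because the sweep runs once and late, the earlier error paths would not each
need their own removal logic in this model.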
If I hear no strong objection, I intend to clear the EFI from the AIL
in each situation: the abort cases of the EFD iop routines, in
xfs_bmap_finish(), and in xlog_recover_process_efis().