On Tue, Nov 06, 2012 at 07:59:49AM -0500, Christoph Hellwig wrote:
> On Sat, Nov 03, 2012 at 10:47:41AM +1100, Dave Chinner wrote:
> > I think that's irrelevant here - there will *never* be an IO waiter
> > at this point in time. This processing is in log buffer IO
> > completion context, so the buffers are still pinned in memory. Hence
> > anyone trying to do IO on it will be waiting in xfs_buf_wait_unpin()
> > and never get to xfs_buf_iowait(). And because xfs_buf_wait_unpin()
> > is called with the buffer lock held, we'll never do the failure
> > handling in xfs_buf_item_unpin until the buffer IO is completed and
> > it is unlocked.
> How do we manage to submit it synchronously then?
I don't follow what problem you are talking about here.
Fundamentally, races with IO are resolved like this regardless of
whether the racing Io is sync or async
<shutdown, no new buffers can get to xfs_buf_iorequest>
What this also points out is that we shoul dbe checking for shutdown
after xfs_buf_wait_unpin(), too, because otherwise we are submitting
IO after the shutdown is initiated....
> The inode and dquot
> reclaim xfs_bwrite calls already wait for an unpin first, so I don't
> think these are the problem. The only other call on a live fs seems
> to xfs_qm_shake -> xfs_buf_delwri_submit, but that one does wait
> for the complete() call on b_iowait. I suspect we are hitting that
> and due to it skipping the wait if b_ioerror is set and waiting on
> multiple buffers that complete together might hide the issue.
We are in a shutdown situation. xfs_buf_delwri_submit() goes via
xfs_bdstrat_cb() which will stop any new IOs from being submitted
via this path. If it is blocked on the above case, then it is
also resolved by the above case...
> __xfs_buf_delwri_submit for the wait == true case also seems to be
> the only place that actually skips the ispinned check.
Sure, but that we're in a shutdown situation, so it doesn't matter -
the buffer will never get to xfs_buf_wait_unpin() because of the
shutdown check in xfs_bdstrat_cb().
I still don't see the problem you are trying to explain to me. Maybe
I'm just being dense....