On Wed, Oct 31, 2012 at 12:09:37PM -0500, Mark Tinguely wrote:
> On 10/29/12 19:20, Dave Chinner wrote:
> >From: Dave Chinner<dchinner@xxxxxxxxxx>
> >When we shut down the filesystem, we have to unpin and free all the
> >buffers currently active in the CIL. To do this we unpin and remove
> >them in one operation as a result of a failed iclogbuf write. For
> >buffers, we do this removal via a simultated IO completion of after
> >marking the buffer stale.
> >At the time we do this, we have two references to the buffer - the
> >active LRU reference and the buf log item. The LRU reference is
> >removed by marking the buffer stale, and the active CIL reference is
> >by the xfs_buf_iodone() callback that is run by
> >xfs_buf_do_callbacks() during ioend processing (via the bp->b_iodone
> >However, ioend processing requires one more reference - that of the
> >IO that it is completing. We don't have this reference, so we free
> >the buffer prematurely and use it after it is freed. This leads to
> >assert failures in xfs_buf_rele() on debug kernels because the
> >b_hold count is zero.
> >Fix this by making sure we take the necessary IO reference before
> >starting IO completion processing on the stale buffer.
> >Signed-off-by: Dave Chinner<dchinner@xxxxxxxxxx>
> This seems to take care of one of the ASSERT that I experienced after
> the worker move series.
It should - it's the shutdown failure you've reported for some time
> With this patch applied, there is a new ASSERT that the perag is not
> empty in filesystem unmount in test 179. I think this is related to
> the worker move series and not this patch. I will send the information
> in a different thread.
That's been around for a long, long time. I've never been able to
reproduce it reliably - I see it maybe once every couple of months -
so I've never been able to get to the bottom of it....
> Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>