On Wed, Mar 23, 2011 at 10:47:28AM +1100, Dave Chinner wrote:
> BUG_ON() inside a spinlock will effectively lock up the machine as
> other CPUs try to take the spinlock that was held by the thread that
> went bad. It causes the machine to die a horrible, undebuggable death
> rather than just kill the thread that caused the oops and leaving
> everything else still functional.
>
> If it were an ASSERT(), that's probably OK because it is only debug
> systems that woul dhave this problem, but the use of BUG(_ON) means
> it will cause production systems to die as well.
I've changed it to asserts.
> > + * We would have to split the busy extent to be able
> > + * to track it correct, which we cannot do because we
> > + * would have to modify the list of busy extents
> > + * attached to the transaction/CIL context, which
> > + * is mutable.
>
> immutable?
Yes.
> > + *
> > + * Force out the log to clear the busy extents
> > + * and retry the search.
> > + */
>
> BTW, these comments appear to wrap at 68 columns - any particular
> reason?
They used to sit one more tab to the right and I didn't fix them up
when splitting the function. I've rewrapped them now.
> > + overlap = xfs_alloc_busy_try_reuse(pag, busyp,
> > + fbno, fbno + flen);
> > + if (overlap) {
> > + spin_unlock(&pag->pagb_lock);
> > + xfs_log_force(tp->t_mountp, XFS_LOG_SYNC);
> > + goto restart;
> > + }
>
> xfs_alloc_busy_try_reuse() only ever returns -1 or 1, which means
> this will always trigger a log force on overlap....
At least for now, yes. In the next patch we will treat -1 vs 1
differently. Maybe it should be changed to 0 vs 1, but even that
isn't quite intuitive. I'll think about it a bit more.
|