On Thu, Dec 01, 2011 at 01:51:28PM -0600, Ben Myers wrote:
> Process A reads from the grant reserve head at 2641 (and there currently is
> enough space)
> Process B wakes at either 2646 or 2650, in xlog_reserveq_wait, locks, and
> reads from the grant reserve head (and currently there is enough space)
> Process B removes itself from the list
> Process A reads from the reservq list and finds it to be empty
> Process A finds that there was enough space at 2646
> Process B returns from xlog_reserveq_wait, unlocks, grants space at 2656,
> Process A grants log space at 2656, and returns
> AFAICS there is nothing that prevents these guys from granting the same
> space when you approach free_bytes >= need_bytes concurrently.
> This lockless stuff is always a mind job for me. I'll take another look at
> some of the other aspects of the patch. Even if it doesn't resolve my
> question about the lockless issue, it seems to resolve Chandra's race.
Indeed, I think we have this race. Then again, I think we had
exactly the same one before, too. The only way to fix it would be
to do a sort of double cmpxchg that only moves the grant head forward
if it's still in available space vs the tail LSN.
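
To make the idea a bit more concrete, here is a rough userspace sketch,
not the actual XFS code. It assumes a toy model in which the grant head
and the tail are plain monotonic byte counters rather than packed
cycle/offset values, and struct toy_log and toy_grant() are made-up
names for illustration. The only point is that the space check is redone
inside the cmpxchg loop, so two racing callers cannot both be granted
the same bytes:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; this is NOT the real struct xlog. */
struct toy_log {
	uint64_t		size;	/* log size in bytes */
	_Atomic uint64_t	head;	/* total bytes ever granted */
	_Atomic uint64_t	tail;	/* total bytes ever freed */
};

/*
 * Try to grant 'need' bytes.  The head only moves if, for the head value
 * we are replacing, there is still enough room between head and tail.
 * Returns false if there is not enough space (the caller would sleep).
 */
static bool toy_grant(struct toy_log *log, uint64_t need)
{
	uint64_t old = atomic_load(&log->head);

	for (;;) {
		/* Read the tail after the head value we will compare on. */
		uint64_t tail = atomic_load(&log->tail);

		if (tail > old) {
			/* Our head snapshot is stale; take a fresh one. */
			old = atomic_load(&log->head);
			continue;
		}

		if (log->size - (old - tail) < need)
			return false;

		/* Succeeds only if nobody moved the head since we read it. */
		if (atomic_compare_exchange_weak(&log->head, &old, old + need))
			return true;

		/* 'old' now holds the current head; redo the space check. */
	}
}
```

In this toy model a single cmpxchg on the head is enough, because the
tail only ever moves forward and a stale tail value can only
underestimate the free space. Whether the same argument carries over to
the real grant heads and tail LSN would need to be checked against the
actual code.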