On Thu, May 08, 2014 at 09:14:31PM -0500, Eric Sandeen wrote:
> On 5/8/14, 8:17 PM, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > The way discontiguous buffers are currently handled in prefetch is
> > by unlocking the prefetch tree and reading them one at a time in
> > pf_read_discontig(), inside the normal loop of searching for buffers
> > to read in a more optimized fashion.
> > But by unlocking the tree, we allow other threads to come in and
> > find buffers which we've already stashed locally on our bplist.
> > If 2 threads think they own the same set of buffers, they may both
> > try to delete them from the prefetch btree, and the second one to
> > arrive will not find it, resulting in:
> >
> >     fatal error -- prefetch corruption
> >
> > To fix this, simply abort the buffer gathering loop when we come
> > across a discontiguous buffer, process the gathered list as per
> > normal, and then after running the large optimised read, check to
> > see if the last buffer on the list is a discontiguous buffer.
> > If it is discontiguous, then issue the discontiguous buffer read
> > while the locks are not held. We only ever have one discontiguous
> > buffer per read loop, so it is safe just to check the last buffer in
> > the list.
> > The fix is loosely based on a patch provided by Eric Sandeen, who
> > did all the hard work of finding the bug and demonstrating how to
> > fix it.
> Ok, this makes sense to me. The comment above the discontig read
> seems a bit confusing; you say it's safe to read while unlocked,
> but I wouldn't have expected it not to be - the lock is just for
> btree manipulation, and that's not being done. So I think the
> comment adds a little confusion rather than clarification.
Ok, I'll just drop the bit about it being safe to read - the bit
about being the last buffer on the list is the important bit...
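
For anyone following along, the control flow described in the commit
message can be sketched roughly as below. This is only an illustration,
not the actual xfsprogs code: struct sketch_buf, sketch_gather() and
sketch_trailing_discontig() are hypothetical names made up for this
example, and the locking is indicated in comments only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a prefetch buffer; not the xfsprogs struct. */
struct sketch_buf {
	int	id;
	bool	discontig;	/* buffer maps more than one disk extent */
};

/*
 * Gather up to max buffers from queue[] into list[]. In the real code
 * this would run with the prefetch tree lock held. The loop is aborted
 * as soon as a discontiguous buffer has been added, so at most one such
 * buffer is gathered and it is always the last entry on the list.
 */
size_t
sketch_gather(const struct sketch_buf *queue, size_t nqueue,
	      const struct sketch_buf **list, size_t max)
{
	size_t	n = 0;

	assert(queue != NULL && list != NULL);
	while (n < nqueue && n < max) {
		list[n] = &queue[n];
		n++;
		if (list[n - 1]->discontig)
			break;	/* stop gathering, process what we have */
	}
	return n;
}

/*
 * Called after the large optimised read, with no locks held. Only the
 * last entry needs checking because the gather loop stops at the first
 * discontiguous buffer. Returns that buffer, or NULL if there is none.
 */
const struct sketch_buf *
sketch_trailing_discontig(const struct sketch_buf **list, size_t n)
{
	if (n == 0)
		return NULL;
	return list[n - 1]->discontig ? list[n - 1] : NULL;
}
```

The point of the sketch is that the tree lock is never dropped in the
middle of gathering, so no other thread can find buffers that are
already sitting on the local list.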