xfs
[Top] [All Lists]

Re: [PATCH] xfs_repair: don't unlock prefetch tree to read discontig buf

To: Eric Sandeen <sandeen@xxxxxxxxxx>
Subject: Re: [PATCH] xfs_repair: don't unlock prefetch tree to read discontig buffers
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 8 May 2014 11:42:15 +1000
Cc: xfs-oss <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <536AC1CB.8060601@xxxxxxxxxx>
References: <536AC1CB.8060601@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, May 07, 2014 at 06:29:15PM -0500, Eric Sandeen wrote:
> The way discontiguous buffers are currently handled in
> prefetch is by unlocking the prefetch tree and reading
> them one at a time in pf_read_discontig(), inside the
> normal loop of searching for buffers to read in a more
> optimized fashion.
> 
> But by unlocking the tree, we allow other threads to come
> in and find buffers which we've already stashed locally
> on our bplist[].  If 2 threads think they own the same
> set of buffers, they may both try to delete them from
> the prefetch btree, and the second one to arrive will not
> find it, resulting in:
> 
>       fatal error -- prefetch corruption
> 
> Fix this by maintaining 2 lists; the original bplist,
> and a new one containing only discontiguous buffers.
> 
> The original list can be seek-optimized as before,
> and the discontiguous list can be read one by one
> before we do the seek-optimized reads, after all of the
> tree manipulation has been completed.

Nice job finding the problem, Eric! It looks like your patch solves
the problem, but after considering this approach for a while I think
it's overkill. ;)

What the loop is trying to do is linearise all the IO and turn lots
of small IO into a single large IO, so if we grab all the discontig
buffers in the range, then do IO on them, then do the large IO, we
are effectively seeking all over that range, including backwards.
This is exactly the sort of problem the prefetch loop is trying to
avoid.

So what I think is best is that we simply abort the pulling of new
buffers off the list when we hit a discontiguous buffer. Leave the
discontig buffer as the last on the list, and process the list as
per normal. Remove all the remaining buffers from the btree, then
drop the lock and do the pread64 call.

Then, check the last buffer on the bplist - if it's the discontig
buffer (i.e. wasn't dropped during list processing), then issue the
discontig buffer IO. It should at least start as either sequential I
oor with a small forwards seek, so so shoul be as close to seek
optimised as we can get for such buffers. Then it can be removed
from the bplist, num decremented, the lock picked back up and the
large buffer read in via pread64() can be sliced and diced
appropriately...

i.e. much less code, no need for a separate list, and the seeks
shoul dbe minimised as much as possible....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>