
Re: [PATCH] xfs_repair: fix prefetch queue waiting

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [PATCH] xfs_repair: fix prefetch queue waiting
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Tue, 8 Apr 2014 09:52:40 -0400
Cc: Eric Sandeen <sandeen@xxxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5343FB53.6030902@xxxxxxxxxxx>
References: <53436CA0.1090106@xxxxxxxxxx> <20140408125832.GA23051@xxxxxxxxxxxxxxx> <5343FB53.6030902@xxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Apr 08, 2014 at 08:36:19AM -0500, Eric Sandeen wrote:
> On 4/8/14, 7:58 AM, Brian Foster wrote:
> > On Mon, Apr 07, 2014 at 10:27:28PM -0500, Eric Sandeen wrote:
> >> This fixes a regression caused by:
> >>
> >> 97b1fcf xfs_repair: fix array overrun in do_inode_prefetch
> >>
> >> The thread creation loop has two ways to exit: either via
> >> the loop counter reaching thread_count, or via the break
> >> statement once we've started enough workers to cover all AGs.
> >>
> >> Whether or not the loop counter "i" reflects the number of
> >> threads started depends on whether or not we exited via the
> >> break.
> >>
> >> The above commit prevented us from indexing off the end
> >> of the queues[] array if we actually advanced "i" all the
> >> way to thread_count.  But in the case where we break, "i"
> >> is one *less* than the number of threads started, so we
> >> don't wait for completion of all threads, and all hell
> >> breaks loose in phase 5.
> >>
> >> Just stop with the cleverness of re-using the loop counter -
> >> instead, explicitly count threads that we start, and then use
> >> that counter to wait for each worker to complete.
> >>
> >> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> >> ---
> >>
> >> I have one fs which demonstrates the problem, and have verified
> >> the regression & tested the fix against that.
> >>
> >> I'll run this over xfstests overnight, but it seems obvious
> >> from here (OTOH the other fix seemed obvious too) :(
> >>
> >> diff --git a/repair/prefetch.c b/repair/prefetch.c
> >> index e47a48e..4c32395 100644
> >> --- a/repair/prefetch.c
> >> +++ b/repair/prefetch.c
> >> @@ -944,6 +944,7 @@ do_inode_prefetch(
> >>    int                     i;
> >>    struct work_queue       queue;
> >>    struct work_queue       *queues;
> >> +  int                     queues_started = 0;
> >>  
> >>    /*
> >>     * If the previous phases of repair have not overflowed the buffer
> >> @@ -987,6 +988,7 @@ do_inode_prefetch(
> >>  
> >>            create_work_queue(&queues[i], mp, 1);
> >>            queue_work(&queues[i], prefetch_ag_range_work, 0, wargs);
> >> +          queues_started++;
> >>  
> >>            if (wargs->end_ag >= mp->m_sb.sb_agcount)
> >>                    break;
> >> @@ -995,7 +997,7 @@ do_inode_prefetch(
> >>    /*
> >>     * wait for workers to complete
> >>     */
> >> -  while (i--)
> >> +  for (i = 0; i < queues_started; i++)
> >>            destroy_work_queue(&queues[i]);
> > 
> > Fix looks good, but any reason to reverse the order of the destroy loop?
> 
> simplicity? :)
> 

Fine by me. :)

Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>
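
For the archive, here's a minimal standalone sketch of the off-by-one
(a hypothetical toy program, not the actual repair code), showing why
"while (i--)" misses the last worker when the loop exits via break,
and how counting the started workers explicitly fixes the cleanup:

#include <stdio.h>

#define THREAD_COUNT	4	/* hypothetical values: more threads... */
#define AG_COUNT	3	/* ...than AGs, so we hit the break path */

int
main(void)
{
	int	i;
	int	started = 0;

	for (i = 0; i < THREAD_COUNT; i++) {
		printf("start worker %d\n", i);
		started++;
		if (i + 1 >= AG_COUNT)
			break;	/* all AGs covered; "i" is not incremented */
	}

	/*
	 * At this point i == 2 but started == 3.  The old cleanup,
	 * "while (i--)", would only visit workers 1 and 0, leaving
	 * worker 2 un-waited-for.  Counting starts avoids that:
	 */
	for (i = 0; i < started; i++)
		printf("wait for worker %d\n", i);

	return 0;
}

(Roughly the same shape as the loop in do_inode_prefetch(), with
prints standing in for the work queues.)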

> I don't think it matters operationally...
> 
> -Eric
>  
> 
