xfs
[Top] [All Lists]

Re: [PATCH] xfs_repair: fix prefetch queue waiting

To: Brian Foster <bfoster@xxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxx>
Subject: Re: [PATCH] xfs_repair: fix prefetch queue waiting
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Tue, 08 Apr 2014 08:36:19 -0500
Cc: xfs-oss <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140408125832.GA23051@xxxxxxxxxxxxxxx>
References: <53436CA0.1090106@xxxxxxxxxx> <20140408125832.GA23051@xxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
On 4/8/14, 7:58 AM, Brian Foster wrote:
> On Mon, Apr 07, 2014 at 10:27:28PM -0500, Eric Sandeen wrote:
>> This fixes a regression caused by:
>>
>> 97b1fcf xfs_repair: fix array overrun in do_inode_prefetch
>>
>> The thread creation loop has 2 ways to exit; either via
>> the loop counter based on thread_count, or the break statement
>> if we've started enough workers to cover all AGs.
>>
>> Whether or not the loop counter "i" reflects the number of
>> threads started depends on whether or not we exited via the
>> break.
>>
>> The above commit prevented us from indexing off the end
>> of the queues[] array if we actually advanced "i" all the
>> way to thread_count, but in the case where we break, "i"
>> is one *less* than the nr of threads started, so we don't
>> wait for completion of all threads, and all hell breaks
>> loose in phase 5.
>>
>> Just stop with the cleverness of re-using the loop counter -
>> instead, explicitly count threads that we start, and then use
>> that counter to wait for each worker to complete.
>>
>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
>> ---
>>
>> I have one fs which demonstrates the problem, and have verified
>> the regression & tested the fix against that.
>>
>> I'll run this over xfstests overnight, but it seems obvious
>> from here (OTOH the other fix seemed obvious too) :(
>>
>> diff --git a/repair/prefetch.c b/repair/prefetch.c
>> index e47a48e..4c32395 100644
>> --- a/repair/prefetch.c
>> +++ b/repair/prefetch.c
>> @@ -944,6 +944,7 @@ do_inode_prefetch(
>>      int                     i;
>>      struct work_queue       queue;
>>      struct work_queue       *queues;
>> +    int                     queues_started = 0;
>>  
>>      /*
>>       * If the previous phases of repair have not overflowed the buffer
>> @@ -987,6 +988,7 @@ do_inode_prefetch(
>>  
>>              create_work_queue(&queues[i], mp, 1);
>>              queue_work(&queues[i], prefetch_ag_range_work, 0, wargs);
>> +            queues_started++;
>>  
>>              if (wargs->end_ag >= mp->m_sb.sb_agcount)
>>                      break;
>> @@ -995,7 +997,7 @@ do_inode_prefetch(
>>      /*
>>       * wait for workers to complete
>>       */
>> -    while (i--)
>> +    for (i = 0; i < queues_started; i++)
>>              destroy_work_queue(&queues[i]);
> 
> Fix looks good, but any reason to reverse the order of the destroy loop?

simplicity? :)

I don't think it matters operationally...

-Eric
 

<Prev in Thread] Current Thread [Next in Thread>