Barry Naujok wrote:
AG stride testing was performed on a system with ample amounts
of memory, so prefetching with AG stride during Phase 4 was
missed. The attached patch fixes this.
32 AGs, ag_stride = 4:
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 4
- agno = 8
- agno = 12
- agno = 16
- agno = 20
- agno = 24
- agno = 28
which is correct... but in Phase 4:
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 5
- agno = 6
- agno = 4
- agno = 7
- agno = 3
Okay I replied to this in the bug but forgot about the review
email request so I will reply here too:)
==========================
ADDITIONAL INFORMATION (ADD)
From: timothy shimmin <tes@xxxxxxx>
Date: Aug 09 2007 10:59:01PM
[BugWorks mailnews processor v1.7.1]
==========================
bnaujok@xxxxxxx via BugWorks wrote:
>
> [ EXCESSIVE QUOTED TEXT DELETED ]
>
> - 12:00:16: setting up duplicate extent list - 32 of 32 allocation
groups done
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> - agno = 2
> - agno = 5
> - agno = 6
> - agno = 4
> - agno = 7
> - agno = 3
So you need to replicate the logic/code you used in phase 3 and put it
into phase4 I presume.
And the changes are in phase4.c/process_ags() (like in phase3.c/process_ags).
So previously it would iterate thru the thread-count and then
prefetch and queue up to the ag-count stepping by the ag-stride.
And now we go thru each AG but only do an ag-stride worth at a time for
a thread. So each thread will get an ag-stride worth of ags to deal with:
Fixed code:
Thread 0 - gets ag 0-3 (go thru a loop doing a fetch & queue work on ag 0-3)
Thread 1 - gets ag 4-7
Thread 2 - gets ag 8-11
etc...
Okay, starting to get it now :)
So previously (broken code) it would do:
Thread 0 - gets ag 0,4,8,12,16,...
Thread 1 - gets ag 1,5,9,13,17,...
Thread 2 - gets ag 2,6,10,14,18,...
I presume the bad phase 4 msgs above are coming out in a different order
than what I said above
just because of the threading. Or did I make a mistake in my reading
of the code.
Pity we couldn't share some code there between the phases but I haven't
looked to see the difficulties with that are - do tell me.
Would be nice to guarantee handling the striding/threading the same in both
phases by
both using the same iterator code - particularly if it needs
to be changed in the future etc. But whatever.
And I presume the fix is so that the prefetching for a thread is done in
ag-order (and they are just given a chunks worth)
instead of the old bad way where the prefetching would be done with a jump
over AGs by the stride instead of in order.
So it looks fine if my understanding is correct.
Cheers,
Tim.
|