On 07/19/2016 09:27 PM, Frank Ch. Eigler wrote:
Which initial work are you referring to?
[ ... ]
If you're referring to commit efc0173ad era stuff from three months
ago, which were merged into 3.11.3, the performance effects are
obvious by inspection, and borne out by QA (e.g., qa/4751) and of
course hand testing. (Concurrent pmNewContext PM_CONTEXT_HOST's no
longer serialized each other when contacting absent hosts.)
As previously stated on this list, that same test case indicated
instances where libpcp locking logic was incorrect, even before the
3.11.3 era changes, resulting in deadlocks. As previously stated,
4751 can hang/crash 3.11.2, 3.11.3, and git master pcp, depending on
invocation. In return, it was called "flakey" and commented out of
qa/group. (I hadn't found an invocation that could hang/crash the
fche/multithread branch libpcp though.)
Yeah, that's what I was referring to. For some reason, I thought that it
had never been merged.
Once we get that bit taken care of, let's tackle the remaining
pieces, one at a time from the same point of view: proposed benefit,
correctness and verification of both.
The post-3.11.3 changes' proposed benefits were entirely the
correction of race conditions and/or elimination of deadlocks. There
was no performance-tweaking argument proposed. Absolute correctness
is tricky to guarantee / verify. Incorrectness is clearer: "valgrind
--tool=helgrind" has been helpful in identifying at least some of the
preexisting problems. It was by that tool's reports that many of the
subsequent small-lock patches were driven.
I suggest starting with a new branch so as to reduce the chance of
errors.
Can you explain what you mean? What would you like to see on this new
branch?
It's my understanding the the subsequent changes came in a series of
smaller changes. If so, rather than trying to pull them in one at a time
from the existing branch(es), it might be easier to re-introduce them
one at a time. If they were not made in a series of small changes, then
perhaps they could still be re-introduced that way. Each addressing the
concerns of
addressing a demonstrable problem
failing (possibly new) qa, or performance issue
reviewable fix
qa now passing or performance improved
The only purpose of breaking it up into smaller bits is to help ease the
concerns over necessity and correctness.
Dave
|