Re: libpcp multithreading - next steps

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: libpcp multithreading - next steps
From: fche@xxxxxxxxxx (Frank Ch. Eigler)
Date: Tue, 26 Jul 2016 17:09:14 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <71790d7d-377e-c28e-0adf-57fb221c3539@xxxxxxxxxxxxxxxx> (Ken McDonell's message of "Wed, 27 Jul 2016 06:49:45 +1000")
References: <20160603155039.GB26460@xxxxxxxxxx> <578D1AE1.6060307@xxxxxxxxxx> <y0my44xksjb.fsf@xxxxxxxx> <57965C89.40401@xxxxxxxxxx> <20160725203257.GG5274@xxxxxxxxxx> <5797B9F7.2020701@xxxxxxxxxx> <71790d7d-377e-c28e-0adf-57fb221c3539@xxxxxxxxxxxxxxxx>
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)

kenj wrote:

> [...]  This commit contains the contentious (in my mind)
> __pmHandleToPtr_unlocked() change ... reviewing this pre-empts the
> discussion about how to fix the context create/destroy race that I
> wanted us to resolve as a first step of restating this effort.

Sure.  Let's resolve it!


> I also am unconvinced that any lock inversion existed in the
> original design and implementation

What are you referring to as "original"?


> and would like to understand that (if it exists) before we embark on
> a development path that relies on helgrind to find lock inversions
> rather than design to avoid lock inversions.

It sounds as though you are suspicious that helgrind is unreliable:
that the lock inversion reports are mistaken.  Let me assure you that
in every case I've studied, the report was genuine.

Whether each report represents a design flaw vs. an implementation bug
is a separate question.  It's not hard to see intuitively the basic
design-level cause of the lock ordering bugs: use of the libpcp lock AND
per-context locks (AND some others).  There is no standard (and
definitely no formal assertion/checking) as to which type of lock must
or must not be held upon entry to which lower-level libpcp functions,
so they often aggressively take locks of whatever type they want.
With the libpcp lock being recursive, they can often get away with this
... but scale up, race, and boom.
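
To make "formal assertion/checking" concrete, here is a minimal sketch of
what a lock-order check could look like.  None of this is libpcp code:
the names (ranked_lock, ranked_acquire, ranked_release, RANK_GLOBAL,
RANK_CONTEXT) are hypothetical, and the rule "global lock before context
lock" is an assumption chosen only for illustration.  Each thread keeps
a small stack of the ranks it holds, and an acquisition that would go
against the rank order is refused instead of being allowed to race.
Recursive re-acquisition is deliberately left out to keep it short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_HELD 8

/* Hypothetical ranks: a lower rank must be taken before a higher one. */
enum lock_rank { RANK_GLOBAL = 1, RANK_CONTEXT = 2 };

struct ranked_lock {
    pthread_mutex_t mutex;
    enum lock_rank rank;
};

/* Per-thread stack of the ranks currently held, in acquisition order. */
static _Thread_local int held_ranks[MAX_HELD];
static _Thread_local int held_count;

static void ranked_init(struct ranked_lock *l, enum lock_rank rank)
{
    pthread_mutex_init(&l->mutex, NULL);
    l->rank = rank;
}

/*
 * Take the lock only if its rank is strictly higher than every rank this
 * thread already holds.  Ranks on the stack are strictly increasing, so
 * comparing against the top entry is enough.  Returns false, without
 * locking, when the ordering rule would be violated.
 */
static bool ranked_acquire(struct ranked_lock *l)
{
    if (held_count >= MAX_HELD)
        return false;
    if (held_count > 0 && held_ranks[held_count - 1] >= (int)l->rank)
        return false;
    pthread_mutex_lock(&l->mutex);
    held_ranks[held_count++] = (int)l->rank;
    return true;
}

/* Release the most recently acquired lock (this sketch assumes LIFO release). */
static void ranked_release(struct ranked_lock *l)
{
    assert(held_count > 0);
    assert(held_ranks[held_count - 1] == (int)l->rank);
    held_count--;
    pthread_mutex_unlock(&l->mutex);
}
```

With a check like this, a thread that holds a context-rank lock and then
asks for the global-rank lock gets an immediate, deterministic failure
on its very first attempt, rather than a deadlock that only shows up
under load.  A real version would more likely abort or log a backtrace
than return false.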


- FChE
