pcp
[Top] [All Lists]

Re: Regression in qa/628

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: Regression in qa/628
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri, 12 Dec 2014 07:34:22 +1100
Cc: pcp <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <1641491119.9465653.1417602588298.JavaMail.zimbra@xxxxxxxxxx>
References: <1224656795.6945862.1417136806064.JavaMail.zimbra@xxxxxxxxxx> <8149026.6994945.1417145442845.JavaMail.zimbra@xxxxxxxxxx> <6c1905a6-237d-4181-b2f5-d436d42f66b8.maildroid@localhost> <1867879926.7608473.1417393457721.JavaMail.zimbra@xxxxxxxxxx> <1641491119.9465653.1417602588298.JavaMail.zimbra@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
On 03/12/14 21:29, Nathan Scott wrote:
> ..
> I've reverted the revert in dev, but the above statements are all still true -
> not really understanding how, but some kind of ABI-breaking oddity seems to be
> in play here.  There's a comment in the pmda libdefs.h about how that 
> structure
> is used as "private per PMDA data when multiple DSO PMDAs are in use" - do you
> know the background to that?  That is our situation here, so that aspect might
> be worth a deeper look.

Done some more digging here.

The change is to e_ext_t defined in libdefs.h, namely int pmda_interface; 
replaced by pmdaInterface *dispatch;.  For 32-bit machines, this would not 
change the size of e_ext_t nor the offset of any of the other elements in 
e_ext_t.  For other machines, the e_ext_t size _might_ change, but gcc on 
x86_64 the way we use it (default alignment rules in play) will pad the old 
struct so they are exactly the same size and hence the offsets for the other 
elements in e_ext_t are unchanged.

I've checked all this with simple C code.

So we're down to understanding any cases in which an existing binary (not part 
of the PCP package, e.g. an old version of a PMDA that is built from source 
during QA) could believe the interpret the pmdaInterface *dispatch value as a 
int pmda_interface, or vice versa.  But libdefs.h is private to libpcp_pmda and 
is not included in any source file other than those in libpcp_pmda and 
libdefs.h is not shipped.  And all the instances of e_ext_t are calloc'd in 
open.c within libpcp_pmda.

So I can't see any way that any executable (old or new) can be using e_ext_t 
outside of libpcp_pmda.

Now as to the comment in the header ... when multiple DSO PMDA's are in use we 
need some private-per-PMDA place to hold the skeletal pmResult (biggest so far, 
to avoid malloc/free thrashing) and the hash table of pmids that have been 
requested.  So I don't think these have any intersection with the changes I 
made to e_ext_t.

And finally on to qa/628 ... this one _rebuilds_ the PMDA from source so seems 
to be immune from any e_ext_t changes.

But I've just seen a failure!!

Nathan does this match the failure signature you saw?

--- 628.out     2013-11-04 12:16:16.077770453 +1100
+++ 628.out.bad 2014-12-12 07:30:21.297771237 +1100
@@ -1,7 +1,7 @@
 QA output created by 628
 
 simple.numfetch PMID: 253.0.0
-    value 3
+No value(s) available!
 
 idiot.numfetch PMID: 177.0.0
     value 3
@@ -13,7 +13,7 @@
     value 5
 
 simple.numfetch PMID: 253.0.0
-    value 4
+No value(s) available!
 
 idiot.numfetch PMID: 177.0.0
     value 6

So there is more to this story ... off to debug the qa test.

<Prev in Thread] Current Thread [Next in Thread>