
Re: [pcp] pcp updates: multithreading etc.

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>, Dave Brolley <brolley@xxxxxxxxxx>
Subject: Re: [pcp] pcp updates: multithreading etc.
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Fri, 6 May 2016 10:27:51 +1000
Cc: pcp developers <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <20160505153559.GA18023@xxxxxxxxxx>
References: <20160502172853.GL24878@xxxxxxxxxx> <572B5FA6.3080802@xxxxxxxxxx> <20160505153559.GA18023@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
On 06/05/16 01:35, Frank Ch. Eigler wrote:
> Hi -
>
> > I gave this a quick review. The idea of separate locks for individual
> > shared data objects looks ok to me. However, qa test 449 is hanging for me.
>
> This appears improved with this patch on fche/multithread enough that
> I can't make it fail here after thousands of runs.
>
>
> commit d664a9d82aad11d64f8e3948fe1d51a7359ec3da
> Author: Frank Ch. Eigler <fche@xxxxxxxxxx>
> Date:   Thu May 5 11:24:44 2016 -0400

I cherry-picked this commit, but qa/449 still hangs.

I thought I might have missed a commit from Frank's branch, so I did a git-pull from there, to be sure, to be sure ... nothing obvious was pulled and qa/449 still hangs.

So there must be some other difference in the source trees or compilers or platforms or phase of the moon.

The failure is in multithread1 ...

first thread

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f59e018ed82 in __GI___pthread_mutex_lock (
    mutex=mutex@entry=0x7f59e017f460 <__pmLock_libpcp>)
    at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f59dff53c43 in __pmLock (
    lock=lock@entry=0x7f59e017f460 <__pmLock_libpcp>,
    file=file@entry=0x7f59dff65630 "pmns.c", line=line@entry=928) at lock.c:278
#3  0x00007f59dff23356 in __pmFixPMNSHashTab (tree=0x7f59d8003620,
    numpmid=<optimised out>, dupok=dupok@entry=1) at pmns.c:928
#4  0x00007f59dff238c2 in pass2 (dupok=1) at pmns.c:806
#5  loadascii (use_cpp=<optimised out>, dupok=1) at pmns.c:1190
#6  load (filename=filename@entry=0x0, dupok=dupok@entry=1,
    use_cpp=<optimised out>, use_cpp@entry=0) at pmns.c:1374
#7  0x00007f59dff23ef8 in LoadDefault (reason_msg=0x7f59dff656df "local",
    use_cpp=0) at pmns.c:176
#8  pmGetPMNSLocation () at pmns.c:240
#9  0x00007f59dff24618 in GetLocation () at pmns.c:304
#10 pmLookupName (numpmid=numpmid@entry=1,
    namelist=namelist@entry=0x602110 <namelist>,
    pmidlist=pmidlist@entry=0x602174 <pmidlist>) at pmns.c:1514
#11 0x000000000040105d in func () at multithread1.c:58
#12 0x000000000040158c in func1 (arg=<optimised out>) at multithread1.c:147
#13 0x00007f59e018c6aa in start_thread (arg=0x7f59de1e6700)

second thread

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f59e018ed82 in __GI___pthread_mutex_lock (
    mutex=mutex@entry=0x7f59d80008e0) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f59dff53c43 in __pmLock (lock=lock@entry=0x7f59d80008e0,
    file=file@entry=0x7f59dff62df4 "context.c", line=line@entry=114)
    at lock.c:278
#3  0x00007f59dff1515b in __pmHandleToPtr (handle=handle@entry=0)
    at context.c:114
#4  0x00007f59dff23c09 in pmGetPMNSLocation () at pmns.c:208
#5  0x00007f59dff24618 in GetLocation () at pmns.c:304
#6  pmLookupName (numpmid=numpmid@entry=1,
    namelist=namelist@entry=0x602110 <namelist>,
    pmidlist=pmidlist@entry=0x602174 <pmidlist>) at pmns.c:1514
#7  0x000000000040105d in func () at multithread1.c:58
#8  0x000000000040150c in func2 (arg=<optimised out>) at multithread1.c:172
#9  0x00007f59e018c6aa in start_thread (arg=0x7f59dd9e5700)
    at pthread_create.c:333
#10 0x00007f59dfc3ee9d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

I'm afraid I don't have time to debug this further at the moment.
