pcp

Re: [pcp] errors from socket code on Mac OS X

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>, Dave Brolley <brolley@xxxxxxxxxx>
Subject: Re: [pcp] errors from socket code on Mac OS X
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Tue, 5 Jul 2016 23:05:42 -0400 (EDT)
Cc: PCP <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <577C1D0A.6040300@xxxxxxxxxx>
References: <577C1045.1040108@xxxxxxxxxxxxxxxx> <577C1D0A.6040300@xxxxxxxxxx>
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Thread-index: 9k7IpXiu582Q9Y/FaOwru0cGjWw2bQ==
Thread-topic: errors from socket code on Mac OS X
Hi guys,

----- Original Message -----
> On 07/05/2016 03:53 PM, Ken McDonell wrote:
> > I'm seeing this ...
> >
> > [DATE] pmcd(PID) Error: auxconnect.c:__pmSockAddrInit: Invalid address
> > family: 0
> > [DATE] pmcd(PID) Error: auxconnect.c:__pmSockAddrCompare: Invalid address
> > family: 0
> >
> > in about half the failing tests on Mac OS X.
> >
> > Does anyone know how or why we'd be traversing the libpcp socket code for
> > an AF of 0?
> >
> > Seems like a missing or broken guard somewhere higher up the call stack,
> > but I have not been able to diagnose this, so I'm seeking assistance from
> > those who know more.
> >
> > Thanks for any hints or suggestions.
> >
> This sequence of errors suggests to me that __pmSockAddrIsLoopBack() is
> being called with an address containing family==0.
> 
>   __pmSockAddrIsLoopBack first extracts the family from the given
> address and then calls __pmLoopBackAddress(family), which in turn calls
> __pmSockAddrInit() using that family (first error).
> 
>   It then calls __pmSockAddrCompare() with the original address and the
> manufactured loopback address, both of which will now have family==0
> (second error).
> 
> Possible candidates:
> __pmAccAddClient(new client address)
>    __pmSockAddrIsLoopBack(const __pmSockAddr *addr)
> 
> HandleClientInput(__pmFdSet *fdsPtr)
>    DoCreds(addr from client table)
>      __pmSockAddrIsLoopBack(const __pmSockAddr *addr)
> 
> VerifyClient(addr from client table)
>    __pmSockAddrIsLoopBack(const __pmSockAddr *addr)
> 
> I hope this helps,

Ayup, certainly did.
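
To make the sequence Dave describes concrete, here is a minimal sketch of why
a single loopback check on a zeroed address produces exactly two "Invalid
address family" errors.  Everything named mock_* is a hypothetical stand-in,
not the libpcp implementation: the mock only treats AF_INET as valid and
models an address as a family plus a host value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-ins for illustration only, not libpcp code. */
static int mock_error_count;

struct mock_addr {
    int           family;
    unsigned long host;     /* host-order IPv4 address in this mock */
};

/* Mimics an init routine that complains about an unknown family. */
static void
mock_addr_init_loopback(struct mock_addr *a, int family)
{
    a->family = family;
    if (family == AF_INET) {
        a->host = 0x7f000001UL;     /* 127.0.0.1 */
    } else {
        fprintf(stderr, "mock_addr_init: Invalid address family: %d\n", family);
        mock_error_count++;
        a->host = 0;
    }
}

/* Mimics a compare routine that also complains about an unknown family. */
static int
mock_addr_compare(const struct mock_addr *a, const struct mock_addr *b)
{
    if (a->family != b->family)
        return a->family - b->family;
    if (a->family != AF_INET) {
        fprintf(stderr, "mock_addr_compare: Invalid address family: %d\n", a->family);
        mock_error_count++;
        return 1;
    }
    return a->host != b->host;
}

/* Extract the family, build a loopback address for it, then compare. */
static int
mock_is_loopback(const struct mock_addr *addr)
{
    struct mock_addr loopback;

    assert(addr != NULL);
    mock_addr_init_loopback(&loopback, addr->family);
    return mock_addr_compare(addr, &loopback) == 0;
}
```

With a zeroed address both routines see family 0, so each logs once; with a
well-formed AF_INET address neither logs.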

I poked at this a bit today (hmm, "lldb" now eh? fun) ... and I think it may
be that accept is not filling in the family, Dave.  From your list above, I
was able to reproduce it from __pmAccAddClient.  We have this libpcp code:

void
__pmCheckAcceptedAddress(__pmSockAddr *addr)
{
#if defined(HAVE_STRUCT_SOCKADDR_UN)
    /*
     * accept(3) doesn't set the peer address for unix domain sockets.
     * We need to do it ourselves. The address family
     * is set, so we can use it to test. There is only one unix domain socket
     * open, so we know its path.
     */
    if (__pmSockAddrGetFamily(addr) == AF_UNIX)
        __pmSockAddrSetPath(addr, localSocketPath);
#endif
}

(via __pmAccept)

... and it looks like we are seeing a sockaddr that is (still) completely
zeroed after we accept on the fd in pmcd/client.c AcceptNewClient.  The
attached patch seems to tidy it up for me ... whaddya think Dave?  Are we
likely to see other places where this happens, I wonder?
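
For anyone following along without the attachment, here is a rough sketch of
one way to guard against a still-zeroed peer address after accept.  This is
not the contents of mac.patch; fixup_accepted_family is a hypothetical helper
name, and the sketch assumes the family reported by getsockname() on the
accepted fd is an acceptable substitute when the peer family is AF_UNSPEC (0).

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, for illustration only.
 * If the peer address came back with family 0, borrow the family from the
 * accepted socket itself.  Returns the resulting family, or -1 if
 * getsockname() fails.
 */
static int
fixup_accepted_family(int fd, struct sockaddr_storage *peer)
{
    struct sockaddr_storage local;
    socklen_t len = sizeof(local);

    assert(peer != NULL);

    /* Nothing to do when accept already filled in a family. */
    if (peer->ss_family != AF_UNSPEC)
        return peer->ss_family;

    memset(&local, 0, sizeof(local));
    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        return -1;

    peer->ss_family = local.ss_family;
    return peer->ss_family;
}
```

A caller would run this on the freshly accepted fd before any family-based
check such as the AF_UNIX test in __pmCheckAcceptedAddress.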

cheers.

--
Nathan

Attachment: mac.patch
Description: Text Data
