[started as a private reply to Nathan, but wandered into seeking more
information from Michelle, so Cc'ing]
One thing I noticed in looking at this code is that we pass a
sockaddr_in to accept() (everywhere we use accept(), not just in
pmlogger) rather than a sockaddr ... this works on Linux and the Unix
derivatives because I think (I have not checked) that the socket is
always AF_INET ... maybe this does not work all the time on Windows?
But more importantly, we will _only_ execute the accept() in pmlogger if
someone connects on pmlogger's control port.
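For what it's worth, here is a minimal sketch of the family-agnostic way to call accept(), which also reports who connected. It is not the code in ports.c: accept_and_identify() is a hypothetical helper, and it is written against the POSIX socket API (on Windows the headers would be winsock2.h/ws2tcpip.h and the descriptor type SOCKET). I am not claiming this cures the crash, only that a sockaddr_storage buffer with the length initialised before each call rules out one class of problem, and the peer address would tell us where the connection comes from.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * Hypothetical helper (not PCP code): accept one connection on listen_fd
 * and write the peer's numeric address into peer[].
 * Returns the connected descriptor, or -1 if accept() fails.
 */
int
accept_and_identify(int listen_fd, char *peer, size_t peerlen)
{
    struct sockaddr_storage addr;       /* large enough for AF_INET and AF_INET6 */
    socklen_t addrlen = sizeof(addr);   /* in/out: must be set before each accept() */
    int fd;

    assert(peer != NULL && peerlen > 0);
    peer[0] = '\0';

    memset(&addr, 0, sizeof(addr));
    fd = accept(listen_fd, (struct sockaddr *)&addr, &addrlen);
    if (fd < 0)
        return -1;

    /* NI_NUMERICHOST: format the address without a DNS lookup */
    if (getnameinfo((struct sockaddr *)&addr, addrlen,
                    peer, (socklen_t)peerlen, NULL, 0, NI_NUMERICHOST) != 0)
        snprintf(peer, peerlen, "unknown (family %d)", (int)addr.ss_family);

    return fd;
}
```

Logging peer[] at the point where control_req() accepts the connection would show whether the client is local (pmlc or one of the scripts) or something else entirely.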
It might be worth checking with Michelle where the connection is coming
from ... if she's not running pmlc manually, then the only candidates I
can think of are pmlogger_check or pmnewlog ... does either of these run
on Windows by default (without cron that would seem unlikely)?
Michelle says pmlogger lasts 20-25 seconds, so I wonder who's running
pmlc?
And if it is not pmlc, is it possible that there is a port clash, and
something outside PCP is making the connection, possibly using .NET or
IPv6 or goodness knows what protocol?
Michelle, setting $PMLOGGER_PORT in the environment to have a value
other than 4330 then running pmlogger would provide some evidence for
the port clash hypothesis.
Is Michelle on IRC and anywhere near our civilized timezones ... if so
we might progress this faster there, rather than in email.
On Fri, 2011-07-29 at 08:40 +1000, Nathan Scott wrote:
> Hi Michelle,
>
> ----- Original Message -----
> > I am emailing you from my work account now. Here are the details
> > running with gdb. If you still need the log file with the -dpdu switch
> > let me know. I added in some print statements and rebuilt the code
> > again in a different test, and it appeared the accept was causing the
> > error.
> >
>
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x773c9b60 in msvcrt!memcpy () from C:\Windows\system32\msvcrt.dll
> > (gdb) warning: Invalid parameter passed to C runtime function.
> >
> > bt
> > #0 0x773c9b60 in msvcrt!memcpy () from C:\Windows\system32\msvcrt.dll
> > #1 0x751dbb14 in MigrateWinsockConfiguration ()
> > from C:\Windows\System32\mswsock.dll
> > #2 0x777ae6ed in WSAEnumNameSpaceProvidersExA ()
> > from C:\Windows\system32\ws2_32.dll
> > #3 0x777ae662 in WS2_32!FreeAddrInfoEx () from
> > C:\Windows\system32\ws2_32.dll
> > #4 0x00406a7f in control_req () at ports.c:408
> > #5 0x00403838 in main (argc=6, argv=0x5c0f40) at pmlogger.c:876
> > (gdb)
>
> Hmmm. So, is line 408 in the pmlogger/ports.c source you're building
> the accept() call? (in latest pcp git tree, line 408 is the error
> handling on the next line).
>
> If it's the accept, it looks like something's gone horribly wrong down
> in the network stack - not clear what "MigrateWinsockConfiguration"
> does, looks deep in the Windows socket code. Might be worth checking
> whether there are any Microsoft updates for that platform.
>
> Not sure this is something we're going to be able to fix. :( If you
> are desperate for a workaround you could modify the code to return 0
> from control_req() - that functionality is rarely used (it's the pmlc /
> pmlogger control mechanism). But, better to hunt down Windows socket
> fixes I guess ... or try to figure out what has to be "migrated" in your
> Windows socket configuration???
>
> >
> >
> > Here is the log file log2.txt:
>
> Hmm ... lots of IRIX metrics there ... where did that log config file
> come from? If it was generated, the tool doing that might be a bit out
> of date.
>
> > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > Log for pmlogger on win7-o2k3 started Thu Jul 28 07:31:48 2011
> >
> > Warning [c:/MSYS-installed/config/pmlogger/simple2.base.base, line 20]
> > Problem with lookup for metric "hw.hub.ii.cb_errors" ... logging not
> > activated
> > Reason: Unknown metric name
> > Warning [c:/MSYS-installed/config/pmlogger/simple2.base.base, line 21]
> > Problem with lookup for metric "hw.hub.ni.cb_errors" ... logging not
> > activated
> > Reason: Unknown metric name
> > Warning [c:/MSYS-installed/config/pmlogger/simple2.base.base, line 22]
> > Problem with lookup for metric "hw.router.perport.cb_errors" ...
> > logging not activated
> > Reason: Unknown metric name
> > Warning [c:/MSYS-installed/config/pmlogger/simple2.base.base, line 26]
> > Problem with lookup for metric "disk.all.active" ... logging not
> > activated
> > Reason: Unknown metric name
> > Warning [c:/MSYS-installed/config/pmlogger/simple2.base.base, line 27]
> > Problem with lookup for metric "disk.all.avg_disk.active" ... logging
> > not activated
> ...
>
>
> cheers.
>