
Re: [pcp] PCP Updates: pmlogger AF_UNIX socket for normal users; qa version check bump

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] PCP Updates: pmlogger AF_UNIX socket for normal users; qa version check bump
From: Dave Brolley <brolley@xxxxxxxxxx>
Date: Wed, 05 Mar 2014 12:11:08 -0500
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <1734063835.17483667.1393481715436.JavaMail.zimbra@xxxxxxxxxx>
References: <53075D46.6090807@xxxxxxxxxx> <1734063835.17483667.1393481715436.JavaMail.zimbra@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
On 02/27/2014 01:15 AM, Nathan Scott wrote:
diff --git a/src/pmlogconf/pmlogconf.sh b/src/pmlogconf/pmlogconf.sh
index db98da8..7e10e5d 100755
--- a/src/pmlogconf/pmlogconf.sh
+++ b/src/pmlogconf/pmlogconf.sh
@@ -11,6 +11,7 @@
 #		when the group was added to the configuration file
 #	delta	delta argument for pmlogger "logging ... on delta" clause
 #
+# Copyright (c) 2014 Red Hat.
 # Copyright (c) 1998,2003 Silicon Graphics, Inc.  All Rights Reserved.
 # 
 # This program is free software; you can redistribute it and/or modify it
@@ -485,8 +486,9 @@ End-of-File
 #
 
 [access]
-disallow * : all;
-allow localhost : enquire;
+disallow .* : all;
+disallow :* : all;
+allow local:* : enquire;
 End-of-File
Hmmm.  When I run QA on this code, after an upgrade (so, without
the above change), test 023 hangs.  I wonder if the above change
is being assumed to be in place...?

OK. I've tracked this down to the following code in _check_logger within pmlogger_check.sh:

    # wait until pmlogger process starts, or exits
    #
    delay=5
    [ ! -z "$PMCD_CONNECT_TIMEOUT" ] && delay=$PMCD_CONNECT_TIMEOUT
    x=5
    [ ! -z "$PMCD_REQUEST_TIMEOUT" ] && x=$PMCD_REQUEST_TIMEOUT

    # wait for maximum time of a connection and 20 requests
    #
    delay=`expr \( $delay + 20 \* $x \) \* 10`    # tenths of a second
    while [ $delay -gt 0 ]
    do
    if [ -f $logfile ]
    then
        # $logfile was previously removed, if it has appeared again
        # then we know pmlogger has started ... if not just sleep and
        # try again
        #
        if echo "connect $1" | pmlc 2>&1 | grep "Unable to connect" >/dev/null
        then
        :
        else
        $VERBOSE && echo " done"
        return 0
        fi

    [ .... ]

    pmsleep 0.1
    delay=`expr $delay - 1`
    $VERBOSE && [ `expr $delay % 10` -eq 0 ] && \
            $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C"
    done

This code tries to make sure that pmlogger is running by attempting to connect using pmlc. Without the updated access controls, pmlogger correctly rejects each connection attempt and the loop logic does work as intended, decrementing $delay toward zero. The problem is that $delay gets set to 25100, and each connection attempt takes about 1 second. As a result we wait for a looooong time, making repeated failed connection attempts.
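
To see how the loop budget scales with the two timeouts, the expr expression from the excerpt can be run on its own. This is only a sketch: wait_budget is a made-up helper name, not something in pmlogger_check.sh, and the inputs are example values.

```shell
#!/bin/sh
# wait_budget is a hypothetical helper; the expression is the one quoted
# from _check_logger above.
# $1 = connect timeout in seconds, $2 = request timeout in seconds
# Prints the loop budget, nominally in tenths of a second.
wait_budget()
{
    expr \( "$1" + 20 \* "$2" \) \* 10
}

wait_budget 5 5        # the script's built-in fallbacks
wait_budget 150 120    # much larger timeouts
```

The fallbacks of 5 and 5 give 1050, so a budget in the region of 25100 needs timeouts in the hundreds of seconds; 150 and 120 give 25500. Since the counter only drops by one per pass, and a pass includes a pmlc attempt of about a second, the real wait is far longer than the nominal tenths-of-a-second figure.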

A few observations:
  • Even if $delay is intended to be in tenths of a second, as the pmsleep implies, 2510 seconds is still a long time.
  • The comment says that a max of 20 requests will be made, but I can't figure out how that is represented by $delay. This could probably be better represented by a connection counter.
  • $delay for the loop was set this high because the original $delay and $x were set to 150 and 120 respectively. i.e. it looks like the $PMCD_* defaults kicked in. Is this as intended?
  • For this case where the response is "Unable to connect: ... Connection refused", the loop should exit immediately.
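
Something along these lines might cover the second and fourth points: count attempts instead of deriving a time budget, and give up at once on "Connection refused". This is an untested sketch, not a patch. probe_logger and wait_for_logger are hypothetical names, the limit of 20 attempts is taken from the comment in the script, and the return codes are my own choice.

```shell
#!/bin/sh
# Sketch only: these helpers do not exist in pmlogger_check.sh.

# One pmlc connection attempt; prints pmlc's combined output.
# $1 is the pmlogger identifier given to pmlc's "connect" command.
probe_logger()
{
    echo "connect $1" | pmlc 2>&1
}

# Poll until pmlogger accepts a connection.
# $1 = logger id
# $2 = maximum number of attempts (default 20)
# $3 = probe command (default probe_logger; replaceable for testing)
# $4 = pause between attempts in seconds (default 1)
# Returns 0 once connected, 1 if the attempts run out,
# 2 as soon as the connection is refused.
wait_for_logger()
{
    _id=$1
    _max=${2:-20}
    _probe=${3:-probe_logger}
    _pause=${4:-1}
    _n=0
    while [ "$_n" -lt "$_max" ]
    do
        _out=`$_probe "$_id"`
        case "$_out" in
            *"Unable to connect"*"Connection refused"*)
                return 2 ;;    # retrying will not help, stop now
            *"Unable to connect"*)
                ;;             # not ready yet, try again
            *)
                return 0 ;;    # connected
        esac
        _n=$((_n + 1))
        sleep "$_pause"
    done
    return 1
}
```

The caller can then tell "gave up" from "refused" by the return code, and the worst case is bounded by the attempt count rather than by the $PMCD_* timeouts.
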
Dave