Open|SpeedShop TroubleShooting Guide
First Public Release - November 13, 2005
Two major pieces of 3rd-party software provide much of the
infrastructure for Open|SpeedShop:
DPCL: http://www-124.ibm.com/developerworks/opensource/dpcl
Dyninst: http://www.dyninst.org
The centerpiece of Open|SpeedShop's functionality is its ability to
perform 'dynamic instrumentation' (Dyninst) of a user's target
application on a node-to-node (DPCL) basis across a 'cluster' of
'nodes'. Dyninst provides the dynamic instrumentation ability and DPCL
allows this unique ability to be transported between cluster nodes.
(See the webpages cited above for further details.) This Guide's purpose is
to assist an Open|SpeedShop newcomer by providing some useful tips to
ease the process of installing, setting up, and using Open|SpeedShop.
The focus is on ensuring proper network connectivity, proper
Open|SpeedShop installation and, finally, proper user setup. All three
of these areas have exact requirements that must be met to be
successful using Open|SpeedShop.
Success in using Open|SpeedShop requires:
- o The extended Internet services daemon 'xinetd' must be
running and properly configured for each 'node'.
Check the '/etc/xinetd.conf' file for correct 'xinetd' configuration.
The webpage http://www.xinetd.org
has pertinent configuration details. If 'xinetd' is not running, then
check the file '/etc/sysconfig/network' to ensure that 'NETWORKING=yes'
is specified and then do the following:
/etc/rc.d/init.d/xinetd start
Example of '/etc/xinetd.conf' file:
#
# Simple configuration file for xinetd
#
# Some defaults, and include /etc/xinetd.d/
defaults
{
instances = 60
log_type = SYSLOG authpriv
log_on_success = HOST PID
log_on_failure = HOST
cps = 25 30
}
includedir /etc/xinetd.d
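The 'includedir /etc/xinetd.d' line is required: without it, 'xinetd' does not read the per-service files in '/etc/xinetd.d', including the DPCL SuperDaemon service file described later in this Guide.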
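The 'includedir' requirement can be checked mechanically. The following is a minimal Python sketch, not part of Open|SpeedShop: the function name 'includes_directory' is hypothetical, and it assumes the plain 'includedir <path>' form shown in the example above.

```python
def includes_directory(conf_text, directory="/etc/xinetd.d"):
    """Return True if xinetd.conf text has an active 'includedir' line for directory."""
    for raw in conf_text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) == 2 and parts[0] == "includedir":
            if parts[1].rstrip("/") == directory.rstrip("/"):
                return True
    return False
```

To use it on a node, read '/etc/xinetd.conf' into a string and pass that string to the function; a result of False means the SuperDaemon service file will not be picked up.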
- o When using DNS (nameservers), an '/etc/resolv.conf' file
must be present for each 'node'.
This file is not necessary if there is a name server running on the
local 'node' and the hostname contains the domain name. It is
necessary, however, if the system administrator wants to override the
default ordering of the host lookup services.
Example of '/etc/resolv.conf' file:
search foo.bar.com
nameserver 128.162.236.210
nameserver 128.162.237.211
nameserver 137.38.31.248
- o If not using DNS (nameservers), the '/etc/hosts' file must be
present.
This file is a simple text file that associates IP addresses with
hostnames.
Example of an '/etc/hosts' file:
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
128.162.243.146 foo.bar.com foo
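To confirm that a given hostname is actually covered by '/etc/hosts', the file can be parsed directly. This is a small Python sketch under the assumption that the file follows the usual "address name alias..." layout shown above; 'parse_hosts' is a hypothetical helper name.

```python
def parse_hosts(hosts_text):
    """Map every hostname and alias in /etc/hosts-style text to its IP address."""
    table = {}
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or an address with no name
        address, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name; later duplicates are ignored.
            table.setdefault(name, address)
    return table
```

With the example file above, looking up 'foo' or 'foo.bar.com' in the returned table gives '128.162.243.146'. A name that is missing from the table will not resolve on a node that relies on '/etc/hosts' alone.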
- o On some RedHat systems the /etc/hosts.allow file needs
tweaking to get by some security issues.
One user reported that an /etc/hosts.allow entry for the DPCL
SuperDaemon was required, for example:
"dpclSD: 123.456.789.107,127.0.0.1"
Here are some useful background notes, as well, from that user:
1) /etc/hosts.allow is checked to see if a connection can access a port/service.
"ALL: localhost,10.,192.168." would allow any type of connection to be established by
localhost, 10.*.*.*, or 192.168.*.*. An entry like "dpclSD: localhost, 128.456.789.121"
would let my desktop machine open a connection to the dpclSD service. My /etc/hosts.allow
file had no "ALL:" entry.
2) /etc/hosts.deny is checked if no match is found in /etc/hosts.allow. With reasonable
security settings, everything not allowed in /etc/hosts.allow is typically denied.
3) /etc/hosts.equiv or .rhosts is used to determine which machines can execute remote
shells on a node without entering a password. Each file is a list of nodes with optional
usernames.
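The matching described in note 1 can be illustrated with a short Python sketch. It is a deliberate simplification and an assumption, not the real TCP wrappers logic: it handles only the 'ALL' keyword, exact client strings, and address prefixes ending in '.', and it ignores hostname resolution, netmasks, and IPv6. The name 'is_allowed' is hypothetical.

```python
def is_allowed(hosts_allow_text, daemon, client):
    """Return True if a simplified reading of hosts.allow admits client for daemon."""
    for raw in hosts_allow_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        daemons, clients = line.split(":", 1)
        clients = clients.split(":", 1)[0]  # ignore any trailing option field
        daemon_list = daemons.replace(",", " ").split()
        if daemon not in daemon_list and "ALL" not in daemon_list:
            continue
        for pattern in clients.replace(",", " ").split():
            if pattern == "ALL" or pattern == client:
                return True
            # A pattern such as "10." matches every address with that prefix.
            if pattern.endswith(".") and client.startswith(pattern):
                return True
    return False
```

For example, with the single line "dpclSD: 127.0.0.1" the call is_allowed(text, "dpclSD", "127.0.0.1") is true, while any other client address is rejected and would fall through to '/etc/hosts.deny'.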
- o Use the 'ping' command to confirm connectivity between all
nodes in a cluster.
If one 'node' can 'ping' another 'node', then basic network
connectivity is in place between those two nodes. For example, if a
user has 'logged in' to 'node1', then the following command issued from
'node1' will test proper network connectivity between 'node1' and,
e.g., 'foo.bar.com':
'ping foo.bar.com'
Example of good 'ping' output:
PING foo.bar.com (128.162.236.165): 56 data bytes
64 bytes from 128.162.236.165: icmp_seq=0 ttl=63 time=1.056 ms
64 bytes from 128.162.236.165: icmp_seq=1 ttl=63 time=1.481 ms
64 bytes from 128.162.236.165: icmp_seq=2 ttl=63 time=1.430 ms
64 bytes from 128.162.236.165: icmp_seq=3 ttl=63 time=1.440 ms
....
Similarly, the 'ftp', 'telnet', and 'rsh' commands can be used to verify network connectivity between nodes.
If 'network unreachable' results from any of these commands, then connectivity issues between the nodes still exist.
- o Routing must be set up properly so network traffic can get
off the 'node'.
In a simple configuration with a single network card in each 'node',
you can define a 'default route' such that all traffic goes through that
'route'. An IP address is specified as the 'default route'. The command
'/sbin/route' summarizes the current IP routing table for a 'node', and
the same command (see 'man route') can be used to manipulate this table.
Example of '/sbin/route' output:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
128.162.243.0   *               255.255.255.0   U     0      0        0 eth0
169.254.0.0     *               255.255.0.0     U     0      0        0 eth0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         e-10-13-hs1-243 0.0.0.0         UG    0      0        0 eth0
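A quick way to confirm that a default route exists is to scan the '/sbin/route' output for it. The Python sketch below assumes the eight-column layout shown above; the helper name 'default_gateway' is hypothetical.

```python
def default_gateway(route_output):
    """Return (gateway, interface) of the default route, or None if there is none."""
    for line in route_output.splitlines():
        fields = line.split()
        # 'route' prints "default"; 'route -n' prints "0.0.0.0" as the destination.
        if len(fields) >= 8 and fields[0] in ("default", "0.0.0.0"):
            return fields[1], fields[-1]
    return None
```

A result of None suggests that traffic for other networks cannot leave the 'node'.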
- o Working network interface must exist for each 'node' with an
assigned IP address.
The command 'ifconfig' (see 'man ifconfig') helps set up this interface,
and the command '/sbin/ifconfig' summarizes the current network
interfaces for each 'node'. See also the webpage http://www.computerhope.com/unix/uifconfi.htm
for more details regarding 'ifconfig'.
Example of '/sbin/ifconfig' output:
eth0 Link encap:Ethernet HWaddr 08:00:69:13:EB:CB
inet addr:128.162.243.146 Bcast:128.162.243.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1526380 errors:0 dropped:0 overruns:0 frame:0
TX packets:375501 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:205612453 (196.0 Mb) TX bytes:119995458 (114.4 Mb)
Interrupt:57
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:318139 errors:0 dropped:0 overruns:0 frame:0
TX packets:318139 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:42295160 (40.3 Mb) TX bytes:42295160 (40.3 Mb)
- o The Open|SpeedShop software package installs both DPCL and
Dyninst as RPMs.
As a result, the DPCL daemons are found in /usr/bin, the libraries in
/usr/lib, and the include files in /usr/include/dpcl.
- o An identical copy of Open|SpeedShop must be installed on all
nodes in a cluster
where the DPCL client or a target application will run under DPCL
control. Refer to <Open|SpeedShop's installation documentation>.
Once the installation has been done, the user can verify correctness in
a variety of ways. But first, a brief discussion about two major pieces
of DPCL that are installed on a node: the SuperDaemon ('ps' command
will show 'dpclSD' upon activation) and the DPCL CommunicationsDaemon
('ps' command will show 'dpcld' upon activation).
- o A single DPCL SuperDaemon('dpclSD') runs on each 'node' in a
cluster.
When the first SuperDaemon starts as a result of a user's request for
DPCL functionality, it obtains a lock and holds that lock as long as it
is running. Additional DPCL user 'connect' requests to the same 'node'
result in additional DPCL SuperDaemons being created, but these
additional SuperDaemons cannot obtain the lock, so they transfer the
'connection' to the first DPCL SuperDaemon and exit. The DPCL
SuperDaemon invokes a single DPCL CommunicationsDaemon('dpcld') for
each userid when that user actually utilizes any DPCL functionality.
Part of the DPCL user 'connection' processing is to have shared memory
segments established between the CommunicationsDaemon and a user's
application (target process). Architecture-specific 'locking' code
prevents the CommunicationsDaemon and target process from
simultaneously updating shared structures.
- o An entry for the DPCL SuperDaemon must exist in
'/etc/services'.
If there is no line in '/etc/services' for the service dpclSD,
then 'xinetd' won't be listening on the SuperDaemon port and DPCL won't
work.
Example of the SuperDaemon '/etc/services' entry:
dpclSD 7895/tcp # DPCL Super Daemon
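The presence of the entry can be verified by parsing '/etc/services'. This Python sketch assumes the standard "name port/protocol aliases..." layout; 'service_port' is a hypothetical helper name.

```python
def service_port(services_text, name, protocol="tcp"):
    """Return the port registered for a service name (or alias), or None."""
    for raw in services_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, proto = fields[1].split("/", 1)
        # fields[0] is the official name; fields[2:] are optional aliases.
        if proto == protocol and name in [fields[0]] + fields[2:]:
            return int(port)
    return None
```

Python's standard library offers the same lookup against the live system through socket.getservbyname("dpclSD", "tcp"), which raises OSError when no such entry exists.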
- o A file named '/etc/xinetd.d/dpclSD' must exist to specify
the DPCL SuperDaemon service via 'xinetd'.
An 'unofficial' tutorial on 'xinetd' and what the various entries in a
'service section' indicate can be found at: http://www.macsecurity.org/resources/xinetd/.
Of particular concern for Open|SpeedShop installation are the
'server', 'server_args', and 'env' entries. Open|SpeedShop installation
must copy a DPCL SuperDaemon binary('dpclSD') to the location specified
in the 'server' entry. The 'server_args' entry specifies the location
of the CommunicationsDaemon binary('dpcld'), which will be 'execd' by
the SuperDaemon with two arguments to specify a default 'Unix name' of
the 'execd' CommunicationsDaemon and a 'default log name' for debug
logfiles created by the CommunicationsDaemon. Finally, the 'env' entry
must set the environment variables 'DPCL_RT_LIB' and
'DYNINSTAPI_RT_LIB' to the paths of the installed DPCL and Dyninst
runtime libraries.
Example of '/etc/xinetd.d/dpclSD':
service dpclSD
{
disable = no
socket_type = stream
protocol = tcp
wait = no
user = root
server = /usr/bin/dpclSD
server_args = /usr/bin/dpcld /tmp/dpcld /tmp/dpcl
env += DPCL_RT_LIB=/usr/lib/libdpclRT.so.1
env += DYNINSTAPI_RT_LIB=/usr/lib/libdyninstAPI_RT.so.1
}
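The three entries of concern ('server', 'server_args', and 'env') can be checked with a small parser. The following Python sketch is an illustration only: it assumes a single 'service' block in the simple "key = value" or "key += value" form shown above, and the names 'parse_xinetd_service' and 'missing_settings' are hypothetical.

```python
def parse_xinetd_service(text):
    """Return {attribute: [values...]} for the first braced block in the text."""
    attrs = {}
    inside = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line == "{":
            inside = True
        elif line == "}":
            break
        elif inside and "=" in line:
            key, value = line.split("=", 1)
            key = key.strip().rstrip("+").strip()  # treat '+=' like '='
            attrs.setdefault(key, []).append(value.strip())
    return attrs


def missing_settings(attrs):
    """List the entries DPCL needs that are absent from the parsed block."""
    problems = [key for key in ("server", "server_args") if key not in attrs]
    env_names = set()
    for value in attrs.get("env", []):
        for item in value.split():
            env_names.add(item.split("=", 1)[0])
    for name in ("DPCL_RT_LIB", "DYNINSTAPI_RT_LIB"):
        if name not in env_names:
            problems.append("env " + name)
    return problems
```

An empty list from missing_settings means all three entries are present; it does not confirm that the paths they name exist on the 'node'.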
- o The 'xinetd' daemon must be restarted at some point
following Open|SpeedShop installation.
Doing so allows 'xinetd' to pick up the SuperDaemon(dpclSD) service and
start 'listening' on its port. This restart can be done manually as
'root' user by first doing 'ps -ef | grep xinetd' to get the 'xinetd'
'pid'. The signals for restarting 'xinetd' vary somewhat by operating
system. Some versions require 'kill -USR2 pid', where 'pid' is the
pid resulting from the 'ps' command. Other Linux system
possibilities are 'kill -USR1 `cat /var/run/xinetd.pid`' or
'kill -SIGHUP `cat /var/run/xinetd.pid`'. It is also possible to stop
'xinetd' with 'kill -9 pid' and start it again with
'/etc/rc.d/init.d/xinetd start', or simply to reboot the 'node'. When
'xinetd' is restarted, lines similar to the following should appear in
the system file '/var/log/messages':
.... xinetd[617]: Starting reconfiguration
.... xinetd[617]: readjusting service dpclSD
- o Confirm that 'xinetd' is listening to the DPCL SuperDaemon
port.
The following command can be used:
'netstat -a --inet'
If 'xinetd' is listening to the DPCL SuperDaemon port, a line similar to the following
should result from the 'netstat' command:
tcp 0 0 *:dpclSD *:* LISTEN
The purpose of setting up the '/etc/services' file correctly, restarting 'xinetd', and
so forth, is to ensure the SuperDaemon(dpclSD) service is 'listening' for traffic on its
designated port. Any node where a target program is intended to run must be
configured so that the SuperDaemon(dpclSD) service is in the 'listening' state. To further
confirm that the system is listening for the SuperDaemon service correctly, the following
command can be tried (where 7895 is the port number of the SuperDaemon as verified above
in the '/etc/services' file):
telnet foo.bar.com 7895
The 'telnet' command tries to connect to port 7895 via the telnet protocol, and 'xinetd'
will bring up the SuperDaemon(dpclSD) if it can load the DPCL client libraries, etc. A
simple 'ps' command should show 'dpclSD' running if the 'telnet' command was successful.
If the system is not listening on the dpclSD service (port 7895), 'netstat' will not show
the line above and 'telnet' will report a "connection refused" status, e.g.:
telnet foo.bar.com 7895
Trying 128.162.243.146...
telnet: connect to address 128.162.243.146: Connection refused
Just getting "connection refused", however, indicates, at a minimum, that the network
is properly connected.
Unfortunately, just observing all the above to be correct does not guarantee that the
'xinetd' is working properly. It has been observed that flaky hardware 'router' problems
will cause things to not work, even though 'xinetd' is listening and has been restarted.
Sometimes the only solution is a reboot of the entire system.
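The 'telnet' test can also be scripted so that it can be repeated across many nodes. The Python sketch below uses only the standard 'socket' module; the function name 'port_status' and the three result strings are hypothetical choices. Note that a successful connection has the same side effect as the 'telnet' command: 'xinetd' may start a SuperDaemon for it.

```python
import socket


def port_status(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'listening', 'refused', or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        # The node answered, but nothing is listening on this port.
        return "refused"
    except OSError:
        # Covers timeouts, name lookup failures, and 'network unreachable'.
        return "unreachable"
```

For instance, port_status("foo.bar.com", 7895) corresponds to the 'telnet foo.bar.com 7895' command above. As with 'telnet', a 'refused' result at least shows that the network path to the 'node' works.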
- o A '.rhosts' file must exist in the user's home directory
with a line for each host/user
the user wants to allow to connect to the user's node. The file also
needs to be owned by the user with permissions 0644. This can be
verified by trying an 'rsh' command, assuming 'rsh' daemons are up and
running on the cluster 'nodes' in question.
Example of '.rhosts' file:
foo.bar.com userlogin
foo1.bar.com userlogin
foo2.bar.com userlogin
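The ownership and permission requirements stated above can be checked without waiting for an 'rsh' failure. This is a minimal Python sketch; 'rhosts_problems' is a hypothetical helper, and the 0644 mode it expects is simply the value this Guide states.

```python
import os
import stat


def rhosts_problems(path, expected_uid):
    """Return a list of human-readable problems found with a .rhosts file."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return ["missing: " + path]
    problems = []
    if info.st_uid != expected_uid:
        problems.append("not owned by the user")
    mode = stat.S_IMODE(info.st_mode)
    if mode != 0o644:
        problems.append("permissions are %04o, expected 0644" % mode)
    with open(path) as handle:
        entries = [line.split() for line in handle if line.strip()]
    if not entries:
        problems.append("no host/user entries")
    return problems
```

A typical call is rhosts_problems(os.path.expanduser("~/.rhosts"), os.getuid()), run on each 'node' as the user in question; an empty list means none of these three problems was found.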
- o The user must set certain environment variables for
successful target compilation and execution
using Open|SpeedShop. These environment variables point to necessary
loadtime/runtime/etc. libraries used by the two largest underlying
components of Open|SpeedShop: Dyninst and DPCL (see 'INTRODUCTION'
above). The user's '.login' file can do the following command:
- o The same 'userid' must be invoked at 'login' time on each
'node' in the cluster
where Open|SpeedShop will be utilized, locally or remotely from another
'node'. Otherwise a DPCL error will result (see 'Common Errors' section
below). What this means, for example, is that a user expecting to
'login' on 'node1' as 'fred' and use Open|SpeedShop to dynamically
instrument the user's target application 'foo' on 'node2', must 'login'
to 'node2' as 'fred' as well.
- o A user cannot run as 'superuser' or 'root' user. This is a
security precaution by the DPCL component of Open|SpeedShop.
- o Proper compilation and linkage of a user's target
application has specific requirements
if Open|SpeedShop will be 'connecting' to it. A sample 'Makefile' for
64-bit Linux C/C++ (below) illustrates the need for accessing the
required include files, libraries, predefines, and so forth.
Example Makefile for 64-bit Linux C/C++ :
--begin Makefile
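.SUFFIXES: .C
code=sample
# CXX is make's built-in C++ compiler variable. It is an assumption here:
# the compiler command was missing from the rules as originally listed.
INCDIR = /usr/include/dpcl
INC = -I ${INCDIR}
LIBLOC = -L /usr/lib
LIB = -ldpcl -lelf
CCFLAGS = -pg -g -gdwarf_2 -D__64BIT__
all: ${code}
# Compile each .C source with the DPCL include path and predefines.
.C.o:
	${CXX} ${CCFLAGS} ${INC} -c $(<)
# Link the target against the DPCL and libelf libraries.
${code}: ${code}.o
	${CXX} ${CCFLAGS} -o ${code} ${code}.o ${LIBLOC} ${LIB}
clean:
	/bin/rm -rf ${code} *.o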
--end Makefile
- o Any leftover DPCL SuperDaemons or CommunicationsDaemons from
previous executions must be killed.
Orderly termination of DPCL daemons can be problematic at times due to
the many variables involved. Failure to get rid of orphaned copies of
the SuperDaemon/CommunicationsDaemon may result in a 'hang' condition,
whereby a DPCL 'mutator', upon initial startup, just seems to be hung
and not accumulating time. This is because 'xinetd' has disallowed any
additional SuperDaemon/CommunicationsDaemon combinations to be started.
The 'xinetd' service associated with the SuperDaemon has reached an
'xinetd' resource limit. The following two successive commands can be
used, first, to see if any undesirable leftover DPCL processes exist
prior to beginning a new Open|SpeedShop work session and, second, to
remove them:
'ps -ef | grep dpcl'
'pkill -9 dpcl'
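When the check is run routinely, it helps to pick out only the two daemon binaries and skip the 'grep' process itself. The Python sketch below filters text captured from 'ps -ef'; it assumes the usual eight-column layout (UID PID PPID C STIME TTY TIME CMD), and 'leftover_dpcl' is a hypothetical helper name.

```python
def leftover_dpcl(ps_output):
    """Return (pid, command) pairs for dpclSD/dpcld processes in 'ps -ef' output."""
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        # Compare only the program name, without its directory or arguments.
        program = fields[7].split()[0].rsplit("/", 1)[-1]
        if program in ("dpclSD", "dpcld"):
            found.append((int(fields[1]), fields[7]))
    return found
```

The pids returned are the ones that 'pkill -9 dpcl' would remove, which makes it possible to review them before killing anything.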
- o The proper version of library 'libelf' is very important.
The basic problem seems to be that there are two implementations of
libelf out there. There is the one in http://directory.fsf.org/libs/misc/libelf.html,
which is known to work (use 8.5 normally). There is a ".94" version
that simply does not work with DPCL, for unknown reasons. This bad
version seems to be present primarily on RedHat Linux systems
(including Fedora Core). Using a bad version of libelf is
known to cause DPCL to throw a C++ exception when building probe
expressions. Everywhere that "libelf.h" is included should be examined
to ensure the bad (".94") version is not somehow chosen by system
default. Of course the -I compile options dictate the location of
include files and need to be examined closely.
The two major 3rd party software pieces that form the infrastructure
for Open|SpeedShop (see 'OVERVIEW' above) are able to detect, and allow
the user to recover from, a number of error situations when dealing
with the complex problem of transporting dynamic instrumentation across
several nodes in a cluster. These 'recoverable' errors can often be
overcome with relative ease by the user and they are listed below. This
same software, however, can sometimes unfortunately continue to the
point of causing (e.g.) a SIGSTOP or SIGSEGV signal and recovery from
these errors can be difficult. These 'other' errors sometimes require
formal fixes or additions to the software and, accordingly, they are
discussed in general terms only.
- o ASC_daisd_no_info_fserv
There is a problem with setting up the DPCL SuperDaemon
service. For example, no service entry for 'dpclSD' (the SuperDaemon)
exists in the file '/etc/services'. Make sure each item listed above under
'Proper Network Installation for Open|SpeedShop' is satisfied.
- o ASC_exec_failed
This error occurs when the dpclSD (SuperDaemon), as specified in the
/etc/xinetd.d/dpclSD service entry, tries to execute ('exec') the
dpcld (CommDaemon) named in 'server_args', and that binary does not
exist, has insufficient permissions, etc. Either change 'server_args'
or make sure that the executable named in 'server_args' exists and has
correct permission/owner information.
- o ASC_install_failed
This error occurs when a DPCL instrumentation probe cannot be
installed. Often this is due to running out of DPCL's shared memory
because each instrumentation probe requires a shared memory 'message
handle'. The value of AIS_SHM_SIZE determines how much DPCL shared
memory is allocated. A recompile of DPCL with a larger value for
AIS_SHM_SIZE might be required. The long-term plan is to determine
AIS_SHM_SIZE at run time, but for now it is a
fixed constant set in ~daemon/src/ShmUsage.C.
- o ASC_invalid_version
This error occurs when running between nodes in a cluster. It says that
the DPCL version you are running with on one node does not match the
DPCL version running on the other node. See section titled 'Proper
Network Open|SpeedShop Installation' where it is stated that an
'identical copy of Open|SpeedShop must be installed on all nodes' in a
cluster.
- o ASC_unknown_status
This is a somewhat ambiguous error message that can often be associated
with network 'connectivity' issues. See the section on 'Proper Network
Connectivity' under the (USING Open|SpeedShop) heading and confirm that
your network is properly connected between the 'nodes' in question.
- o ASC_communication_failure
This error often means either the DPCL SuperDaemon or the
CommunicationsDaemon crashed or were not able to begin execution. If
the SuperDaemon crashes, it will close the 'socket' for its connection
to the DPCL library code on the 'client' side and this 'client' code
will issue the error. The 'ldd' command can be used to verify that both
daemons are finding all the dynamic libraries necessary for
them to run. Assuming that '/usr/bin' contains a binary for both the
SuperDaemon and the CommunicationsDaemon, this command can be tried as
follows:
'ldd /usr/bin/dpcld', or
'ldd /usr/bin/dpclSD'
If any dynamic library appears as 'not found', then that would cause the above error to appear. Otherwise a 'non-recoverable'
error (see next section) could be the source of one of the DPCL daemons crashing. What's supposed to happen is that the
SuperDaemon either creates a DPCL CommunicationsDaemon process for a userid or finds an existing CommunicationsDaemon
for that userid and uses it. Messaging then passes an established socket connection between the client (Open|SpeedShop) and
the DPCL SuperDaemon. A 'broken pipe' status in conjunction with this error would indicate the DPCL CommunicationsDaemon
process has somehow disappeared, implicitly closing its side of the pipe.
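The 'ldd' check above can be scripted. 'ldd' marks a library it cannot resolve with the words "not found" after the "=>" arrow, and the Python sketch below collects those names from captured output. The helper name 'missing_libraries' is hypothetical.

```python
def missing_libraries(ldd_output):
    """Return the names of libraries that 'ldd' reported as not found."""
    missing = []
    for line in ldd_output.splitlines():
        if "=>" in line and line.strip().endswith("not found"):
            missing.append(line.split("=>", 1)[0].strip())
    return missing
```

Feeding it the output of 'ldd /usr/bin/dpcld' and 'ldd /usr/bin/dpclSD' should give an empty list for each daemon on a correctly installed 'node'.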
Another source of the 'ASC_communication_failure' error occurs if copies of the 'mutatee' have been orphaned and
still exist. For example, assume the binary you are instrumenting has the name 'fred' (i.e. the 'mutatee' is
named 'fred'). Following a run of the DPCL mutator which instruments 'fred', there may be copies of 'fred'
still visible via 'ps -ef | grep fred'. To clean things up, a 'pkill -9 fred' can be done. This might
very well eliminate the 'ASC_communication_failure'.
Still another source of the 'ASC_communication_failure' error can occur because old DPCL shared memory has
not been released. DPCL uses shared memory in conjunction with installing instrumentation probes.
Enter the command 'ipcs'. This will yield something like:
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x4103b332 1703936    slc        600        268435456  1
If any old shared memory segments still exist for your userid, they must be removed. In this case, 'slc' has an old segment under
shmid 1703936, and it is a sizeable chunk of memory. It is removed via the command 'ipcrm -m 1703936'.
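Finding a user's stale segments by eye becomes tedious on a busy 'node'. The Python sketch below pulls them out of captured output. It assumes the input comes from 'ipcs -m' (shared memory only) in the column order shown above; 'segments_owned_by' is a hypothetical helper name.

```python
def segments_owned_by(ipcs_output, owner):
    """Return (shmid, size_in_bytes) for each shared memory segment owned by owner."""
    segments = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows look like: key shmid owner perms bytes nattch [status]
        if len(fields) >= 6 and fields[0].startswith("0x") and fields[2] == owner:
            segments.append((int(fields[1]), int(fields[4])))
    return segments
```

Each shmid returned is a candidate for 'ipcrm -m'. Before removing a segment, check that no DPCL session for that userid is still active.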
Still another source of the 'ASC_communication_failure' error occurs during 'connection' processing when the '.rhosts'
file is not properly set up. See the Proper Network User Login and Environment section above. A command such
as "grep dpcl /var/log/messages" might expose possible problems with the '.rhosts' or '/etc/hosts.*' files.
Still another source of the 'ASC_communication_failure' error occurs when trying to create a 'mutatee'. If the path
to the 'mutatee' is incorrect, or if the arguments to the 'mutatee' are incorrect (try executing the 'mutatee' with the path
and arguments you know you are using), then this error will appear.
- o ASC_rhost_failed
Almost certainly this is an indication that the user's '.rhosts' file
either does not exist on the 'node' in question or its contents do not
reflect the desired connectivity. See the Proper Network User Login and
Environment section above regarding the contents of '.rhosts'.
- o ASC_daisd_cant_sock
This error message firmly indicates that the network has connectivity
issues, most likely that the '/etc/services' file is improperly
configured and the DPCL SuperDaemon service is not functional. See
(USING Open|SpeedShop) above and try the suggestions in the Network
Connectivity section, one at a time. Also, DPCL uses the Linux
'ruserok' security mechanism and if the call to 'ruserok' fails, then
this error might appear. It would indicate a problem with the user's
'.rhosts' file or the '/etc/hosts.equiv' file. See the man page for
'ruserok'.
- o ASC_invalid_pid
Somehow DPCL has been asked to attach to a 'pid' on a 'node' that has
no such process 'pid' active. Check that the 'pid' entered via
Open|SpeedShop is actually that of the target application running on
the desired 'node'. Also, the target application must be running under
the same (non-root) userid as that under which Open|SpeedShop was
invoked.
- o ASC_missing_predef_func
DPCL has certain functions it needs to find in the library
'libdpclRT.so'. These functions are: 'shm_attach', 'shm_detach',
'Ais_phase_add', 'Ais_phase_deactivate', 'Ais_phase_set_period',
'Ais_phase_remove'. Somehow DPCL is not finding 'libdpclRT.so' via the
environment variable LD_LIBRARY_PATH.
- o ASC_failure
This error can have several points of origin.
One is the improper setting of the environment variables DPCL_RT_LIB and DYNINSTAPI_RT_LIB in the file '/etc/xinetd.d/dpclSD'. See the section on Proper Network
Installation of Open|SpeedShop above for an example of a proper '/etc/xinetd.d/dpclSD' file.
- o Runtime libraries
In order to implement dynamic instrumentation across nodes in a
cluster, the user's target application is significantly 'mutated' or
changed. First, it must have a couple runtime libraries brought into
its address space. One such library exists for DPCL and one for
Dyninst. Within Dyninst, there are places where the software simply
generates an 'abort' because the circumstances are intractable and the
software knows it has a problem. In other cases, Dyninst will
unwittingly keep going with, for example, an improper index which
eventually results in a memory access error. These 'other errors' in
the Dyninst runtime library will manifest themselves in a couple
different ways. First, the target application will simply die with
(e.g.) a SIGSTOP or SIGSEGV signal. In addition, the DPCL layer of
software will detect that the target application has died and the DPCL
CommunicationsDaemon('dpcld') will shut down, resulting in (e.g.) an
'ASC_communication_failure' being delivered to the DPCL client portion of
Open|SpeedShop. For the most part, the DPCL runtime library itself does
not 'abort' when it recognizes an intractable situation, but rather
'messages' to the SuperDaemon('dpclSD'), the non-runtime portion of the
CommunicationsDaemon('dpcld'), and the DPCL client portion of
Open|SpeedShop that things need to be shut down. In this more-or-less
orderly 'messaging' case, the user could see the CommunicationsDaemon
shut down and the target application simply continuing on its way after
off-loading all its 'mutated' portions. However, if a portion of DPCL
unwittingly continues with (e.g.) an improper index, then an abort
signal (e.g.) will be delivered, possibly 'hanging' the target
application or causing some other undesirable result. The combinatorics
are complex and, accordingly, the resulting error recovery can take
many forms, almost all of which can be troublesome.
- o DPCL CommunicationsDaemon('dpcld')
This daemon has a couple components, one for Dyninst and one for DPCL
itself. Similar to its 'runtime library' portion, the Dyninst
non-runtime software within the DPCL CommunicationsDaemon can
gracefully 'abort' if it knows it has a problem, or issue (e.g.) a
SIGSEGV signal unwittingly. These Dyninst errors will likely cause the
daemon to cease running with appropriate 'messages' to the
Open|SpeedShop client and the DPCL SuperDaemon(dpclSD). The user's
target application might continue running after off-loading its
'mutated' portions, or it might simply 'hang' due to some unknown
combination of things, but the target application itself will not die.
- o Client-side
The 'client' portion of Open|SpeedShop has many pieces as well,
including the 'client' portion of DPCL, the Open|SpeedShop 'framework',
the Open|SpeedShop 'gui', and so forth. When this part of the software
gets into an error situation with (e.g.) an improper index, then the
user will likely see an error message displayed by Open|SpeedShop
itself. In addition, the DPCL CommunicationsDaemon will detect that its
client has problems and begin to shut down its side of things. Again,
the combinatorics are complex and, accordingly, so is the error
recovery on the client side of things.