more install notes under RH6.2

To: failsafe@xxxxxxxxxxx
Subject: more install notes under RH6.2
From: "Eric Z. Ayers" <eric@xxxxxxxxxxx>
Date: Fri, 28 Jul 2000 13:29:05 -0400 (EDT)
In-reply-to: <14721.38852.621813.613105@gargle.gargle.HOWL>
References: <14721.38852.621813.613105@gargle.gargle.HOWL>
Reply-to: Eric.Ayers@xxxxxxxxxxx
Sender: owner-failsafe@xxxxxxxxxxx
Another permissions problem I ran into when trying to test Network
Connectivity: 

$ cd /usr/lib/sysadm/privbin;
$ ls -l ClusterDiags
-rwxrwxr-x    1 root     root           79 Jan  2 00:54 ClusterDiags

(the group write bit needs to be turned off on this executable)
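
For anyone hitting the same thing, a minimal sketch of the fix. It assumes
the group write bit shown in the listing above is what the privilege check
objects to; fix_priv_perms is a made-up helper name, not part of FailSafe.

```shell
# Hypothetical helper (not shipped with FailSafe): clear the group and
# other write bits on a privileged command so that only the owner can
# modify it.
fix_priv_perms() {
    chmod go-w "$1"
}

# Example, run as root (path taken from the listing above):
#   fix_priv_perms /usr/lib/sysadm/privbin/ClusterDiags
```

After this, ls -l should show -rwxr-xr-x for a file that started out as
-rwxrwxr-x.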

Once I fixed that, it tells me that:

Cluster Diagnostics have not been implemented in this release.

------------------------------------------------------------------------------

From the GUI, I added my nodes to my cluster definition, and then
chose:

Fix or Upgrade Cluster Nodes 
  --> Start Failsafe HA Services
   --> Start

It only started the first node, not the second.  (No, I didn't select the
node from the optional drop down list.)  I then went back and
started it specifically on the second node (this time, I did select the
node from the drop down list), and then the GUI updated the
status of the second node from 'Inactive' to 'OK'.  No big deal, I
just wondered why it didn't come up on both nodes the first time.

I attached a portion of /var/log/messages below. There are some ugly
looking messages about: 

 Stale CDB handle.
  CI_IPCERR_NOSERVER, cms ipc: ipcclnt_connect() failed, file 
/var/cluster/ha/comm/cmsd-ipc_dru1a .Check if the cmsd daemon is running. 

from the time I attempted to start the cluster.
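
A quick way to pull just these entries out of syslog, as a rough sketch. It
assumes the letter after the <<CI> tag is a severity code (E for error, N for
notice), which is only a guess from the excerpt below; ci_errors is a
made-up helper name.

```shell
# Hypothetical helper: print only the cluster log entries marked "E".
# Assumption: "<<CI> E ..." marks an error and "<<CI> N ..." a notice.
ci_errors() {
    grep -F '<<CI> E ' "$1"
}

# Example:
#   ci_errors /var/log/messages
```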

------------------------------------------------------------------------------

OK, now I'm at the point where I want to define my resources.  

From FailSafe Manager:
  --> Resources & Resource Types
   --> Define a New Resource

I get a dialog:   Create a new Resource Definition
   
 
It looks like I don't have any default resource types defined.  All I've
got in the 'Resource Type' drop down list  is 'template'.  The admin
guide says there are some pre-defined resource types that look handy.
Do you have any definitions  already made up for linux for:

  IP Address
  filesystem

I'll also need a RAID resource.  I guess I need to crack open the
Programming guide.  I'll have to do it eventually for my application
anyway. 

-----------------------------------------------------------------------------

Just for yucks, I took a look at /var/cluster/ha/log.  It looks like
one of the files is growing quite a bit!

[root@dru1a log]# ls -l
total 23669
-rw-r--r--    1 root     root     23892291 Jan  2 22:43 cad_log
-rw-r--r--    1 root     root         5922 Jan  2 22:36 cli_dru1a
-rw-------    1 root     root       127777 Jan  2 22:36 cmond_log
-rw-r--r--    1 root     root         6173 Jan  2 22:24 cmsd_dru1a
-rw-r--r--    1 root     root         4782 Jan  2 22:15 crsd_dru1a
-rw-r--r--    1 root     root         8435 Jan  2 22:25 failsafe_dru1a
-rw-------    1 root     root        50639 Jan  2 22:16 fs2d_log
-rw-r--r--    1 root     root        36799 Jan  2 22:43 gcd_dru1a
-rw-r--r--    1 root     root         1153 Jan  2 22:16 srmd_dru1a

I take it there is some kind of debugging turned on in the 'cad'
daemon?!?  I only configured a 100MB var partition, so it looks like
I'll only be able to run my cluster for a few days :-) 
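
In the meantime, a small sketch for keeping an eye on runaway logs. The
helper name big_logs and the 10 MB threshold in the example are arbitrary
choices, not FailSafe settings.

```shell
# Hypothetical helper: list regular files under a directory that are
# larger than a given size in kilobytes.
#   $1 = directory to scan
#   $2 = threshold in KB
big_logs() {
    find "$1" -type f -size +"$2"k
}

# Example: anything over 10 MB in the FailSafe log directory
#   big_logs /var/cluster/ha/log 10240
```

Run from cron, this would at least give some warning before /var fills up.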


Regards,
-Eric.



---
(/var/log/messages excerpt)


Jan  2 22:10:10 dru1a PAM_pwdb[1593]: (su) session closed for user root
Jan  2 22:10:14 dru1a runpriv[1598]: Running privilege ClusterDiags for user root.
Jan  2 22:10:59 dru1a runpriv[1604]: Running privilege ClusterDiags for user root.
Jan  2 22:11:05 dru1a runpriv[1605]: Running privilege ClusterDiags for user root.
Jan  2 22:13:08 dru1a runpriv[1620]: Running privilege ClusterDiags for user root.
Jan  2 22:14:40 dru1a runpriv[1631]: Running privilege haParamsModify for user root.
Jan  2 22:14:40 dru1a cli[1631]: <<CI> E config 0> CI_ERR_INVAL, Internal error: internal argument is invalid : Internal error no nodes in cluster
Jan  2 22:14:41 dru1a cli[1631]: <<CI> E config 0> CI_ERR_INVAL, CLI private command: failed (Internal error no nodes in cluster)
Jan  2 22:14:53 dru1a runpriv[1637]: Running privilege clusterAddMachine for user root.
Jan  2 22:14:56 dru1a cmond[537]: <cmond_cdb.c:477> Notification can not be processed, local machine and cluster name is not known.
Jan  2 22:14:56 dru1a cmond[537]: <cmond_cdb.c:558> Local machine belongs to cluster dru.
Jan  2 22:14:56 dru1a cmond[537]: <cmond_cdb.c:579> Local machine name is dru1a.
Jan  2 22:15:02 dru1a cmond[537]: <cmond_cdb.c:910> Stale CDB handle.
Jan  2 22:15:02 dru1a crsd[549]: <<CI> N log 0> Additional crsd logs can be found in /var/cluster/ha/log/crsd_dru1a.
Jan  2 22:15:21 dru1a runpriv[1692]: Running privilege haActivate for user root.
Jan  2 22:15:21 dru1a cmond[537]: <cmond_proc.c:142> New process ha_cmsd pid 1702
Jan  2 22:15:21 dru1a cmond[537]: <cmond_proc.c:142> New process ha_gcd pid 1703
Jan  2 22:15:21 dru1a cmond[537]: <cmond_proc.c:142> New process ha_srmd pid 1704
Jan  2 22:15:21 dru1a cmond[537]: <cmond_proc.c:142> New process ha_fsd pid 1706
Jan  2 22:15:22 dru1a ha_cmsd[1702]: <<CI> N log 0> Additional ha_cmsd logs can be found in /var/cluster/ha/log/cmsd_dru1a.
Jan  2 22:15:22 dru1a ha_gcd[1703]: <<CI> N log 0> Additional ha_gcd logs can be found in /var/cluster/ha/log/gcd_dru1a.
Jan  2 22:15:22 dru1a ha_cmsd[1702]: <<CI> N cms 0> ha_cmsd restarted.
Jan  2 22:15:22 dru1a ha_fsd[1706]: <<CI> N log 0> Additional ha_fsd logs can be found in /var/cluster/ha/log/failsafe_dru1a.
Jan  2 22:15:22 dru1a ha_fsd[1706]: <<CI> N fsd 0> /usr/cluster/bin/ha_fsd is running as foreground process
Jan  2 22:15:23 dru1a ha_srmd[1704]: <<CI> N log 0> Additional ha_srmd logs can be found in /var/cluster/ha/log/srmd_dru1a.
Jan  2 22:15:23 dru1a ha_cmsd[1702]: <<CI> N log 0> Additional ha_cmsd logs can be found in /var/cluster/ha/log/cmsd_dru1a.
Jan  2 22:15:23 dru1a ha_gcd[1703]: <<CI> N log 0> Additional ha_gcd logs can be found in /var/cluster/ha/log/gcd_dru1a.
Jan  2 22:15:23 dru1a ha_gcd[1703]: <<CI> N gcd 0> My node name = dru1a.
Jan  2 22:15:23 dru1a ha_gcd[1703]: <<CI> E cms 0> CI_IPCERR_NOSERVER, cms ipc: ipcclnt_connect() failed, file /var/cluster/ha/comm/cmsd-ipc_dru1a .Check if the cmsd daemon is running.
Jan  2 22:15:24 dru1a ha_gcd[1703]: <<CI> E cms 0> CI_IPCERR_NOSERVER, cms ipc: ipcclnt_connect() failed, file /var/cluster/ha/comm/cmsd-ipc_dru1a .Check if the cmsd daemon is running.
Jan  2 22:15:26 dru1a ha_cmsd[1702]: <<CI> N cms 0> Confirmed Membership: sqn 1 G_sqn = 1, ack false node dru1a [1] : UP  incarnation 1   age 1:0 node dru1b [2] : DOWN*  incarnation 0   age 0:0
Jan  2 22:15:27 dru1a ha_gcd[1703]: <<CI> N gcd 0> My nodeid = 1 [0x1].
Jan  2 22:15:46 dru1a ha_srmd[1733]: <<CI> N srm 2> SRM ready to accept clients
Jan  2 22:16:30 dru1a ha_fsd[1706]: <<CI> N fsd 0> FailSafe initialization complete -- Move to state: UP
Jan  2 22:24:07 dru1a runpriv[1746]: Running privilege haActivate for user root.
Jan  2 22:24:08 dru1a ha_cmsd[1717]: <<CI> N log 1> Additional ha_cmsd logs can be found in /var/cluster/ha/log/cmsd_dru1a.
Jan  2 22:24:38 dru1a ha_cmsd[1702]: <<CI> N cms 0> Node dru1b id 2 added/enabled.
Jan  2 22:24:41 dru1a ha_cmsd[1702]: <<CI> N cms 0> Confirmed Membership: sqn 2 G_sqn = 2, ack false node dru1a [1] : UP  incarnation 1   age 2:0 node dru1b [2] : UP  incarnation 1   age 1:0
