Eric Robinson | 23 Jul 08:53 2016

After Startup, Can't Connect to CIB, Pacemaker Eventually Dies

I've created 15 or so Corosync+Pacemaker clusters and never had this kind of issue.

These servers are running the following software:

RHEL 6.3
pacemaker-libs-1.1.12-8.el6_7.2.x86_64
pacemaker-1.1.12-8.el6_7.2.x86_64
corosync-1.4.7-5.el6.x86_64
pacemaker-cluster-libs-1.1.12-8.el6_7.2.x86_64
pacemaker-cli-1.1.12-8.el6_7.2.x86_64
corosynclib-1.4.7-5.el6.x86_64
crmsh-2.0-1.el6.x86_64

Corosync starts fine and both nodes join the cluster.
Pacemaker appears to start fine, but 'crm configure show' produces the error...

[root <at> ha14b ~]# crm configure show
ERROR: running cibadmin -Ql: Could not establish cib_rw connection: Connection refused (111)
Signon to CIB failed: Transport endpoint is not connected
Init failed, could not perform requested operations
ERROR: configure: Missing requirements

After a short while Pacemaker dies...

[root <at> ha14b ~]# service pacemaker status
pacemakerd dead but pid file exists

The Pacemaker log shows the following...

[root <at> ha14a log]# cat pacemaker.log

Arnaud Legrand | 22 Mar 11:15 2016

[BUGS][PACEMAKER] Bugs found in script: /usr/lib/ocf/resource.d/heartbeat/mysql

Hello, I have found two problems in the script
/usr/lib/ocf/resource.d/heartbeat/mysql shipped with Pacemaker.

Bug #1

The function below contains an error:

get_read_only() {
    # Check if read-only is set
    local read_only_state
    read_only_state=`$MYSQL $MYSQL_OPTIONS_REPL \
        -e "SHOW VARIABLES" | grep read_only | awk '{print $2}'`
    if [ "$read_only_state" = "ON" ]; then
        return 0
    else
        return 1
    fi
}

*It doesn't catch the intended variable read_only, but the static variable
innodb_read_only, which is always OFF. The op monitor fails permanently.*

*Quick fix:*

get_read_only() {
    # Check if read-only is set
    local read_only_state
    read_only_state=`$MYSQL $MYSQL_OPTIONS_REPL \
        -e "SHOW VARIABLES" | grep -v '_read_only' | grep -v 'read_only_' | \
        grep 'read_only' | awk '{print $2}'`
    if [ "$read_only_state" = "ON" ]; then
        return 0
    else
        return 1
    fi
}
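A tighter variant (a sketch only, assuming the same $MYSQL and $MYSQL_OPTIONS_REPL variables the agent already defines) is to ask the server for just the one variable and match its name exactly in awk, so sibling variables such as innodb_read_only can never slip through:

```shell
get_read_only() {
    # Query only the variable we care about, then match its name exactly,
    # so names like innodb_read_only cannot be picked up by accident.
    local read_only_state
    read_only_state=$($MYSQL $MYSQL_OPTIONS_REPL \
        -e "SHOW VARIABLES LIKE 'read_only'" | awk '$1 == "read_only" {print $2}')
    [ "$read_only_state" = "ON" ]
}
```

Since SHOW VARIABLES LIKE 'read_only' (no wildcard) returns at most one row, the exact awk match is mostly belt-and-braces against future *_read_only variables.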

Roberto Munoz Gomez | 30 Mar 10:22 2016

Corosync does not send traffic

Hello,

Due to a switch change in one of the datacenters, I now see odd
behaviour in the cluster.

I am using cman with corosync and pacemaker. The versions are:

pacemaker-1.1.10-14.el6_5.3.x86_64
corosync-1.4.1-15.el6_4.1.x86_64
cman-3.0.12.1-49.el6.x86_64

The problem is, when I run /etc/init.d/pacemaker start and cman_tool
launches corosync, I don't see any UDP traffic, so the cluster is "broken".

But if I run the same command manually ("corosync -f"), I do see UDP
traffic and the totem is correctly exchanged between the nodes.

It all began with the switch change; I ran tcpdump on the hosts and I do
not see any traffic.

I have tried both multicast and unicast configurations and a different
network, all with the same behaviour.
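One way to narrow this down is to watch the totem traffic directly on each node (a diagnostic sketch, not from the original thread; it assumes the default corosync mcastport 5405 and an interface named eth0 — substitute your ring's bind interface):

```shell
# Watch corosync totem traffic on the cluster interface.
tcpdump -ni eth0 udp port 5405

# A switch change often breaks multicast via IGMP snooping; also check
# whether a local firewall rule is dropping the packets.
iptables -L INPUT -nv | grep -i udp
```

If tcpdump on the sending node shows outgoing packets but the peer sees none, the switch (IGMP snooping/querier settings) is the likelier culprit than the hosts.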

What am I missing?

Best Regards

-- 
*Roberto Muñoz Gómez*


velmurugan murugesan | 14 Dec 14:15 2015

CIB not supported: validator 'pacemaker-2.0', release '3.0.9'

Hi,

I am new to HA.

I am facing the following error when running crm configure:

ERROR: CIB not supported: validator 'pacemaker-2.0', release '3.0.9'
ERROR: You may try the upgrade command

Please help me resolve this issue.
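As the second error line suggests, the usual remedy is to upgrade the CIB's validation schema so it matches what this crmsh/Pacemaker release expects. A minimal sketch (the exact subcommand spelling can differ between crmsh versions, and "force" rewrites the live CIB, so save a copy first):

```shell
# Save the current CIB before touching the schema.
cibadmin --query > /tmp/cib-backup.xml

# Upgrade the CIB validator to the latest schema this build supports.
crm configure upgrade force
```

If the upgrade succeeds, "crm configure show" should work again; if not, the backup lets you restore the old CIB with "cibadmin --replace --xml-file /tmp/cib-backup.xml".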

Thanks,
Velmurugan
_______________________________________________
Linux-HA mailing list is closing down.
Please subscribe to users <at> clusterlabs.org instead.
http://clusterlabs.org/mailman/listinfo/users
_______________________________________________
Linux-HA <at> lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha

Lorenz Vanthillo | 10 Nov 15:55 2015

HA masters using pacemaker and virtual IP

I want to install OpenShift V3 using the advanced method. So I will create an environment with 2 masters and 3 nodes.
The masters need to be HA, so I want to run Pacemaker between them.
All my hosts are Amazon EC2 instances and they're using CentOS as OS.

The hostnames of my masters are:
master1.example.com
master2.example.com

Their public IPs:
52.19.128.xx
52.18.90.xx

Their private IPs:
10.0.0.131
10.0.0.132

So they are in different (public) subnets. Is it possible to configure a virtual IP for them, and how?


Dejan Bucar | 28 Oct 16:21 2015

Download of Cluster Glue package?

Hi,

The download link for cluster glue has stopped working,
http://hg.linux-ha.org/glue/archive/glue-1.0.12.tar.bz2. Is it down permanently?

br,
/dejan


Cristiano Coltro | 28 Oct 09:32 2015

Oracle 12 and SLES HAE (SLES 11 SP3)

Hi,
most of our SLES 11 SP3 systems with HAE are migrating their Oracle DB.
The migration will be from Oracle 11 to Oracle 12.

They have verified that the Oracle cluster resource agent currently supports
Oracle 10.2 and 11.2 (checked with the command *crm ra info ocf:heartbeat:SAPDatabase*),
so it seems they are out of support.
I would like to know which cluster/OS/agent version supports Oracle 12.
AFAIK the agents are typically included in an rpm:
# rpm -qf /usr/lib/ocf/resource.d/heartbeat/SAPDatabase
resource-agents-3.9.5-0.34.57
and there are no updates for that in the channel.

Any idea on that?
Thanks,
Cristiano

________________

Cristiano Coltro
Premium Support Engineer

mail: cristiano.coltro <at> microfocus.com
phone +39 02 36634936
mobile +39 3351435589

____________________


Karthik | 30 Oct 11:02 2015

Pacemaker 10-15% CPU.

Hello,
  We are using Pacemaker to manage the services that run on a node, as part
of a service management framework, and to manage the nodes running the
services as a cluster. One service will run as 1+1 and the other services
will run as N+1.

  During our testing, we see that the Pacemaker processes are taking about
10-15% of the CPU. We would like to know whether this is normal and whether
the CPU utilization could be minimised.

Sample output of the most CPU-intensive processes on the active manager:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
189      15766 30.4  0.0  94616 12300 ?        Ss   18:01  48:15 
/usr/libexec/pacemaker/cib
189      15770 28.9  0.0 118320 20276 ?        Ss   18:01  45:53 
/usr/libexec/pacemaker/pengine
root     15768  2.6  0.0  76196  3420 ?        Ss   18:01   4:12 
/usr/libexec/pacemaker/lrmd
root     15767 15.5  0.0  95380  5764 ?        Ss   18:01  24:33 
/usr/libexec/pacemaker/stonithd

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
189      15766 30.5  0.0  94616 12300 ?        Ss   18:01  49:58 
/usr/libexec/pacemaker/cib
189      15770 29.0  0.0 122484 20724 ?        Rs   18:01  47:29 
/usr/libexec/pacemaker/pengine
root     15768  2.6  0.0  76196  3420 ?        Ss   18:01   4:21 
/usr/libexec/pacemaker/lrmd
root     15767 15.5  0.0  95380  5764 ?        Ss   18:01  25:25 

J. Echter | 23 Sep 14:38 2015

Cluster for HA VMs serving our local network

Hi,

I used this guide
https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial_-_Archive to
set up my cluster for some services, and everything works pretty well.

I decided to use this cluster as an HA VM provider for my network.

I have a little, maybe silly, question.

The guide tells me to disable the default 'qemu' network, like this:

>
>       Disable the 'qemu' Bridge
>
> By default, libvirtd <https://alteeve.ca/w/Libvirtd> creates a bridge 
> called virbr0 designed to connect virtual machines to the first eth0 
> interface. Our system will not need this, so we will remove it now.
>
> If libvirtd has started, skip to the next step. If you haven't started 
> libvirtd yet, you can manually disable the bridge by blanking out the 
> config file.
>
> cat  /dev/null>/etc/libvirt/qemu/networks/default.xml
I skipped the step that creates the bridge device, as I did not need it
for my setup.

> vim  /etc/sysconfig/network-scripts/ifcfg-vbr2
> # Internet-Facing Network - Bridge
> DEVICE="vbr2"

Dustinta Cristian | 4 Sep 09:13 2015

Add IPv6 address on Heartbeat

Hello,
I am using Heartbeat 2.1.3 on Solaris 10. I have already configured the
Heartbeat service, but now I want to configure Heartbeat as dual-stack
(IPv4/IPv6). Since the IPv4 addresses are already configured, I was wondering
whether there is an easy way to add an IPv6 address without major impact
(without any reconfiguration, etc.). I found
/opt/heartbeat/etc/ha.d/resource.d/IPv6addr and I saw the usage of this
script: $0 <ip-address> $LEGAL_ACTIONS. So basically I should just run it
like:

/opt/heartbeat/etc/ha.d/resource.d/IPv6addr 2620:0:60:b008::87f7:a394 start

Or are there more steps needed to configure an IPv6 address?
Regards

Ben Collins | 23 Aug 19:55 2015

MySQL slaves don't come back up with slave config

I’ve configured 6 nodes as mysql master/slave using this config:

primitive p_mysql ocf:heartbeat:mysql \
        params socket="/var/run/mysqld/mysqld.sock" replication_user="slave" \
        replication_passwd="XXXXX" test_user="test_user" test_passwd="test_pass" \
        op start interval="0" timeout="120s" \
        op stop interval="0" timeout="120s" \
        op monitor timeout="30s" interval="30s" role="Master" OCF_CHECK_LEVEL="10" \
        op monitor timeout="30s" interval="60s" role="Slave" OCF_CHECK_LEVEL="10"
primitive p_mysql-ip ocf:heartbeat:IPaddr \
        params ip="10.10.10.191" \
        op monitor interval="1s" timeout="20s" \
        op start interval="0" timeout="20s" \
        op stop interval="0" timeout="20s" \
        meta is-managed="true" resource-stickiness="500"
ms cl_mysql p_mysql
colocation co_ip-on-mysql inf: p_mysql-ip cl_mysql:Master

On the initial setup everything looks good; the slaves all report proper status. However, if I
reboot one of the slaves, then even though crm status reports it as a slave, the MySQL server shows
that slave status is not configured or started on that node, and the log shows:

Aug 23 08:44:35 [1204] app5       lrmd:     info: log_execute: 	executing - rsc:p_mysql action:start call_id:99
mysql(p_mysql)[1562]:	2015/08/23_08:44:35 INFO: MySQL is not running
mysql(p_mysql)[1562]:	2015/08/23_08:44:35 INFO: Creating PID dir: /var/run/mysqld
mysql(p_mysql)[1562]:	2015/08/23_08:44:35 INFO: MySQL is not running
mysql(p_mysql)[1562]:	2015/08/23_08:44:37 INFO: MySQL is not running
mysql(p_mysql)[1562]:	2015/08/23_08:44:41 INFO: No MySQL master present - clearing replication state
mysql(p_mysql)[1562]:	2015/08/23_08:44:41 ERROR: check_slave invoked on an instance that is not a
replication slave.


Gmane