Ranjan Gajare | 24 Nov 10:48 2014

Monitor a Pacemaker Cluster with ocf:pacemaker:ClusterMon and/or external-agent

I want to configure Event Notification with Monitoring Resources using an
External Agent. Specifically, I want to be notified of node failover from an
HA perspective.
I followed the links below:

1)https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/s1-eventnotification-HAAR.html

2)http://floriancrouzat.net/2013/01/monitor-a-pacemaker-cluster-with-ocfpacemakerclustermon-andor-external-agent/

3)http://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacemaker_Explained/s-notification-external.html

Configured ClusterMon resource as

# pcs resource create ClusterMon-External ClusterMon --clone user=root \
    update=30 extra_options="-E /var/lib/pgsql/9.3/data/test.sh -e 172.26.126.100"

vim test.sh

#!/bin/bash
if [[ ${CRM_notify_rc} != 0 && ${CRM_notify_task} == "monitor" ]] || \
   [[ ${CRM_notify_task} != "monitor" ]]; then
    # This trap is compliant with the PACEMAKER MIB:
    # https://github.com/ClusterLabs/pacemaker/blob/master/extra/PCMK-MIB.txt
    /usr/bin/snmptrap -v 2c -c public ${CRM_notify_recipient} "" \
        PACEMAKER-MIB::pacemakerNotification \
        PACEMAKER-MIB::pacemakerNotificationNode s "${CRM_notify_node}" \
        PACEMAKER-MIB::pacemakerNotificationResource s "${CRM_notify_rsc}" \
        PACEMAKER-MIB::pacemakerNotificationOperation s "${CRM_notify_task}" \
        PACEMAKER-MIB::pacemakerNotificationDescription s "${CRM_notify_desc}"
fi
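The hook's branch logic can be exercised outside the cluster by exporting the CRM_notify_* variables by hand. A minimal sketch; the snmptrap call is replaced with a message so it runs without an SNMP stack, and all values are made-up test data:

```shell
#!/bin/bash
# Simulate the environment Pacemaker's ClusterMon external agent receives.
export CRM_notify_node="node1"
export CRM_notify_rsc="ClusterMon-External"
export CRM_notify_task="start"        # anything but a clean "monitor" should fire
export CRM_notify_rc="0"
export CRM_notify_recipient="172.26.126.100"

msg=""
if [[ ${CRM_notify_rc} != 0 && ${CRM_notify_task} == "monitor" ]] || \
   [[ ${CRM_notify_task} != "monitor" ]]; then
    # In the real hook this is where the snmptrap call goes.
    msg="would trap: ${CRM_notify_task} of ${CRM_notify_rsc} on ${CRM_notify_node} -> ${CRM_notify_recipient}"
fi
echo "$msg"
```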

ranjan | 19 Nov 11:54 2014

RHEL Server 6.6 HA Configuration

I was trying to install Corosync and cman using

yum install -y pacemaker cman pcs ccs resource-agents

This works fine on CentOS 6.3. I tried the same on Red Hat Enterprise Linux
Server 6.6 and ran into issues. It gives an error like:

Loaded plugins: product-id, refresh-packagekit, rhnplugin, security,
subscription-manager
There was an error communicating with RHN.
RHN Satellite or RHN Classic support will be disabled.

Error Message:
        Please run rhn_register as root on this client
Error Class Code: 9
Error Class Info: Invalid System Credentials.
Explanation:
     An error has occurred while processing your request. If this problem
     persists please enter a bug report at bugzilla.redhat.com.
     If you choose to submit the bug report, please be sure to include
     details of what you were trying to do when this error occurred and
     details on how to reproduce this problem.

Setting up Install Process
No package pacemaker available.
No package cman available.
No package pcs available.
No package ccs available.
Nothing to do

centos.repo is as follows...
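The "No package ... available" lines follow directly from the RHN failure above: on RHEL these packages come from the High Availability add-on channel, which has to be entitled and enabled before yum can see them. A hedged sketch of the usual steps; the repo id is an assumption that depends on the subscription type and should be verified with `subscription-manager repos --list`:

```shell
# Register the system with Red Hat (prompts for credentials).
subscription-manager register

# Enable the High Availability add-on repository.
# The repo id below is an assumption; confirm it with:
#   subscription-manager repos --list
subscription-manager repos --enable rhel-ha-for-rhel-6-server-rpms

yum install -y pacemaker cman pcs ccs resource-agents
```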

Vladislav Bogdanov | 17 Nov 15:20 2014

crmsh and 'no such resource agent' error

Hi Kristoffer, all,

It seems that with the introduction of 'resource-discovery',
'symmetric-cluster=true' becomes less strict about the set of resource
agents matching across nodes.

Would it be possible to add a config option to disable error messages
like:

got no meta-data, does this RA exist?
no such resource agent

Best,
Vladislav
_______________________________________________
Linux-HA mailing list
Linux-HA <at> lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Vladislav Bogdanov | 17 Nov 08:05 2014

crm configure show to a pipe

Hi Kristoffer, all,

running 'crm configure show > file' appends non-printable characters at the
end of the file (at least if op_defaults is used):

...
property cib-bootstrap-options: \
    dc-version=1.1.12-c191bf3 \
    cluster-infrastructure=corosync \
    cluster-recheck-interval=10m \
    stonith-enabled=false \
    no-quorum-policy=freeze \
    last-lrm-refresh=1415955398 \
    maintenance-mode=false \
    stop-all-resources=false \
    stop-orphan-resources=true \
    have-watchdog=false
rsc_defaults rsc_options: \
    allow-migrate=false \
    failure-timeout=10m \
    migration-threshold=INFINITY \
    multiple-active=stop_start \
    priority=0
op_defaults op-options: \
    record-pending=true.[?1034h
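Those trailing bytes look like a terminal escape sequence (ESC [ ? 1034 h). As a workaround sketch, hypothetical and untested against crmsh itself, the control sequences can be stripped from the captured output before writing the file:

```shell
#!/bin/bash
# Strip ANSI/CSI escape sequences (such as the trailing ESC[?1034h) from text.
esc=$'\x1b'
strip_escapes() {
    sed "s/${esc}\[[0-9;?]*[a-zA-Z]//g"
}

# The tail of the reported output, with the stray sequence attached:
raw="record-pending=true${esc}[?1034h"
clean=$(printf '%s' "$raw" | strip_escapes)
echo "$clean"
```

Whether the sequence is emitted at all may also depend on the TERM setting under which crm is run.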

Best,
Vladislav

Randy S | 16 Nov 19:17 2014

time_longclock illumos

Hi all,

new user here.
We have been testing an older version of the Heartbeat / Pacemaker combination compiled for illumos (an
OpenSolaris follow-up).
Versions:
Heartbeat-3-0-STABLE-3.0.5
Pacemaker-1-0-Pacemaker-1.0.11

It all works OK while testing (several months now), but I have noticed that every so often (and sometimes
quite frequently) the following console message appears:

crmd: [ID 996084 daemon.crit] [12637]: CRIT: time_longclock: old value was 298671305, new value is 298671304, diff is 1, callcount 141814

From what I have been able to find, this type of occurrence should have been fixed in heartbeat versions
after 2.1.4; at that time it could make a cluster start behaving erratically.
We have two test implementations of a cluster, one in VMware and one on standard hardware, all just for
testing. We have made sure that time synchronisation is done via NTP with the internet. The hardware
implementation doesn't show this message as often as the VMware one, but it still appears (sometimes
about three times per 24 hours).

We haven't seen any strange behaviour in the cluster yet, but my questions are as follows:

Should we worry about this 'time_longclock' crit error even though it should have been fixed in versions
after HA 3?

Is there something (simple) that can be done to prevent this type of error, or should we expect normal
cluster behaviour since NTP is used?


Andras POTOCZKY | 15 Nov 11:06 2014

application based HA

Hi List,

I would like to configure an application-based HA solution. I mean: if my
important application stops working (but the active server is still
alive), the HA configuration should pass the shared IP to the standby node.

I've found that Pacemaker is the tool for this, but I couldn't find any
document or example covering application-level HA.

Can somebody point me to how I can figure this out?
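For what it's worth, a minimal Pacemaker sketch of this pattern, assuming pcs, a systemd-managed service, and placeholder names (myapp, 192.168.1.100): the application is monitored as its own resource, and the shared IP is colocated with it, so the IP follows the application when it fails.

```shell
# Virtual IP managed by the cluster (placeholder address/netmask).
pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.1.100 \
    cidr_netmask=24 op monitor interval=10s

# The application itself (placeholder unit name).
pcs resource create myapp systemd:myapp op monitor interval=30s

# Keep the IP on whatever node runs the application, and start the app first.
pcs constraint colocation add vip with myapp INFINITY
pcs constraint order myapp then vip
```

When the monitor operation on myapp fails on the active node, Pacemaker recovers or moves the application, and the colocation constraint drags the IP along with it.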

thanks,
Andras

Vladislav Bogdanov | 12 Nov 18:26 2014

crmsh and 'resource-discovery'

Hi Kristoffer, Dejan.

Do you have plans to add support in crmsh for the 'resource-discovery'
location constraint option (added to Pacemaker by David in pull requests
#589 and #605), as well as for the 'pacemaker-next' schema (the latter
seems trivial)?

Best,
Vladislav

Juan Perez | 11 Nov 22:54 2014

Heartbeat in Amazon VMs does not create virtual IP address

Hi, I installed Heartbeat on CentOS 6.5 on 2 Amazon EC2 machines. This is the version:
[root <at> ip-10-0-2-68 ha.d]# rpm -qa | grep heartbeat
heartbeat-libs-3.0.4-2.el6.x86_64
heartbeat-3.0.4-2.el6.x86_64
heartbeat-devel-3.0.4-2.el6.x86_64

the floating IP is:

[root <at> ip-10-0-2-68 ha.d]# cat haresources
ip-10-0-2-68 10.0.2.70

but it is not created on either machine, no matter where I run the takeover or standby commands.
What am I missing? Is this even possible? These are my settings in ha.cf (both nodes' files, one after the other):
logfacility local0
ucast eth0 10.0.2.69
auto_failback on
node ip-10-0-2-68 ip-10-0-2-69
ping 10.0.2.1
use_logd yes
logfacility local0
ucast eth0 10.0.2.68
auto_failback on
node ip-10-0-2-68 ip-10-0-2-69
ping 10.0.2.1
use_logd yes

This is the output of the route command:
[root <at> ip-10-0-2-68 ha.d]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.2.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
0.0.0.0         10.0.2.1        0.0.0.0         UG    0
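One relevant detail on EC2: the network fabric generally does not deliver traffic to an address it does not know about, so bringing up an alias IP on the host alone is not enough; the secondary private IP also has to be assigned to the instance's network interface through the EC2 API. A hedged sketch using the AWS CLI, where the ENI id is a placeholder:

```shell
# Assign the floating address 10.0.2.70 to this node's interface via the
# EC2 API, then let Heartbeat bring it up locally as usual.
# eni-0abc123 is a placeholder; look the real id up with
#   aws ec2 describe-instances
aws ec2 assign-private-ip-addresses \
    --network-interface-id eni-0abc123 \
    --private-ip-addresses 10.0.2.70 \
    --allow-reassignment
```

In a failover setup, this API call would need to run on the node taking over the address (for example from a resource agent or takeover hook).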

Sihan Goi | 15 Oct 05:22 2014

Fwd: Linux HA setup for CentOS 6.5

Hi,

Is there a tutorial showing how to get a basic Linux-HA setup with
replicated storage (via DRBD) working on CentOS 6.5? I want to have mySQL
as the HA resource, with the database replicated across the nodes. I've
scoured the web for one, but I seem to get stuck somewhere in each guide.

 To elaborate, I have 2 CentOS 6.5 nodes configured with distinct hostnames
and static IPs. They are connected to a wireless AP, and can ping each
other.

 I tried following this guide -
http://clusterlabs.org/quickstart-redhat.html
 However, cman will not start when NetworkManager is running, and my nodes
cannot connect to the wireless AP without NetworkManager running. Am I
missing something or is that the stupidest dependency ever? How is a
cluster supposed to work when the nodes aren't connected to one another?

 I also tried following the "clusters from scratch" guide but that seems to
rely on systemctl calls which aren't available on CentOS 6.5.

Any help?

-- 
- Goi Sihan
goister <at> gmail.com


zhanghu@aggstor.com | 1 Oct 06:57 2014

one of three heartbeat links always dead


Hi, all,
Recently I have encountered a problem in our production environment. I googled for a long while but failed
to find an answer. Please help me.
Here is our configuration: we have two nodes (d02 and d03, CentOS 6.4, heartbeat-3.0.4-1.el6.x86_64)
set up as heartbeat peers. Each has six GigE interfaces, bonded in pairs, giving three links: bond0,
bond1 and bond2. bond0 and bond1 use bonding mode 6 (alb), and bond2 is configured as bonding mode 0. All
interfaces are connected through a switch. Below are the heartbeat configurations:
d02: ha.cf

logfacility local7
keepalive 2
deadtime 30
initdead 120
node d02 d03
ucast bond0 10.1.205.3
ucast bond1 172.1.1.3
ucast bond2 192.168.128.3
auto_failback off
respawn root /usr/lib64/heartbeat/dopd
apiauth dopd uid=root gid=root

fayçal noushi | 26 Sep 19:43 2014

Fixed Issue using VIPArip agent

Hello,

We found an issue when using the VIPArip resource agent: it keeps
restarting (it stops at the first monitor call, then comes back up again).

Here's my environment:
CentOS 6.4
CMAN 3.0.12.1-59
corosync 2.3.3-1.1
Pacemaker 1.1.12
VIPArip RA version : 1.0

The issue was that the configuration file was created with permissions
like this (0400):
-r-------. 1 root root 369 Sep 26 17:42
/var/run/resource-agents/VIPArip-ripd.conf

These permissions were too restrictive for the resource agent. Here are the
generated logs:
Sep 26 16:39:17 [2080] node1.cluster       lrmd:   notice:
operation_finished:     res_VIPArip_ClusterRipIP_start_0:9037:stderr [
vty_read_config: failed to open configuration file
/var/run/resource-agents/VIPArip-ripd.conf: Permission denied ]
Sep 26 16:39:17 [2080] node1.cluster       lrmd:   notice:
operation_finished:     res_VIPArip_ClusterRipIP_start_0:9037:stderr [
can't open configuration file [/var/run/resource-agents/VIPArip-ripd.conf] ]

In order to override the default file-creation permissions, we added this
line at line 105 (inside new_config_file):

umask 022
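The effect of the umask change can be checked on its own, outside the resource agent, with a small sketch (GNU stat assumed):

```shell
#!/bin/bash
# Show how umask controls the mode of newly created files.
tmpdir=$(mktemp -d)

umask 077                          # restrictive umask: new files get mode 600
touch "$tmpdir/restrictive"
mode_restrictive=$(stat -c %a "$tmpdir/restrictive")

umask 022                          # the value added inside new_config_file
touch "$tmpdir/open"
mode_open=$(stat -c %a "$tmpdir/open")

echo "restrictive=$mode_restrictive open=$mode_open"
rm -rf "$tmpdir"
```

With umask 022 in effect, the generated ripd.conf becomes world-readable (644 for a file created from mode 666), which is what the agent needs.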

