Steven Hale | 18 Aug 22:29 2014

Fwd: VirtualDomain broken for live migration.

Dear all,

I'm in the process of setting up my first four-node cluster.  I'm
using CentOS7 with PCS/Pacemaker/Corosync.

I've got everything set up with shared storage using GlusterFS.  The
cluster is running and I'm in the process of adding resources.  My
intention for the cluster is to use it to host virtual machines.  I
want the cluster to be able to live-migrate VMs between hosts.  I'm
not interested in monitoring resources inside the guests; just knowing
whether the guest is running is fine.

I've got all the virtualization working with libvirt using KVM.  Live
migration works fine.  Now I'm trying to make it work through the
cluster.

I am using the VirtualDomain resource agent (ocf:heartbeat:VirtualDomain).
I can add and remove VMs.  It works.  But the live migration feature is broken.
Looking at the source, the fault is on this line:

  virsh ${VIRSH_OPTIONS} migrate --live $DOMAIN_NAME ${remoteuri} ${migrateuri}

I guess virsh must have changed at some point, because the "--live"
flag does not seem to exist any more.  I can make it work with the
following change:

  virsh ${VIRSH_OPTIONS} migrate --p2p --tunnelled $DOMAIN_NAME ${remoteuri} ${migrateuri}

This works, at least for my case where I'm tunnelling the migration
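The guess that virsh no longer accepts "--live" could be checked against the help text of the installed virsh before patching the agent. The following is only a minimal sketch: the helper name virsh_help_has_flag is hypothetical, and it assumes the flags of interest are listed in the output of "virsh migrate --help".

```shell
# virsh_help_has_flag FLAG
# Reads help text on stdin and succeeds if FLAG appears as a whole option
# (so "--live" does not match a longer option that merely starts with it).
# Hypothetical helper, not part of the VirtualDomain agent.
virsh_help_has_flag() {
    flag="$1"
    grep -q -E -e "(^|[^[:alnum:]-])${flag}([^[:alnum:]-]|\$)"
}

# Possible usage on a host with libvirt installed:
#   virsh migrate --help | virsh_help_has_flag --live && echo "--live is listed"
```

Reading the help text from stdin keeps the check independent of any particular libvirt version and makes it easy to try against saved output.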

Wendt Christian | 1 Aug 18:40 2014

Oracle OCF Script throws "SP2-0640: Not connected"

Hello *,

I did a lot of research, but I have not been able to work out why our Oracle resource failed on Wednesday.

Starting Oracle fails with this message:

/usr/lib/ocf/resource.d/heartbeat/oracle start
INFO: orcSNBGW instance state is not OPEN (dbstat output: SP2-0640: Not connected)
ERROR: oracle instance orcSNBGW not started:

Showdbstat throws:

/usr/lib/ocf/resource.d/heartbeat/oracle showdbstat
Full output:
SP2-0640: Not connected
Stripped output:
<OPEN>
So the first method of showdbstat monitoring the DB fails, but the second one succeeds.

It is no longer possible to start Oracle within the Pacemaker cluster. Every time we start it, it
fails. I've attached the bash output from starting Oracle with the OCF script.

The database and OS are fine. Nothing has changed in the last few days.

Do you have any ideas?

Thank you in advance.

Best regards


Robert.Koeppl | 2 Aug 10:00 2014

AUTO: Robert Koeppl is out of office (returning 18.08.2014)

I will return on 18.08.2014.

Note: This is an automatic reply to your message "Re:
[Linux-HA] Virtual address for slave", sent on 02.08.2014 08:58:06.

This is the only notification you will receive while this person is
away.

_______________________________________________
Linux-HA mailing list
Linux-HA <at> lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

N, Ravikiran | 1 Aug 10:47 2014

Application level HA using heartbeat.. ??

Hi,

I was trying to understand how Heartbeat actually works. I found that Heartbeat provides HA at the system
level rather than the application level: the virtual IP is moved to the backup system only when the
system goes down, not when the monitored application goes down. I verified this. I had configured
httpd in my haresources file and manually stopped it using "service httpd stop". Although this stops
the httpd service, Heartbeat does not notice.
But when I stop heartbeat or bring down my machine, the backup comes up and starts httpd. Is this the
expected behaviour? If so, please let me know why I should provide a script in resource.d/ to start, stop
and report the status of the application. Also, how can I achieve application-level HA using heartbeat?
Thanks in advance. :)

Regards,
Ravikiran N

jarek | 1 Aug 09:39 2014

Virtual address for slave

Hello!

	I'd like to have two virtual addresses: vip-master and vip-slave.
vip-master should be bound to the master node, vip-slave should be bound to
the slave node.
	How can I do this?

Best regards
Jarek

Ulrich Windl | 30 Jul 16:05 2014

Re: Attention: Problematic Update for SLES11 (kernel)

Hello!

An update: The problem is known at SUSE and there is a temporary fix (PTF.876616) for this issue.
Unfortunately the kernel with the defect is newer than the PTF, i.e. the PTF is not included in the latest kernel.

Regards,
Ulrich

>>> Ulrich Windl wrote on 30.07.2014 at 08:47 in message <53D894E7.ECA : 161 : 60728>:
> Hi!
> 
> I wanted to notify you that one of the recent updates for SLES11 SP3 may 
> cause trouble when using cLVM: On an updated node, cLVM won't start any more, 
> and the kernel will flood your syslog with messages like:
> 
> Jul 30 08:17:09 h05 kernel: [  563.700629] dlm: Trying to connect to 172.20.16.1
> Jul 30 08:17:09 h05 kernel: [  563.700836] dlm: Can't start SCTP association - retrying
> Jul 30 08:17:09 h05 kernel: [  563.700843] dlm: Retry sending 48 bytes to node id 17831084
> Jul 30 08:17:09 h05 kernel: [  563.700852] dlm: Retrying SCTP association init for node 17831084
> 
> The issue will be investigated, but be prepared for trouble if you update 
> just one node in your cluster.
> 
> Regards,
> Ulrich

Ulrich Windl | 30 Jul 08:47 2014

Attention: Problematic Update for SLES11

Hi!

I wanted to notify you that one of the recent updates for SLES11 SP3 may cause trouble when using cLVM: On an
updated node, cLVM won't start any more, and the kernel will flood your syslog with messages like:

Jul 30 08:17:09 h05 kernel: [  563.700629] dlm: Trying to connect to 172.20.16.1
Jul 30 08:17:09 h05 kernel: [  563.700836] dlm: Can't start SCTP association - retrying
Jul 30 08:17:09 h05 kernel: [  563.700843] dlm: Retry sending 48 bytes to node id 17831084
Jul 30 08:17:09 h05 kernel: [  563.700852] dlm: Retrying SCTP association init for node 17831084

The issue will be investigated, but be prepared for trouble if you update just one node in your cluster.
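A quick way to see whether a node is affected might be to count these messages in its syslog. This is only a sketch: the function name count_dlm_sctp_failures is made up for illustration, and the log path in the usage comment is an assumption about the local syslog setup.

```shell
# count_dlm_sctp_failures
# Reads syslog text on stdin and prints the number of lines that report
# a failed DLM SCTP association (the message quoted above).
count_dlm_sctp_failures() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c "dlm: Can't start SCTP association" || true
}

# Possible usage (log path is an assumption):
#   count_dlm_sctp_failures < /var/log/messages
```

A count that keeps growing after the update would match the flooding described here.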

Regards,
Ulrich

Ulrich Windl | 28 Jul 12:15 2014

crm: "INFO: object cli-ban-... cannot be represented in the CLI notation"

Hi!

In SLES11 SP3 I believe the following message is new in crm (I haven't noticed it in the past):
INFO: object cli-ban-grp_c11_db-on-h07 cannot be represented in the CLI notation

The constraint seems to be created by crm migrate itself! That makes it interesting!

crm(live)configure# show cli-ban-grp_c11_db-on-h07
xml <rsc_location id="cli-ban-grp_c11_db-on-h07" rsc="grp_c11_db" role="Started"> \
  <rule id="cli-ban-grp_c11_db-on-h07-rule" score="-INFINITY"> \
    <expression id="cli-ban-grp_c11_db-on-h07-expr" attribute="#uname" operation="eq" value="h07"/> \
    <date_expression id="cli-ban-grp_c11_db-on-h07-lifetime" operation="lt" end="2014-07-11 18:29:12Z"/> \
  </rule> \
</rsc_location>

crmsh-1.2.6-0.33.1

Regards,
Ulrich

Dang Zhiqiang | 28 Jul 11:18 2014

How to modify crm configure by command line

Hi,
I want to modify the op start timeout value from the command line, but I searched the internet and found nothing.
I tried the crm_resource command, but with it I can only modify params and meta.

root@host2:~# crm configure show test-ip
primitive test-ip ocf:openindiana:IPaddr \
        params ip="192.168.1.253" nic="igb0" cidr_netmask="24" \
        op start interval="0s" timeout="60s" on-fail="restart" \
        op monitor interval="10s" timeout="60s" on-fail="restart" \
        op stop interval="0s" timeout="60s" on-fail="stop" \
        meta target-role="Stopped"
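One possible approach, given only as a sketch, is to rewrite the text printed by "crm configure show" and load it back. The helper name set_op_timeout is hypothetical, it assumes each op sits on its own line as in the output above, and the "crm configure load update" step in the usage comment should be checked against the installed crmsh version before relying on it.

```shell
# set_op_timeout OP NEW_TIMEOUT
# Reads a primitive definition (as printed by "crm configure show") on stdin
# and writes it to stdout with the timeout of the named operation replaced.
# Only the line starting with "op OP " is touched; other ops are left alone.
set_op_timeout() {
    op="$1"
    new="$2"
    sed -e "/^[[:space:]]*op ${op} /s/timeout=\"[^\"]*\"/timeout=\"${new}\"/"
}

# Possible usage (file name is an assumption):
#   crm configure show test-ip | set_op_timeout start 120s > /tmp/test-ip.crm
#   crm configure load update /tmp/test-ip.crm
```

Working on the text form keeps the change reviewable: the rewritten definition can be inspected or diffed before anything is loaded into the cluster.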

Dejan Muhamedagic | 21 Jul 16:49 2014

glue 1.0.12 released

Hello,

The current glue repository has been tagged as 1.0.12.

It's been a while since the release candidate 1.0.12-rc1. There
were a few minor fixes and additions in the meantime, mostly for
hb_report.

Please upgrade at the earliest possible opportunity.

You can get the 1.0.12 tarball here:

	http://hg.linux-ha.org/glue/archive/glue-1.0.12.tar.bz2

The ChangeLog is available here:

http://hg.linux-ha.org/glue/file/glue-1.0.12/ChangeLog

A set of rpms is also available at the openSUSE Build Service:*)

http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/

The packages at the openSUSE Build Service will not work with
pacemaker versions earlier than v1.1.8 because the LRM bits are
not compiled.

Many thanks to all contributors. Without you this release would
not have been possible.

Enjoy!

Nirmal Fernando | 21 Jul 13:48 2014

Errors when starting heartbeat on CentOS

Hi All,

I am trying to configure heartbeat on two AWS EC2 instances (CentOS) and
am currently facing the following error [1].

The installed kernel and heartbeat packages are:

[root@node01 stratos]# rpm -qa | egrep 'heartbeat|kernel-2.6'
kernel-2.6.32-431.5.1.el6.x86_64
kernel-2.6.32-279.1.1.el6.x86_64
heartbeat-3.0.4-2.el6.x86_64
kernel-2.6.32-431.11.2.el6.x86_64
heartbeat-libs-3.0.4-2.el6.x86_64
kernel-2.6.32-431.17.1.el6.x86_64
kernel-2.6.32-431.20.3.el6.x86_64

Any help is appreciated.

[1]
Jul 21 10:22:25 node01 heartbeat: [3083]: info: **************************
Jul 21 10:22:25 node01 heartbeat: [3083]: info: Configuration validated. Starting heartbeat 3.0.4
Jul 21 10:22:25 node01 heartbeat: [3084]: info: heartbeat: version 3.0.4
Jul 21 10:22:25 node01 heartbeat: [3084]: info: Heartbeat generation: 1405925294
Jul 21 10:22:25 node01 heartbeat: [3084]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Jul 21 10:22:25 node01 heartbeat: [3084]: info: glib: ucast: bound send socket to device: eth0
