Ulrich Windl | 12 Sep 08:06 2014

Re: Antw: Re: Postgresql RA fails starting master node

[...]
    If I use:  *ocf_log err "$OCF_RESKEY_config"*   in pgsql
    Where do I have to check this print? Because I’m not seeing it in
corosync.log.
[...]

It depends on which log you configured. In my configuration (and probably yours
also) these messages should go to syslog. Maybe try ;-)
ocf_log err "HEY, LOOK here: $OCF_RESKEY_config"
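
To find such a message afterwards, a search along these lines may help. This is only a sketch: find_ra_message is a hypothetical helper name, and the default log paths are assumptions (Debian usually writes syslog to /var/log/syslog, SLES to /var/log/messages).

```sh
# Hypothetical helper: print matching log lines, prefixed with the file name.
# Usage: find_ra_message MARKER [FILE...]
# The default paths are assumptions; adjust them to your syslog setup.
find_ra_message() {
    marker=$1
    shift
    # Fall back to common syslog locations when no files are given.
    [ $# -gt 0 ] || set -- /var/log/syslog /var/log/messages
    for f in "$@"; do
        if [ -r "$f" ]; then
            # -F: fixed string; /dev/null forces the file name prefix.
            grep -sF -- "$marker" "$f" /dev/null || true
        fi
    done
    return 0
}
```

For example, find_ra_message "HEY, LOOK here" searches the default locations, and extra arguments such as the corosync log path can be appended to search there as well.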

_______________________________________________
Linux-HA mailing list
Linux-HA <at> lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
Federico Castro | 11 Sep 22:45 2014

Re: Antw: Re: Postgresql RA fails starting master node

Using ocf-tester I get:

ocf-tester -n pgsql -o repuser="ha" -o pgdba="postgres" -o
restart_on_promote="true" -o pgctl="/usr/lib/postgresql/9.1/bin/pg_ctl" -o
psql="/usr/lib/postgresql/9.1/bin/psql" -o
pgdata="/var/lib/postgresql/9.1/main/" -o
config="/etc/postgresql/9.1/main/postgresql.conf" -o rep_mode="async" -o
node_list="pz01 pz02" -o restore_command="cp
/var/lib/postgresql/9.1/main/archive/%f %p" -o
primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5
keepalives_count=5" -o master_ip="10.10.10.80" -o stop_escalate="0"
/usr/lib/ocf/resource.d/heartbeat/pgsql
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/pgsql...
/usr/sbin/ocf-tester: 268: export: /var/lib/postgresql/9.1/main/archive/%f:
bad variable name

Is this why I get `invalid parameter`? Do you know what is
wrong there?
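
One plausible reading of the "bad variable name" message, assuming ocf-tester exports each -o option without quoting its value (not checked against the 3.9.2 source here): the shell splits the restore_command value on spaces, so export sees the archive path as a second variable name. The sketch below reproduces that effect in isolation; it is not ocf-tester's actual code.

```sh
# Sketch of the suspected failure mode; the names mirror the OCF_RESKEY_
# prefix, but this is not ocf-tester's actual code.
name=restore_command
value='cp /var/lib/postgresql/9.1/main/archive/%f %p'

# Unquoted: the shell splits the value on spaces, so export receives
# "/var/lib/.../%f" as a separate argument and rejects it as a name.
# The subshell keeps a fatal error (as in dash) from ending the script.
if ( export OCF_RESKEY_$name=$value ) 2>/dev/null; then
    unquoted_ok=yes
else
    unquoted_ok=no
fi

# Quoted: the whole string becomes the value of a single variable.
export "OCF_RESKEY_$name=$value"
printf '%s\n' "$OCF_RESKEY_restore_command"
```

If that is what happens, the error comes from the tester's option handling for values containing spaces, and says nothing yet about whether the RA itself accepts the parameter.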

And without restore_command:

ocf-tester -n msPgsql -o repuser="ha" -o pgdba="postgres" -o
restart_on_promote="true" -o pgctl="/usr/lib/postgresql/9.1/bin/pg_ctl" -o
psql="/usr/lib/postgresql/9.1/bin/psql" -o
pgdata="/var/lib/postgresql/9.1/main/" -o
config="/etc/postgresql/9.1/main/postgresql.conf" -o rep_mode="async" -o
node_list="pz01 pz02" -o primary_conninfo_opt="keepalives_idle=60
keepalives_interval=5 keepalives_count=5" -o master_ip="10.10.10.80" -o
stop_escalate="0" /usr/lib/ocf/resource.d/heartbeat/pgsql
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/pgsql...

Federico Castro | 10 Sep 20:49 2014

Postgresql RA fails starting master node

Hi all,

I'm working on a two-node cluster with pacemaker and postgresql.
For some reason I don't really understand, the pgsql RA fails to start postgres
on the first node.
If I start postgresql manually on my two nodes, then replication works
correctly.

I would really appreciate any clue about what to check in my installation
or configuration.

Thanks in advance.

I'm using:
OS: Debian 7
RA: resource-agents    1:3.9.2-5+deb7u2
      but using pgsql RA from
https://raw.githubusercontent.com/ClusterLabs/resource-agents/a6f4ddf76cb4bbc1b3df4c9b6632a6351b63c19e/heartbeat/pgsql
Pacemaker: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
Postgresql: 9.1+134wheezy4

CRM actual state:

============
Last updated: Thu Aug 28 12:58:51 2014
Last change: Thu Aug 28 12:58:46 2014 via crmd on pz01
Stack: openais
Current DC: pz01 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes

Alan Robertson | 10 Sep 02:31 2014

Anyone use Neo4j? Is there interest in a resource agent for Neo4j?

Hi,

I use the Neo4j graph database in the Assimilation project, and the
Assimilation code uses OCF RAs (among others).

So I wrote a neo4j resource agent.  I'm currently publishing it as part
of the Assimilation project code - because it's convenient for me.

If there is interest from others who would use it, I'd be happy to
provide it individually, or maybe even as part of the resource-agents
package.

The reason for me not to do that is that the distros lag far behind
(usually years behind) current source.  So even if I published it there,
it would likely be years before I could stop publishing my own copy.

[I confess I haven't yet written the metadata for the agent - because I
don't need it.  If someone wants to use it, I'd be happy to take a patch
with metadata, or *gasp* write it myself].

As an aside:
The reason I use a graph database (like Neo4j) is this: we model data
centers (servers, applications, networks, IPs, MACs, switch connections,
dependencies, etc) -- and almost all interesting questions about data
centers are naturally graph questions.

    -- Alan Robertson
       alanr <at> assimilationsystems.com OR alanr <at> unix.sh
       http://assimilationsystems.com/

Ulrich Windl | 9 Sep 16:20 2014

FYI: Patched ocf:pacemaker:ping RA

Hi!

Here's the patch I made today to pacemaker's ping RA (current version from SLES11 SP3). Basically I wanted
the RA to use ping even if fping is found on the system. Anyway, here it is (edited, because it's one of 14
patches; all tabs were expanded to spaces by copying from PuTTY and pasting into Windows):
---
From 63f5d42d316f562a8c8ebc4bed6dff4859a9fc57 Mon Sep 17 00:00:00 2001
From: Ulrich Windl <Ulrich.Windl <at> RZ.Uni-Regensburg.DE>
Date: Tue, 9 Sep 2014 15:26:33 +0200
Subject: [PATCH 1/1] Changed ping from pacemaker (SLES11 SP3)

Change ping: Parameter "pidfile" is "unique" now.  Improve description of
"dampen" parameter.  Indicate the correct default for "multiplier" and
"attempts".  Add parameter "flavor" to select ping or fping.  Fix output of
ping_usage().  Use options also for fping.  Only use fping if ping was not
selected.
---
 ping        |   27 +-

diff --git a/ping b/ping
index b9a69b8..adb7682 100755
--- a/ping
+++ b/ping
@@ -40,7 +40,7 @@ meta_data() {
 <?xml version="1.0"?>
 <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
 <resource-agent name="ping">
-<version>1.0</version>
+<version>1.1</version>


Ulrich Windl | 9 Sep 16:03 2014

Q: ocf-tester

Hi!

I modified the ping RA to meet my needs, and then I used ocf-tester to check it with the desired settings. I'm
wondering about the output: shouldn't ocf-tester query the metadata _before_ trying to use the methods,
i.e. not use methods the RA doesn't announce?
---
Beginning tests for /usr/lib/ocf/resource.d/twuc/ping...
Testing permissions with uid nobody
Testing: meta-data
Testing: meta-data
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="ping">
<version>1.1</version>
[...]
<actions>
<action name="start"   timeout="60" />
<action name="stop"    timeout="20" />
<action name="reload"  timeout="100" />
<action name="monitor" depth="0"  timeout="60" interval="10"/>
<action name="meta-data"  timeout="5" />
<action name="validate-all"  timeout="30" />
</actions>
</resource-agent>
Testing: validate-all
Checking current state
Testing: monitor
Testing: monitor
Testing: start
Testing: monitor
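
The check asked for above could look roughly like this. It is a sketch only: announced_actions and ra_announces are hypothetical helper names, not part of ocf-tester, and the sed pattern assumes one <action .../> element per line, as in the meta-data shown.

```sh
# Hypothetical helper: list the action names announced in OCF meta-data
# read from stdin, one per line.
announced_actions() {
    sed -n 's/.*<action[[:space:]][^>]*name="\([^"]*\)".*/\1/p'
}

# Hypothetical helper: succeed only if ACTION is announced.
# Usage: ra_announces ACTION < metadata.xml
ra_announces() {
    announced_actions | grep -qx -- "$1"
}
```

A tester could then run something like `/usr/lib/ocf/resource.d/twuc/ping meta-data | ra_announces promote` and skip the promote test when that fails.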

Fabio M. Di Nitto | 8 Sep 12:30 2014

[RFC] Organizing HA Summit 2015

All,

it's been almost 6 years since we had a face to face meeting for all
developers and vendors involved in Linux HA.

I'd like to try and organize a new event and piggy-back with DevConf in
Brno [1].

DevConf will start Friday the 6th of Feb 2015 in Red Hat Brno offices.

My suggestion would be to have a dedicated two-day HA summit on the 4th and
the 5th of February.

Besides getting to know each other and the other social aspects of such
events, the goal of this meeting is to tune the direction of the various HA
projects and explore common areas of improvement.

I am also very open to the idea of extending it to 3 days, one dedicated
to customers/users and two dedicated to developers, by starting on the 3rd.

Thoughts?

Fabio

PS Please hit reply all or include me in CC just to make sure I'll see
an answer :)

[1] http://devconf.cz/

Ulrich Windl | 8 Sep 09:19 2014

Q: dampening explained?

Hi!

I remember having asked this before, but I'm still missing a good explanation:

What are the precise semantics of "dampening" (attrd_updater -d)?

The manual page just says:
       -d, --delay=value
              The time to wait (dampening) in seconds further changes occur

Who is waiting? What changes?

Please explain!
(pacemaker-1.1.10-0.15.25 of SLES11 SP3)

Regards,
Ulrich

Ulrich Windl | 4 Sep 14:32 2014

Q: ping (ocf:pacemaker:ping) from specific address?

Hi!

I'm using ocf:pacemaker:ping to ping some specific address. The requirement is that a specific source
address (that is on a local interface) is being used to ping a destination. So if the resource moves between
nodes, the same source address shall be used.

The address to use is a secondary address on interface bond0, like bond0:xyz. I can specify "-I bond0" and "-I
bond0:xyz", but the latter seems to be ignored, as if I had just said "-I bond0". Ping is from
iputils-ss021109-292.28.1 (SLES11 SP3).
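
One thing that may be worth trying for preference 1: the iputils ping manual describes the -I argument as either an interface name or an address, so passing the alias's IP address instead of its label might select the source. The sketch below only resolves a label to its address; addr_for_label is a hypothetical helper, the parsing assumes `ip -o -4 addr show` style output, and whether this particular ping build honours an address for -I is untested here.

```sh
# Hypothetical helper: read `ip -o -4 addr show` style lines on stdin
# and print the IPv4 address carrying the given label (e.g. bond0:xyz).
addr_for_label() {
    awk -v label="$1" '
        {
            addr = ""; found = 0
            for (i = 1; i <= NF; i++) {
                f = $i
                sub(/\\$/, "", f)       # newer ip appends "\" to the label
                if (f == "inet") addr = $(i + 1)
                if (f == label) found = 1
            }
            if (found && addr != "") {
                sub(/\/.*/, "", addr)   # strip the /prefix length
                print addr
                exit
            }
        }'
}
```

Possible use, with DEST standing in for the real destination: `src=$(ip -o -4 addr show dev bond0 | addr_for_label bond0:xyz) && ping -c 1 -I "$src" DEST`.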

I had this requirement for HP-UX years ago, and the solution was to install a special local ping that allowed
to select the source address.

So my preferences are:
1) Get it done with the software I have
2) Get an update for the software I have to get it done
3) Install or develop software to get it done
4) Despair ;-)

Regards,
Ulrich

Brian Campbell | 3 Sep 01:29 2014

Two node Pacemaker with one Corosync only quorum node

I'm wondering if any problems would occur if you ran a
cluster with only two nodes running Pacemaker, but added a third Corosync-only
node to provide quorum.

I tried this setup, and it appears to work fine after some brief testing; I
configured Corosync and votequorum appropriately on all three nodes, but
only ever started Pacemaker on two of them. After enabling
no-quorum-policy=stop, if I disconnected one of the nodes it would stop
itself and the other would take over like I expect, rather than both nodes
trying to promote themselves as occurs when there are only two nodes and
no-quorum-policy=ignore (for the purposes of debugging and development, I
don't have stonith enabled in order to make it easier to monitor what's
going on at each node, without my connection dropping due to rebooting the
machine).
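
The behaviour described above matches simple majority arithmetic. As a sketch, assuming one vote per node and no special votequorum options (quorum_votes and has_quorum are illustrative names only):

```sh
# Votes needed for quorum with N one-vote nodes: floor(N/2) + 1.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Succeed if a partition of ALIVE out of TOTAL nodes keeps quorum.
# Usage: has_quorum ALIVE TOTAL
has_quorum() {
    [ "$1" -ge "$(quorum_votes "$2")" ]
}
```

With three votes, an isolated node holds 1 of the 2 votes required and stops, while the remaining pair (one Pacemaker node plus the Corosync-only node) holds 2 and carries on. With only two votes, neither side of a split reaches the required 2.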

I'm now wondering if there will be any problems I haven't anticipated with
this setup, or anything I should look out for.

Of course, other options would involve having the third node simply running
Pacemaker but permanently in standby, or making it an asymmetric cluster
and only allowing any resources to run on the first two nodes. But I'm
curious if it's possible to go the simplest possible route and just have
corosync running on a third quorum node; or possibly even more.

Our setup has a couple of master nodes with large amounts of RAM so that
all of the metadata can fit into RAM, and then a number of cheap storage
nodes to store the actual bulk data. Because we have the cheap storage
nodes, we have a number of machines we can run as quorum-only nodes, but
don't want to ever accidentally select them as a master or slave node.
