Dejan Muhamedagic | 15 May 12:02 2009

Re: Drbd disk don't run

Hi,

On Fri, May 15, 2009 at 06:47:37AM -0300, Rafael Emerick wrote:
> Hi, Dejan
> 
> thanks for the attention
> following is my cib xml conf
> I am a newbie with pacemaker, any hint is very welcome! :D

The CIB as seen by crm:

primitive drbd11 ocf:heartbeat:drbd \
	params drbd_resource="drbd11" \
	op monitor interval="59s" role="Master" timeout="30s" \
	op monitor interval="60s" role="Slave" timeout="30s" \
	meta target-role="started" is-managed="true"
ms ms-drbd11 drbd11 \
	meta clone-max="2" notify="true" globally-unique="false" target-role="stopped"

The target-role attribute is defined for both the primitive and
the container (ms). You should remove the former:

crm configure edit drbd11

and remove all meta attributes (the whole "meta" part). And don't
forget to remove the backslash in the line above it.
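
After the edit, the primitive should look like the definition above
minus the meta line (note the backslash dropped from the last op line):

primitive drbd11 ocf:heartbeat:drbd \
	params drbd_resource="drbd11" \
	op monitor interval="59s" role="Master" timeout="30s" \
	op monitor interval="60s" role="Slave" timeout="30s"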

Thanks,

Dejan

Rafael Emerick | 15 May 13:54 2009

Re: Drbd disk don't run

Hi, Dejan

The first problem is solved, but now I have another.
When I try to start the ms-drbd11 resource I don't get any error, but in crm_mon I get this:

============
Last updated: Fri May 15 08:44:11 2009
Current DC: node1 (57e0232d-5b78-4a1a-976e-e5335ba8266d) - partition with quorum
Version: 1.0.3-b133b3f19797c00f9189f4b66b513963f9d25db9
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ node1 node2 ]

Clone Set: drbdinit
        Started: [ node1 node2 ]

Failed actions:
    drbd11:0_start_0 (node=node1, call=9, rc=1, status=complete): unknown error
    drbd11_start_0 (node=node1, call=17, rc=1, status=complete): unknown error
    drbd11:1_start_0 (node=node2, call=9, rc=1, status=complete): unknown error
    drbd11_start_0 (node=node2, call=16, rc=1, status=complete): unknown error

So, in the messages log file, I get:


May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_resources: No STONITH resources have been defined
May 15 08:25:03 node1 pengine: [4749]: info: determine_online_status: Node node1 is online
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11:0_start_0 on node1 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11:0_start_0 on node1: unknown error
May 15 08:25:03 node1 pengine: [4749]: WARN: process_orphan_resource: Nothing known about resource drbd11 running on node1
May 15 08:25:03 node1 pengine: [4749]: info: log_data_element: create_fake_resource: Orphan resource <primitive id="drbd11" type="drbd" class="ocf" provider="heartbeat" />
May 15 08:25:03 node1 pengine: [4749]: info: process_orphan_resource: Making sure orphan drbd11 is stopped
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11_start_0 on node1 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11_start_0 on node1: unknown error
May 15 08:25:03 node1 pengine: [4749]: info: determine_online_status: Node node2 is online
May 15 08:25:03 node1 pengine: [4749]: info: find_clone: Internally renamed drbdi:0 on node2 to drbdi:1
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11:1_start_0 on node2 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11:1_start_0 on node2: unknown error
May 15 08:25:03 node1 pengine: [4749]: info: unpack_rsc_op: drbd11_start_0 on node2 returned 1 (unknown error) instead of the expected value: 0 (ok)
May 15 08:25:03 node1 pengine: [4749]: WARN: unpack_rsc_op: Processing failed op drbd11_start_0 on node2: unknown error
May 15 08:25:03 node1 pengine: [4749]: notice: clone_print: Clone Set: drbdinit
May 15 08:25:03 node1 pengine: [4749]: notice: print_list:     Started: [ node1 node2 ]
May 15 08:25:03 node1 pengine: [4749]: notice: clone_print: Master/Slave Set: ms-drbd11
May 15 08:25:03 node1 pengine: [4749]: notice: print_list:     Stopped: [ drbd11:0 drbd11:1 ]
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has failed 1000000 times on node1
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing ms-drbd11 away from node1 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has failed 1000000 times on node1
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing drbd11 away from node1 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has failed 1000000 times on node2
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing ms-drbd11 away from node2 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: info: get_failcount: drbd11 has failed 1000000 times on node2
May 15 08:25:03 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing drbd11 away from node2 after 1000000 failures (max=1000000)
May 15 08:25:03 node1 pengine: [4749]: WARN: native_color: Resource drbd11:0 cannot run anywhere
May 15 08:25:03 node1 pengine: [4749]: WARN: native_color: Resource drbd11:1 cannot run anywhere
May 15 08:25:03 node1 pengine: [4749]: info: master_color: ms-drbd11: Promoted 0 instances of a possible 1 to master
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:0      (Started node1)
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:1      (Started node2)
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:0     (Stopped)
May 15 08:25:03 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:1     (Stopped)


I had this problem with Heartbeat v2, and now I'm using Pacemaker with the same error.
My idea is that the CRM manages the DRBD, OCFS2 and Xen VM resources to keep them running...


For the DRBD resource to start, must STONITH be configured?

Thank you!

On Fri, May 15, 2009 at 7:02 AM, Dejan Muhamedagic <dejanmm@...> wrote:
[...]

> thank you very much
> for the help
>
>
> > On Fri, May 15, 2009 at 4:46 AM, Dejan Muhamedagic <dejanmm@...> wrote:
>
> > Hi,
> >
> > On Thu, May 14, 2009 at 05:13:50PM -0300, Rafael Emerick wrote:
> > > Hi, Dejan
> > >
> > > There are not two sets of meta-attributes.
> > >
> > > I removed the ms-drbd11, added it again, and the error is the same:
> > > Error performing operation: Required data for this CIB API call not found
> >
> > Can you please post your CIB, as XML.
> >
> > Thanks,
> >
> > Dejan
> >
> > >
> > > Thanks,
> > >
> > >
> > > On Thu, May 14, 2009 at 3:43 PM, Dejan Muhamedagic <dejanmm@...> wrote:
> > >
> > > > Hi,
> > > >
> > > > On Thu, May 14, 2009 at 03:18:15PM -0300, Rafael Emerick wrote:
> > > > > Hi,
> > > > >
> > > > > I'm trying to make a cluster with xen-ha using drbd and ocfs2...
> > > > >
> > > > > I want the crm to manage all resources (xen machines, drbd disks and
> > > > > ocfs2 filesystem).
> > > > >
> > > > > First, I created a clone lsb resource to init drbd with the gui interface.
> > > > > Now, I'm following this manual
> > > > > http://clusterlabs.org/wiki/DRBD_HowTo_1.0 to
> > > > > create the drbd disk management and afterwards make the ocfs2 filesystem.
> > > > >
> > > > > So, when I run:
> > > > > # crm resource start ms-drbd11
> > > > > # Multiple attributes match name=target-role
> > > > > # Value: stopped        (id=ms-drbd11-meta_attributes-target-role)
> > > > > # Value: started        (id=drbd11-meta_attributes-target-role)
> > > > > # Error performing operation: Required data for this CIB API call not found
> > > >
> > > > As it says, there are multiple matches for the attribute. Don't
> > > > know how it came to be. Perhaps you can
> > > >
> > > > crm configure edit ms-drbd11
> > > >
> > > > and drop one of them. It could also be that there are two sets of
> > > > meta-attributes.
> > > >
> > > > If crm can't edit the resource (in that case please report it)
> > > > then you can try:
> > > >
> > > > crm configure edit xml ms-drbd11
> > > >
> > > > Thanks,
> > > >
> > > > Dejan
> > > >
> > > > > My messages:
> > > > > May 14 15:07:11 node1 pengine: [4749]: info: get_failcount: ms-drbd11 has failed 1000000 times on node2
> > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: common_apply_stickiness: Forcing ms-drbd11 away from node2 after 1000000 failures (max=1000000)
> > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: native_color: Resource drbd11:0 cannot run anywhere
> > > > > May 14 15:07:11 node1 pengine: [4749]: WARN: native_color: Resource drbd11:1 cannot run anywhere
> > > > > May 14 15:07:11 node1 pengine: [4749]: info: master_color: ms-drbd11: Promoted 0 instances of a possible 1 to master
> > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:0      (Started node1)
> > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbdi:1      (Started node2)
> > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:0     (Stopped)
> > > > > May 14 15:07:11 node1 pengine: [4749]: notice: LogActions: Leave resource drbd11:1     (Stopped)
> > > > >
> > > > >
> > > > > Thank you for any help!
Raoul Bhatia [IPAX] | 15 May 15:53 2009

minor showscores.sh patch

> # ./showscores.sh
> Resource            Score     Node            Stickiness #Fail    Fail-Stickiness 
> 0                                                                                 
> rm: cannot remove `/tmp/dkshowscorestmpfile3dk': No such file or directory

--> patch to use "rm -f" instead of "rm" attached.
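
The attachment itself is not reproduced in the archive; the change it
describes would be a one-liner along these lines (illustrative only,
the temp-file variable name is a guess):

-	rm $tmpfile
+	rm -f $tmpfile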

cheers,
raoul
-- 
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc.          email.          r.bhatia@...
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG         web.          http://www.ipax.at
Barawitzkagasse 10/2/2/11           email.            office@...
1190 Wien                           tel.               +43 1 3670030
FN 277995t HG Wien                  fax.            +43 1 3670030 15
____________________________________________________________________
Attachment (showscores_rm.patch): text/x-diff, 241 bytes
Nicholas Dronen | 15 May 16:47 2009

Some showscores.sh questions

Hi:

Is showscores.sh included in any particular RPM?  Or are users expected to download it from here?

    http://www.gossamer-threads.com/lists/linuxha/pacemaker/54640

I'm using CentOS 5.3 with pacemaker 1.0.2-11.

Whether it's in an RPM or not, could the author add a license header to it?  I know it's not a complicated program, but before my group can use it we need to know whether it's BSD, Apache, GPL, or some other license.  (It's a hassle, I know.)

Regards,

Nick


Dejan Muhamedagic | 15 May 17:01 2009

Re: Drbd disk don't run

Hi,

On Fri, May 15, 2009 at 08:54:31AM -0300, Rafael Emerick wrote:
> [...]
> My idea is that the CRM manages the DRBD, OCFS2 and Xen VM

Can ocfs2 run on top of drbd? In that case you need master/master
resource. What you have is master/slave.

> resources to keep them running...

It does, but this is a resource level problem. Funny that the
logs don't show much. You'll have to try by hand using drbdadm.
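
For example (a sketch; only the resource name is taken from the thread):

	drbdadm up drbd11
	cat /proc/drbd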

> For the DRBD resource to start, must STONITH be configured?

You must have stonith, in particular since it's shared storage.
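
A minimal external/ipmi pair might look like this (a sketch only; the
addresses and credentials are placeholders, and the parameter names
should be checked against the external/ipmi plugin documentation):

	primitive st-node1 stonith:external/ipmi \
		params hostname="node1" ipaddr="192.168.0.101" userid="admin" passwd="secret"
	primitive st-node2 stonith:external/ipmi \
		params hostname="node2" ipaddr="192.168.0.102" userid="admin" passwd="secret"
	location l-st-node1 st-node1 -inf: node1
	location l-st-node2 st-node2 -inf: node2

The location constraints keep each stonith resource off the node it is
supposed to shoot.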

Also, set

crm configure property no-quorum-policy=ignore

Thanks,

Dejan

Rafael Emerick | 15 May 18:58 2009

Re: Drbd disk don't run

With pacemaker, can't I set up a primary/primary state?
I'm trying to get one disk running now; then I want to put them in the primary/primary state.

With drbdadm I got the disk working very well. The drbd+ocfs2 stack already works, but now I want pacemaker to init the drbd and ocfs2/o2cb daemons, set the drbd disks to primary/primary, mount the ocfs2 partition and then start the virtual machine...

The drbd, ocfs2 and vm are OK; only the pacemaker part is missing for me to finish my graduation project... :( ...
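
The start-up ordering described above would, roughly, be expressed with
order constraints, e.g. (a sketch; the o2cb, filesystem and vm resource
ids are made up for illustration):

	order o-drbd-before-o2cb inf: ms-drbd11:promote clone-o2cb:start
	order o-o2cb-before-fs inf: clone-o2cb clone-fs
	order o-fs-before-vm inf: clone-fs vm-xen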




On Fri, May 15, 2009 at 12:01 PM, Dejan Muhamedagic <dejanmm@...> wrote:
[...]
Dejan Muhamedagic | 15 May 19:08 2009

Re: Drbd disk don't run

Hi,

On Fri, May 15, 2009 at 01:58:36PM -0300, Rafael Emerick wrote:
> With pacemaker, can't I set up a primary/primary state?

It all depends on the resource agent. The drbd OCF, I think,
works as primary/secondary. Perhaps somebody else can offer some
advice here, not really an expert on the drbd technology.
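
In case it helps: the generic way to ask Pacemaker for two masters is
the master-max meta attribute, e.g. (a sketch, untested; dual-primary
also needs allow-two-primaries in the drbd net section and a resource
agent that supports two masters):

	ms ms-drbd11 drbd11 \
		meta master-max="2" clone-max="2" notify="true" globally-unique="false"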

Thanks,

Dejan

Bob Haxo | 16 May 01:53 2009

trigger STONITH for testing purposes

Greetings,

What manual administrative actions can be used to trigger STONITH action? 

I have created a pair of STONITH resources (external/ipmi) and would like to test that these resources work as expected (which, if I understand the default correctly, is to reboot the node).
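
One way to exercise such an agent outside the cluster is the stonith(8)
test tool from heartbeat, e.g. (a sketch; the parameters are placeholders
and should match your resource definition):

	stonith -t external/ipmi -p "hostname=node2 ipaddr=192.168.0.102 userid=admin passwd=secret" -T reset node2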

Thanks,
Bob Haxo
SGI

SLES11 HAE
Bob Haxo | 16 May 02:22 2009

Re: trigger STONITH for testing purposes

Ok, never mind this question.  "ifdown interface" works nicely to trigger STONITH action.

Unfortunately (if I may ask a new question) ... I now have one server rebooting, then the other rebooting, and back to the first rebooting in what looks to be an endless loop of reboots.

Suggestions?

Cheers,
Bob Haxo
SGI

On Fri, 2009-05-15 at 16:53 -0700, Bob Haxo wrote:
> [...]
Mark Hamzy | 16 May 02:29 2009

New patch for System Health feature

Here is attempt #3:

(See attached file: pacemaker.mark.patch)

I chose not to use pe_working_set_t* in char2score and instead used a small structure. I noticed a couple of things about crm_int_helper().
1) It doesn't have a way to indicate an error. It returns -1, but aren't signed integers valid input?
2) It returns long long, but most of that is thrown away when converted to an int, uint32, or uint64.
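
To illustrate the first point (a sketch, not the actual pacemaker code):
with a -1 sentinel the caller cannot tell an error from a legitimate
parse of "-1", whereas strtoll's errno/endptr convention keeps the error
channel separate from the value:

	#include <errno.h>
	#include <stdio.h>
	#include <stdlib.h>

	/* Parse a base-10 integer; success is reported separately from the
	 * value, so a legitimate "-1" is not mistaken for an error. */
	static int parse_ll(const char *text, long long *result)
	{
	    char *end = NULL;
	    errno = 0;
	    long long v = strtoll(text, &end, 10);
	    if (end == text || *end != '\0' || errno == ERANGE)
	        return -1;  /* error signalled out of band */
	    *result = v;
	    return 0;
	}

	int main(void)
	{
	    long long v;
	    if (parse_ll("-1", &v) == 0)
	        printf("parsed %lld without ambiguity\n", v);
	    return 0;
	}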

Questions/comments?

Mark

Common Information Model/Web-Based Enterprise Management at http://www.openpegasus.org/
Take a look at the Linux Omni Printer Driver Framework at http://omniprint.sourceforge.net/

Attachment (pacemaker.mark.patch): application/octet-stream, 18 KiB