Rajagopal Swaminathan | 2 Jan 2009 06:49

Re: Re: Fencing test

Greetings,

On Wed, Dec 31, 2008 at 10:30 PM, Paras pradhan <pradhanparas <at> gmail.com> wrote:
>
> I pulled the heartbeat network cable from node1 and nothing happened. But
> when I plugged the cable back in, node1 restarted. What am I missing
> here?

The heartbeat network cable should be out for at least 20-30 seconds.

If the data and heartbeat cables are connected to the same
switch, you may need to pull out both.

Incidentally, if the switch is a managed switch, you will have to enable
multicast for the heartbeat network on it and assign a separate
VLAN to it. There have been cases in the recent past where some of the
switches

> Also, I don't see anything interesting in /var/log/messages on
> node1 after I disconnect the cable.

Have you checked node2?

HTH

With warm regards

Rajagopal


Chrissie Caulfield | 2 Jan 2009 09:34

Re: i rpmbuild the cman on linux as4 IBM power, it does not work.

victory.xu wrote:
> When I run "service cman start",
> this error appears in /var/log/messages:
> 
> 	kernel: ioctl32(cman_tool:5382): Unknown cmd fd(3) cmd(2000780b) arg(42000422) on socket:[17147]

At a very quick guess, that looks like the tools have been built as 32-bit
and the kernel is 64-bit. There is no 32/64 compatibility layer in cman
for RHEL4; they must be the same word size.
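
One way to check for the mismatch described above is to compare the ELF class of the userspace binary with the kernel's machine type. The sketch below is only an illustration and is not part of cman: the function names and the path /sbin/cman_tool are assumptions, and it relies on nothing more than the ELF identification bytes (byte 4, EI_CLASS, is 1 for 32-bit objects and 2 for 64-bit objects).

```python
import os
import platform

ELF_MAGIC = b"\x7fELF"


def elf_word_size(header: bytes) -> int:
    """Return 32 or 64 from the first bytes of an ELF file (EI_CLASS field)."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = ELFCLASS32, 2 = ELFCLASS64
    if ei_class == 1:
        return 32
    if ei_class == 2:
        return 64
    raise ValueError(f"unknown EI_CLASS value {ei_class}")


def binary_word_size(path: str) -> int:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return elf_word_size(f.read(5))


if __name__ == "__main__":
    path = "/sbin/cman_tool"  # assumed install location; adjust as needed
    print("kernel machine type:", platform.machine())
    if os.path.exists(path):
        print(path, "is a", binary_word_size(path), "bit binary")
```

If the binary reports 32 while the kernel machine type is a 64-bit one, the tools would need to be rebuilt for the kernel's word size.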

> 	the ccsd has been started
> 
> 	i dont know why
> 
>         victory.xu
>         july_snow <at> 163.com
>           2008-12-29
> 
> --
> Linux-cluster mailing list
> Linux-cluster <at> redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-- 

Chrissie

Paras pradhan | 2 Jan 2009 23:48

Re: Re: Fencing test

On Thu, Jan 1, 2009 at 11:49 PM, Rajagopal Swaminathan
<raju.rajsand <at> gmail.com> wrote:
> Greetings,

Thanks for following up with your replies. I really appreciate it.
>
> On Wed, Dec 31, 2008 at 10:30 PM, Paras pradhan <pradhanparas <at> gmail.com> wrote:
>>
>> I pulled the heartbeat network cable from node1 and nothing happened. But
>> when I plugged the cable back in, node1 restarted. What am I missing
>> here?
>
> The Heartbeat network cable should be out for at least 20-30 seconds.

Yes, I waited more than 20-30 seconds (around 2-3 minutes). It didn't
reboot. But as I said, when I plugged the cable back into the network
port, it rebooted.

>
> If the data and heartbeat cables are connected to the same
> switch, you may need to pull out both.

Each of my nodes has one network interface card, so the heartbeat and
data cable are one and the same, if I understand you correctly.

>
> Incidentally, if the switch is a managed switch, you will have to enable
> multicast for the heartbeat network on it and assign a separate
> VLAN to it. There have been cases in the recent past where some of the
> switches

James Garratt | 5 Jan 2009 07:10

clvm running with redundant gnbd servers

I'm setting up a GNBD cluster with clvmd on the clients for the purpose of running a Xen cluster. I've been playing with this for a few months now and I've almost got everything working. However, I still have one outstanding issue that, even after extensive searches of the documentation and Google, I can't find an answer to.

My setup:
2 gnbd servers (running rhel5)
5 gnbd clients (running centos5)
GNBD servers are connected to a SAN via redundant paths.
Servers export multiple GNBDs with different names but with matching UIDs for each device they export.
Clients import all GNBDs from each server.
multipath.conf has been configured on the clients to see the GNBDs
lvm.conf has been configured on the clients to filter everything except the local disks and /dev/mpath/*

My problem is that if I put the two GNBD servers in the same cluster as the GNBD clients, I get warnings because the servers can't see the volume groups being used by the clients. If I put the servers in a separate cluster, fencing cannot work properly in the event of a server crash, and multipath locks up until the server is running again. Is there a way to tell clvm to ignore some of the cluster nodes, or is there another solution to this problem?

Any advice or pointers to relevant documentation would be appreciated.

Regards,

James Garratt
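
For readers reproducing this setup: an lvm.conf filter of the kind described above works on a first-match basis. Each pattern is prefixed with "a" (accept) or "r" (reject), and a device path that matches no pattern is accepted. The Python sketch below only imitates that rule so a candidate filter can be sanity-checked. The patterns (including /dev/sda as the local disk) and the helper name lvm_filter_accepts are assumptions, not the poster's actual configuration.

```python
import re

# Hypothetical filter, written the way it might appear in lvm.conf:
#   filter = [ "a|^/dev/sda.*|", "a|^/dev/mpath/.*|", "r|.*|" ]
EXAMPLE_FILTER = ["a|^/dev/sda.*|", "a|^/dev/mpath/.*|", "r|.*|"]


def lvm_filter_accepts(patterns, device_path):
    """Imitate LVM's first-match filter rule for a single device path."""
    for entry in patterns:
        action, delimiter = entry[0], entry[1]
        # The regex sits between the first and last delimiter characters.
        regex = entry[2:entry.rindex(delimiter)]
        if re.search(regex, device_path):
            return action == "a"
    return True  # no pattern matched: the device is accepted


if __name__ == "__main__":
    for dev in ("/dev/sda1", "/dev/mpath/mpath0", "/dev/gnbd/disk1"):
        print(dev, "accepted" if lvm_filter_accepts(EXAMPLE_FILTER, dev) else "rejected")
```

With this hypothetical filter, the raw GNBD devices are rejected and only the multipath devices and the local disk are scanned.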
Rajagopal Swaminathan | 5 Jan 2009 15:23

Re: Re: Fencing test

Greetings,

On Sat, Jan 3, 2009 at 4:18 AM, Paras pradhan <pradhanparas <at> gmail.com> wrote:
>
> Here I am using 4 nodes.
>
> Node 1) That runs luci
> Node 2) This is my iSCSI shared storage where my virtual machine(s) reside
> Node 3) First node in my two node cluster
> Node 4) Second node in my two node cluster
>
> All of them are connected simply to an unmanaged 16 port switch.

Luci does not require a separate node to run. It can run on one of the
member nodes (node 3 or 4).

What does clustat say?

Can you post your cluster.conf here?

When you pull out the network cable *and* plug it back in on, say, node 3,
what messages (if any) appear in /var/log/messages on node 4?
(Sorry for the repetition, but the messages are necessary here to make any
sense of the situation.)

HTH

With warm regards

Rajagopal

Paras pradhan | 5 Jan 2009 19:11

Re: Re: Fencing test

hi,

On Mon, Jan 5, 2009 at 8:23 AM, Rajagopal Swaminathan
<raju.rajsand <at> gmail.com> wrote:
> Greetings,
>
> On Sat, Jan 3, 2009 at 4:18 AM, Paras pradhan <pradhanparas <at> gmail.com> wrote:
>>
>> Here I am using 4 nodes.
>>
>> Node 1) That runs luci
>> Node 2) This is my iSCSI shared storage where my virtual machine(s) reside
>> Node 3) First node in my two node cluster
>> Node 4) Second node in my two node cluster
>>
>> All of them are connected simply to an unmanaged 16 port switch.
>
> Luci does not require a separate node to run. It can run on one of the
> member nodes (node 3 or 4).

OK.

>
> What does clustat say?

Here is my clustat output:

-----------

[root@ha1lx ~]# clustat
Cluster Status for ipmicluster @ Mon Jan  5 12:00:10 2009
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 10.42.21.29                             1 Online, rgmanager
 10.42.21.27                             2 Online, Local, rgmanager

 Service Name          Owner (Last)           State
 ------- ----          ----- ------           -----
 vm:linux64            10.42.21.27            started
[root@ha1lx ~]#
------------------------

10.42.21.27 is node3 and 10.42.21.29 is node4

>
> Can you post your cluster.conf here?

Here is my cluster.conf

--
[root@ha1lx cluster]# more cluster.conf
<?xml version="1.0"?>
<cluster alias="ipmicluster" config_version="8" name="ipmicluster">
	<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
	<clusternodes>
		<clusternode name="10.42.21.29" nodeid="1" votes="1">
			<fence>
				<method name="1">
					<device name="fence2"/>
				</method>
			</fence>
		</clusternode>
		<clusternode name="10.42.21.27" nodeid="2" votes="1">
			<fence>
				<method name="1">
					<device name="fence1"/>
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<cman expected_votes="1" two_node="1"/>
	<fencedevices>
		<fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28" login="admin" name="fence1" passwd="admin"/>
		<fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30" login="admin" name="fence2" passwd="admin"/>
	</fencedevices>
	<rm>
		<failoverdomains>
			<failoverdomain name="myfd" nofailback="0" ordered="1" restricted="0">
				<failoverdomainnode name="10.42.21.29" priority="2"/>
				<failoverdomainnode name="10.42.21.27" priority="1"/>
			</failoverdomain>
		</failoverdomains>
		<resources/>
		<vm autostart="1" domain="myfd" exclusive="0" migrate="live" name="linux64" path="/guest_roots" recovery="restart"/>
	</rm>
</cluster>
------

Here:

10.42.21.28 is IPMI interface in node3
10.42.21.30 is IPMI interface in node4
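
The mapping between cluster nodes and their IPMI fence devices can be pulled out of cluster.conf mechanically, which is a handy way to confirm that each node is fenced through its own management interface and not its peer's. This is a minimal sketch using Python's standard xml.etree.ElementTree. The function name fence_map and the trimmed-down sample document (credentials and the rm section left out) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the configuration above; logins and passwords omitted.
SAMPLE = """<cluster alias="ipmicluster" config_version="8" name="ipmicluster">
  <clusternodes>
    <clusternode name="10.42.21.29" nodeid="1" votes="1">
      <fence><method name="1"><device name="fence2"/></method></fence>
    </clusternode>
    <clusternode name="10.42.21.27" nodeid="2" votes="1">
      <fence><method name="1"><device name="fence1"/></method></fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28" name="fence1"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30" name="fence2"/>
  </fencedevices>
</cluster>"""


def fence_map(xml_text):
    """Map each cluster node name to a list of (fence device, IP address)."""
    root = ET.fromstring(xml_text)
    # Look-up table: fence device name -> address of the device.
    addresses = {d.get("name"): d.get("ipaddr") for d in root.iter("fencedevice")}
    result = {}
    for node in root.iter("clusternode"):
        names = [dev.get("name") for dev in node.iter("device")]
        result[node.get("name")] = [(n, addresses.get(n)) for n in names]
    return result


if __name__ == "__main__":
    for node, devices in fence_map(SAMPLE).items():
        print(node, "->", devices)
```

For the sample, node 10.42.21.27 maps to fence1 at 10.42.21.28 and node 10.42.21.29 maps to fence2 at 10.42.21.30, which agrees with the addresses listed above.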

>
> When you pull out the network cable *and* plug it back in on, say, node 3,
> what messages (if any) appear in /var/log/messages on node 4?
> (Sorry for the repetition, but the messages are necessary here to make any
> sense of the situation.)
>

OK, here is the log on node 4 after I disconnected the network cable on node 3.

-----------

Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] The token was lost in the OPERATIONAL state.
Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] Receive multicast socket recv buffer size (288000 bytes).
Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] entering GATHER state from 2.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering GATHER state from 0.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Creating commit token because I am the rep.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Saving state aru 76 high seq received 76
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Storing new sequence id for ring ac
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering COMMIT state.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering RECOVERY state.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] position [0] member 10.42.21.29:
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] previous ring seq 168 rep 10.42.21.27
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] aru 76 high delivered 76 received flag 1
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Did not need to originate any messages in recovery.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Sending initial ORF token
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.29)
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.27)
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:05:28 ha2lx kernel: dlm: closing connection to node 2
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:05:28 ha2lx fenced[5004]: 10.42.21.27 not a cluster member after 0 sec post_fail_delay
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.29)
Jan  5 12:05:28 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Trying to acquire journal lock...
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:05:28 ha2lx openais[4988]: [SYNC ] This node is within the primary component and will provide service.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering OPERATIONAL state.
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] got nodejoin message 10.42.21.29
Jan  5 12:05:28 ha2lx openais[4988]: [CPG  ] got joinlist message from node 1
Jan  5 12:05:28 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Looking at journal...
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Acquiring the transaction lock...
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Replaying journal...
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Replayed 0 of 0 blocks
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Found 0 revoke tags
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Journal replayed in 1s
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Done
------------------

Now when I plug the cable back into node 3, node 4 reboots. Here is the
quickly grabbed log from node 4:

--
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering GATHER state from 11.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] Saving state aru 1d high seq received 1d
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] Storing new sequence id for ring b0
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering COMMIT state.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering RECOVERY state.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] position [0] member 10.42.21.27:
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] previous ring seq 172 rep 10.42.21.27
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] aru 16 high delivered 16 received flag 1
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] position [1] member 10.42.21.29:
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] previous ring seq 172 rep 10.42.21.29
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] aru 1d high delivered 1d received flag 1
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] Did not need to originate any messages in recovery.
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.29)
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.27)
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.29)
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] 	r(0) ip(10.42.21.27)
Jan  5 12:07:12 ha2lx openais[4988]: [SYNC ] This node is within the primary component and will provide service.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering OPERATIONAL state.
Jan  5 12:07:12 ha2lx openais[4988]: [MAIN ] Killing node 10.42.21.27 because it has rejoined the cluster with existing state
Jan  5 12:07:12 ha2lx openais[4988]: [CMAN ] cman killed by node 2 because we rejoined the cluster without a full restart
Jan  5 12:07:12 ha2lx gfs_controld[5016]: groupd_dispatch error -1 errno 11
Jan  5 12:07:12 ha2lx gfs_controld[5016]: groupd connection died
Jan  5 12:07:12 ha2lx gfs_controld[5016]: cluster is down, exiting
Jan  5 12:07:12 ha2lx dlm_controld[5010]: cluster is down, exiting
Jan  5 12:07:12 ha2lx kernel: dlm: closing connection to node 1
Jan  5 12:07:12 ha2lx fenced[5004]: cluster is down, exiting
-------

Also here is the log of node3:

--
[root@ha1lx ~]# tail -f /var/log/messages
Jan  5 12:07:24 ha1lx openais[26029]: [TOTEM] entering OPERATIONAL state.
Jan  5 12:07:24 ha1lx openais[26029]: [CLM  ] got nodejoin message 10.42.21.27
Jan  5 12:07:24 ha1lx openais[26029]: [CLM  ] got nodejoin message 10.42.21.27
Jan  5 12:07:24 ha1lx openais[26029]: [CPG  ] got joinlist message from node 2
Jan  5 12:07:27 ha1lx ccsd[26019]: Attempt to close an unopened CCS descriptor (4520670).
Jan  5 12:07:27 ha1lx ccsd[26019]: Error while processing disconnect: Invalid request descriptor
Jan  5 12:07:27 ha1lx fenced[26045]: fence "10.42.21.29" success
Jan  5 12:07:27 ha1lx kernel: GFS2: fsid=ipmicluster:guest_roots.1: jid=0: Trying to acquire journal lock...
Jan  5 12:07:27 ha1lx kernel: GFS2: fsid=ipmicluster:guest_roots.1: jid=0: Looking at journal...
Jan  5 12:07:28 ha1lx kernel: GFS2: fsid=ipmicluster:guest_roots.1: jid=0: Done
----------------
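
The logs above contain a handful of milestone lines (the token being lost, a node being killed on rejoin, cman being killed, and a fence operation succeeding) buried among routine membership messages. A small script can pull those out of /var/log/messages, including entries that a mail client has wrapped onto a second line. The sketch below is an illustration only: the marker strings are copied from the logs above, while the function names and labels are assumptions.

```python
import re

# A syslog entry starts with a timestamp such as "Jan  5 12:05:24 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} ")

# (label, pattern) pairs; the labels are arbitrary names chosen for this sketch.
MARKERS = [
    ("token lost", re.compile(r"The token was lost in the OPERATIONAL state")),
    ("node killed", re.compile(r"Killing node \S+")),
    ("cman killed", re.compile(r"cman killed by node \d+")),
    ("fence success", re.compile(r'fence "\S+" success')),
]


def unwrap(lines):
    """Join wrapped continuation lines onto the preceding log entry."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def key_events(lines):
    """Return (label, entry) pairs for entries matching a known marker."""
    events = []
    for entry in unwrap(lines):
        for label, pattern in MARKERS:
            if pattern.search(entry):
                events.append((label, entry))
                break
    return events
```

Running key_events over the lines of each node's log and sorting the results by timestamp gives a compact timeline to compare across nodes.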

> HTH
>
> With warm regards
>
> Rajagopal
>
> --
> Linux-cluster mailing list
> Linux-cluster <at> redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

Thanks a lot

Paras.

Greenseid, Joseph M. | 5 Jan 2009 21:18

problem adding new node to an existing cluster

hi all,
 
i am trying to add a new node to an existing 3 node GFS cluster.
 
i followed the steps in the online docs for this, so i went onto the 1st node in my existing cluster, ran system-config-cluster, added a new node and fence for it, then propagated that out to the existing nodes, and scp'd the cluster.conf file to the new node.
 
at that point, i confirmed that multipath and mdadm config files were synced with my other nodes, the new node can properly see the SAN that they're all sharing, etc. 
 
i then started cman, which seemed to start without any trouble.  i tried to start clvmd, but it says:
 
Activating VGs: Skipping clustered volume group san01
 
my VG is named "san01," so it can see the volume group, it just won't activate it for some reason.  any ideas what i'm doing wrong? 
 
thanks,
--Joe
Bob Peterson | 5 Jan 2009 21:25

Re: problem adding new node to an existing cluster

----- "Joseph M. Greenseid" <Joseph.Greenseid <at> ngc.com> wrote:
| hi all,
| 
| i am trying to add a new node to an existing 3 node GFS cluster.
| 
| i followed the steps in the online docs for this, so i went onto the
| 1st node in my existing cluster, run system-config-cluster, added a
| new node and fence for it, then propagated that out to the existing
| nodes, and scp'd the cluster.conf file to the new node.
| 
| at that point, i confirmed that multipath and mdadm config files were
| synced with my other nodes, the new node can properly see the SAN that
| they're all sharing, etc.
| 
| i then started cman, which seemed to start without any trouble. i
| tried to start clvmd, but it says:
| 
| Activating VGs: Skipping clustered volume group san01
| 
| my VG is named "san01," so it can see the volume group, it just won't
| activate it for some reason. any ideas what i'm doing wrong?
| 
| thanks,
| --Joe 

Hi Joe,

Make sure that you have the clvmd service running on the new node
("chkconfig clvmd on" and/or "service clvmd start" as necessary).
Also, make sure the locking_type is 2 (RHEL4/similar) or 3 (RHEL5/similar)
in the /etc/lvm/lvm.conf file.

Regards,

Bob Peterson
Red Hat GFS
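
The setting discussed above appears as locking_type in the global section of /etc/lvm/lvm.conf. As a rough illustration, the sketch below reads the value out of a config file's text so it can be checked on a new node before starting clvmd. The function name locking_type and the sample text are assumptions, and the parsing is deliberately naive: it ignores section boundaries and only strips "#" comments.

```python
import re


def locking_type(conf_text):
    """Return the integer locking_type found in lvm.conf text, or None."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        match = re.match(r"\s*locking_type\s*=\s*(\d+)", line)
        if match:
            return int(match.group(1))
    return None


# Hypothetical excerpt; a real lvm.conf has many more settings.
SAMPLE = """
global {
    # 1 = local file-based locking, 3 = built-in clustered locking
    locking_type = 1
}
"""

if __name__ == "__main__":
    value = locking_type(SAMPLE)
    if value not in (2, 3):
        print("locking_type is", value, "- clustered volume groups will be skipped")
```

On a real node, the same function could be given the contents of /etc/lvm/lvm.conf instead of the sample string.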

Greenseid, Joseph M. | 5 Jan 2009 21:28

RE: problem adding new node to an existing cluster

---- "Joseph M. Greenseid" <Joseph.Greenseid <at> ngc.com> wrote:
| hi all,
|
| i am trying to add a new node to an existing 3 node GFS cluster.
|
| i followed the steps in the online docs for this, so i went onto the
| 1st node in my existing cluster, run system-config-cluster, added a
| new node and fence for it, then propagated that out to the existing
| nodes, and scp'd the cluster.conf file to the new node.
|
| at that point, i confirmed that multipath and mdadm config files were
| synced with my other nodes, the new node can properly see the SAN that
| they're all sharing, etc.
|
| i then started cman, which seemed to start without any trouble. i
| tried to start clvmd, but it says:
|
| Activating VGs: Skipping clustered volume group san01
|
| my VG is named "san01," so it can see the volume group, it just won't
| activate it for some reason. any ideas what i'm doing wrong?
|
| thanks,
| --Joe

> Hi Joe,

> Make sure that you have clvmd service running on the new node
> ("chkconfig clvmd on" and/or "service clvmd start" as necessary).

Hi Bob, 

Yes, this problem started when I tried to start clvmd (/sbin/service clvmd start).

> Also, make sure the locking_type is 2 (RHEL4/similar) or 3 (RHEL5/similar)
> in the /etc/lvm/lvm.conf file.

Ah, OK, I believe this may be the trouble.  My locking_type was 1.  I'll change it and try again.  Thanks.

--Joe

> Regards,

> Bob Peterson
> Red Hat GFS


Greenseid, Joseph M. | 5 Jan 2009 22:10

RE: problem adding new node to an existing cluster

> Also, make sure the locking_type is 2 (RHEL4/similar) or 3 (RHEL5/similar)
> in the /etc/lvm/lvm.conf file.
 
This fixed it.  Thanks.
 
--Joe
 
