Vladimir Melnik | 13 Dec 17:04 2014

The file on a GFS2-filesystem seems to be corrupted

Dear colleagues,

I have encountered a very strange issue and would be grateful if you could
share your thoughts on it.

I have a qcow2 image located on a GFS2 filesystem on a cluster.
The cluster works fine and there are dozens of other qcow2 images, but,
as far as I can see, one of the images appears to be corrupted.

First of all, it has a quite unusual size:
> stat /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
  File: `/mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak'
  Size: 7493992262336241664     Blocks: 821710640  IO Block: 4096   regular file
Device: fd06h/64774d    Inode: 220986752   Links: 1
Access: (0744/-rwxr--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2014-10-09 16:25:24.864877839 +0300
Modify: 2014-12-13 14:41:29.335603509 +0200
Change: 2014-12-13 15:52:35.986888549 +0200

By the way, I noticed that the number of blocks looks fairly normal.

Also, qemu-img no longer recognizes it as a qcow2 image:
> qemu-img info /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
image: /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
file format: raw
virtual size: 6815746T (7493992262336241664 bytes)
disk size: 392G

The disk size, however, looks more reasonable: as I remember, the image
really should be about 300-400G.
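
For what it's worth, the allocated block count and the bogus apparent size
can be compared directly, and the qcow2 header magic checked, with something
like the following (just a sketch, using the same path as above):

stat -c 'apparent=%s bytes, allocated=%b blocks of %B bytes' \
    /mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak
dd if=/mnt/sp1/ac2cb28f-09ac-4ca0-bde1-471e0c7276a0.bak bs=4 count=1 2>/dev/null | hexdump -C
# an intact qcow2 header starts with the magic "QFI\xfb" (51 46 49 fb);
# anything else suggests the header itself has been overwritten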

Jürgen Ladstätter | 2 Dec 11:29 2014

Fencing and deadlocks

Hi guys,

 

we're running a 9-node cluster with 5 GFS2 mounts. The cluster is mainly used for load-balancing web-based applications. Fencing is done with IPMI and works.

Sometimes one server gets fenced but, after rebooting, is unable to rejoin the cluster. This drives up load and the number of open processes on the remaining nodes, leading to another server being fenced. That server then isn't able to rejoin either, and this continues until we lose quorum and have to restart the whole cluster manually.

Sadly, this is not reproducible, but it seems to happen more often when there is more write I/O.

 

Since a whole-cluster deadlock rather defeats the purpose of a cluster, we need some input on what we could do or change.

We're running CentOS 6.6, kernel 2.6.32-504.1.3.el6.x86_64.
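
For reference, this is roughly what we intend to collect on the failed node the next time it cannot rejoin (cman/CentOS 6 stack assumed, so only a sketch):

cman_tool status    # quorum and membership as the failed node sees it
cman_tool nodes     # per-node state
group_tool ls       # fence/dlm/gfs group state; look for entries stuck in "wait"
fence_tool ls       # whether fenced still considers a fencing operation pending
dmesg | grep -iE 'dlm|gfs2'    # kernel-side stalls around the GFS2 mounts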

Has anyone of you tested GFS2 with CentOS 7? Are there any known major bugs that could cause deadlocks?

 

Thanks in advance, Jürgen

 

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Megan . | 1 Dec 15:16 2014

new cluster acting odd

Good Day,

I'm fairly new to the cluster world, so I apologize in advance for
silly questions.  Thank you for any help.

We decided to use this cluster solution in order to share GFS2 mounts
across servers.  We have a newly set up 7-node cluster, but it is
acting oddly.  It has 3 VMware guests and 4 physical hosts (Dells with
iDRACs), all running CentOS 6.6.  I have fencing working (I'm able to
run fence_node <node> and it fences successfully).  I do not have the
GFS2 mounts in the cluster yet.

When I don't touch the servers, my cluster looks perfect with all
nodes online.  But when I start testing fencing, I run into an odd
problem where I end up with a split brain between some of the nodes.
They don't seem to automatically fence each other when it gets like this.

In the corosync.log for the node that gets split out, I see the totem
chatter, but it seems confused and just keeps repeating the following
over and over:

Dec 01 12:39:15 corosync [TOTEM ] Retransmit List: 22 24 25 26 27 28 29 2a 2b 2c

Dec 01 12:39:17 corosync [TOTEM ] Retransmit List: 22 24 25 26 27 28 29 2a 2b 2c

Dec 01 12:39:19 corosync [TOTEM ] Retransmit List: 22 24 25 26 27 28 29 2a 2b 2c

Dec 01 12:39:39 corosync [TOTEM ] Retransmit List: 1 3 4 5 6 7 8 9 a b

Dec 01 12:39:39 corosync [TOTEM ] Retransmit List: 1 3 4 5 6 7 8 9 a b
21 23 24 25 26 27 28 29 2a 2b 32
..
..
..
Dec 01 12:54:49 corosync [TOTEM ] Retransmit List: 1 3 4 5 6 7 8 9 a b
1d 1f 20 21 22 23 24 25 26 27 2e 30 31 32 37 38 39 3a 3b 3c

Dec 01 12:54:50 corosync [TOTEM ] Retransmit List: 1 3 4 5 6 7 8 9 a b
1d 1f 20 21 22 23 24 25 26 27 2e 30 31 32 37 38 39 3a 3b 3c

Dec 01 12:54:50 corosync [TOTEM ] Retransmit List: 1 3 4 5 6 7 8 9 a b
1d 1f 20 21 22 23 24 25 26 27 2e 30 31 32 37 38 39 3a 3b 3c

I can manually fence it, and it still comes online with the same
issue.  I end up having to take the whole cluster down, sometimes
forcing a reboot on some nodes, then bringing it back up.  It takes a
good part of the day just to bring the whole cluster online again.

I used ccs -h node --sync --activate and double-checked to make sure
they are all using the same version of the cluster.conf file.
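
For completeness, this is how I check the config revision on each node
(hostnames abbreviated as in the clustat output below; the reported version
should match config_version="66" in cluster.conf):

for h in archive1-uat admin1-uat mgmt1-uat map1-uat map2-uat cache1-uat data1-uat; do
    ssh "$h" 'echo -n "$(hostname): "; cman_tool version'
done
ccs_config_validate    # also sanity-checks the local cluster.conf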

One issue I did notice is that when one of the VMware hosts is
rebooted, its time comes up slightly skewed (6 seconds), but I thought I
read somewhere that a skew that minor shouldn't impact the cluster.

We have multicast enabled on the interfaces

          UP BROADCAST RUNNING MASTER MULTICAST  MTU:9000  Metric:1
and we have been told by our network team that IGMP snooping is disabled.

With tcpdump I can see the multicast traffic chatter.
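
In case it is useful, this is the kind of end-to-end multicast test I was
planning to run between a good node and the split-out node (omping is in the
standard repos; the multicast address, port and interface below are
placeholders for whatever corosync and the bond are actually configured with):

omping -m 239.192.x.x -p 5405 data1-uat map1-uat    # run on both nodes at the same time
ip maddr show dev bond0    # confirm the corosync multicast group is joined on the cluster NIC
netstat -g                 # same information, older tooling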

Right now:

[root <at> data1-uat ~]# clustat
Cluster Status for projectuat  <at>  Mon Dec  1 13:56:39 2014
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 archive1-uat.domain.com                                1 Online
 admin1-uat.domain.com                                  2 Online
 mgmt1-uat.domain.com                                   3 Online
 map1-uat.domain.com                                    4 Online
 map2-uat.domain.com                                    5 Online
 cache1-uat.domain.com                                  6 Online
 data1-uat.domain.com                                   8 Online, Local

** Has itself as online **
[root <at> map1-uat ~]# clustat
Cluster Status for projectuat  <at>  Mon Dec  1 13:57:07 2014
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 archive1-uat.domain.com                                1 Online
 admin1-uat.domain.com                                  2 Online
 mgmt1-uat.domain.com                                   3 Online
 map1-uat.domain.com                                    4 Offline, Local
 map2-uat.domain.com                                    5 Online
 cache1-uat.domain.com                                  6 Online
 data1-uat.domain.com                                   8 Online

[root <at> cache1-uat ~]# clustat
Cluster Status for projectuat  <at>  Mon Dec  1 13:57:39 2014
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 archive1-uat.domain.com                                1 Online
 admin1-uat.domain.com                                  2 Online
 mgmt1-uat.domain.com                                   3 Online
 map1-uat.domain.com                                    4 Online
 map2-uat.domain.com                                    5 Online
 cache1-uat.domain.com                                  6 Offline, Local
 data1-uat.domain.com                                   8 Online

[root <at> mgmt1-uat ~]# clustat
Cluster Status for projectuat  <at>  Mon Dec  1 13:58:04 2014
Member Status: Inquorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 archive1-uat.domain.com                                1 Offline
 admin1-uat.domain.com                                  2 Offline
 mgmt1-uat.domain.com                                   3 Online, Local
 map1-uat.domain.com                                    4 Offline
 map2-uat.domain.com                                    5 Offline
 cache1-uat.domain.com                                  6 Offline
 data1-uat.domain.com                                   8 Offline

cman-3.0.12.1-68.el6.x86_64

[root <at> data1-uat ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="66" name="projectuat">
  <clusternodes>
    <clusternode name="admin1-uat.domain.com" nodeid="2">
      <fence>
        <method name="fenceadmin1uat">
          <device name="vcappliancesoap" port="admin1-uat" ssl="on" uuid="421df3c4-a686-9222-366e-9a67b25f62b2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="mgmt1-uat.domain.com" nodeid="3">
      <fence>
        <method name="fenceadmin1uat">
          <device name="vcappliancesoap" port="mgmt1-uat" ssl="on" uuid="421d5ff5-66fa-5703-66d3-97f845cf8239"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="map1-uat.domain.com" nodeid="4">
      <fence>
        <method name="fencemap1uat">
          <device name="idracmap1uat"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="map2-uat.domain.com" nodeid="5">
      <fence>
        <method name="fencemap2uat">
          <device name="idracmap2uat"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="cache1-uat.domain.com" nodeid="6">
      <fence>
        <method name="fencecache1uat">
          <device name="idraccache1uat"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="data1-uat.domain.com" nodeid="8">
      <fence>
        <method name="fencedata1uat">
          <device name="idracdata1uat"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="archive1-uat.domain.com" nodeid="1">
      <fence>
        <method name="fenceadmin1uat">
          <device name="vcappliancesoap" port="archive1-uat" ssl="on" uuid="421d16b2-3ed0-0b9b-d530-0b151d81d24e"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_vmware_soap" ipaddr="x.x.x.130" login="fenceuat" login_timeout="10" name="vcappliancesoap" passwd_script="/etc/cluster/forfencing.sh" power_timeout="10" power_wait="30" retry_on="3" shell_timeout="10" ssl="1"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="x.x.x.47" login="fenceuat" name="idracdata1uat" passwd_script="/etc/cluster/forfencing.sh" power_timeout="60" power_wait="60" retry_on="10" secure="on" shell_timeout="10"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="x.x.x.48" login="fenceuat" name="idracdata2uat" passwd_script="/etc/cluster/forfencing.sh" power_timeout="60" power_wait="60" retry_on="10" secure="on" shell_timeout="10"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="x.x.x.82" login="fenceuat" name="idracmap1uat" passwd_script="/etc/cluster/forfencing.sh" power_timeout="60" power_wait="60" retry_on="10" secure="on" shell_timeout="10"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="x.x.x.96" login="fenceuat" name="idracmap2uat" passwd_script="/etc/cluster/forfencing.sh" power_timeout="60" power_wait="60" retry_on="10" secure="on" shell_timeout="10"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="x.x.x.83" login="fenceuat" name="idraccache1uat" passwd_script="/etc/cluster/forfencing.sh" power_timeout="60" power_wait="60" retry_on="10" secure="on" shell_timeout="10"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1-&gt;" ipaddr="x.x.x.97" login="fenceuat" name="idraccache2uat" passwd_script="/etc/cluster/forfencing.sh" power_timeout="60" power_wait="60" retry_on="10" secure="on" shell_timeout="10"/>
  </fencedevices>
</cluster>

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

Rajat | 28 Nov 06:51 2014

Cluster Overhead I/O, Network, Memory, CPU

Hey Team,

Our customer is using RHEL 5.x and RHEL 6.x clusters in their production stack.

The customer is looking for a doc/white paper they can share with their management describing the cluster services' usage of:
Disk                    %
Network                 %
Memory                  %
CPU                     %
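
I am not aware of an official white paper with such percentages; if none exists, would sampling the cluster daemons on a representative node along these lines be a reasonable substitute? (sysstat assumed installed; daemon list per the usual RHEL 5/6 cluster stack)

pidstat -u -r -d -p $(pgrep -d, -f 'corosync|aisexec|fenced|dlm_controld|gfs_controld|clvmd|rgmanager') 5
sar -n DEV 5    # network load on the cluster interconnect NIC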

Gratitude


--
--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Pradipta Singha | 12 Nov 13:10 2014

Deployment of a Red Hat Cluster 6 setup to provide HA to Oracle 11g R2

Hi Team,

I have to set up a 2-node Red Hat Cluster 6 to provide HA for an Oracle 11g R2 database with two instances. Kindly help me set up the cluster.

The shared filesystems below (shared by both nodes) are for the data files.

/dev/mapper/vg1-lv3                   gfs2   250G  2.2G  248G   1% /u01
/dev/mapper/vg1-lv4                   gfs2   175G  268M  175G   1% /u02
/dev/mapper/vg1-lv5                   gfs2    25G  259M   25G   2% /u03
/dev/mapper/vg1-lv6                   gfs2    25G  259M   25G   2% /u04
/dev/mapper/vg1-lv7                   gfs2    25G  259M   25G   2% /u05
/dev/mapper/vg1-lv8                   gfs2   300G  259M  300G   1% /u06
/dev/mapper/vg1-lv9                   gfs2   300G  1.8G  299G   1% /u07

And the local filesystems below (local to each node) are for the database binaries on both nodes:
/dev/mapper/vg2-lv1_oracle            ext4    99G  4.5G   89G   5% /oracle -> one instance, for the Oracle database

/dev/mapper/vg2-lv2_orafmw            ext4    99G   60M   94G   1% /orafmw -> another, for the application instance

Note: two instances will run, one for the Oracle database and another for the application.
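
As a starting point (not a full recipe), my understanding is that the GFS2 volumes for a two-node cluster are created with the cluster name in the lock table and at least one journal per node, with clustered LVM enabled on both nodes first; a rough sketch, where the cluster name "oracluster" is only a placeholder:

lvmconf --enable-cluster                 # switch LVM to cluster-wide locking
service clvmd start; chkconfig clvmd on  # on both nodes
mkfs.gfs2 -p lock_dlm -t oracluster:u01 -j 2 /dev/mapper/vg1-lv3
# repeat for vg1-lv4 ... vg1-lv9, giving each filesystem a unique name after the colon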


Thanks
pradipta

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
陈楼 | 12 Nov 09:20 2014

GFS2: fsid=MyCluster:gfs.1: fatal: invalid metadata block

Hi guys,

I have a two-node GFS2 cluster built on a logical volume on top of the DRBD block device /dev/drbd0. Each node's GFS2 mount point is exported as a Samba share, and two clients mount the shares and copy data into them respectively. Hours later, one client (call it clientA) had finished all of its tasks, while the other client (call it clientB) was still copying at a very slow write speed (2-3 MB/s, versus 40-100 MB/s in the normal case).
I then suspected that something was wrong with the GFS2 filesystem on the server node that clientB mounts, so I tried to write some data into it by
executing the following command:
[root <at> dcs-229 ~]# dd if=/dev/zero of=./data2 bs=128k count=1000
1000+0 records in
1000+0 records out
131072000 bytes (131 MB) copied, 183.152 s, 716 kB/s
This shows the write speed is far too slow; it almost hangs. I ran it again and it hung. Then I terminated it with Ctrl+C, and the kernel reported error messages as
follows:
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: fatal: invalid metadata block
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1:   bh = 25 (magic number)
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1:   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 393
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: jid=0: Trying to acquire journal lock...
Nov 12 11:50:11 dcs-229 kernel: Pid: 12044, comm: glock_workqueue Not tainted 2.6.32-358.el6.x86_64 #1
Nov 12 11:50:11 dcs-229 kernel: Call Trace:
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa044be22>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa044bf75>] ? gfs2_meta_check_ii+0x45/0x50 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04367d9>] ? gfs2_meta_indirect_buffer+0xf9/0x100 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa0431505>] ? gfs2_inode_refresh+0x25/0x2c0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa0430b48>] ? inode_go_lock+0x88/0xf0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa042f25b>] ? do_promote+0x1bb/0x330 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa042f548>] ? finish_xmote+0x178/0x410 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04303e3>] ? glock_work_func+0x133/0x1d0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04302b0>] ? glock_work_func+0x0/0x1d0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81090ac0>] ? worker_thread+0x170/0x2a0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81090950>] ? worker_thread+0x0/0x2a0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096916>] ? kthread+0x96/0xa0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096880>] ? kthread+0x0/0xa0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: jid=0: Failed
The other node also reported error messages:
Nov 12 11:48:50 dcs-226 kernel: Pid: 13784, comm: glock_workqueue Not tainted 2.6.32-358.el6.x86_64 #1
Nov 12 11:48:50 dcs-226 kernel: Call Trace:
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa0478e22>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa0478f75>] ? gfs2_meta_check_ii+0x45/0x50 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa04637d9>] ? gfs2_meta_indirect_buffer+0xf9/0x100 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa045e505>] ? gfs2_inode_refresh+0x25/0x2c0 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa045db48>] ? inode_go_lock+0x88/0xf0 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: fatal: invalid metadata block
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0:   bh = 66213 (magic number)
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0:   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 393
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: about to withdraw this file system
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: telling LM to unmount
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045c25b>] ? do_promote+0x1bb/0x330 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045c548>] ? finish_xmote+0x178/0x410 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045d3e3>] ? glock_work_func+0x133/0x1d0 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045d2b0>] ? glock_work_func+0x0/0x1d0 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81090ac0>] ? worker_thread+0x170/0x2a0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81090950>] ? worker_thread+0x0/0x2a0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096916>] ? kthread+0x96/0xa0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096880>] ? kthread+0x0/0xa0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
After this, the mount points have crashed. What should I do? Can anyone help me?
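
My current plan, unless someone advises otherwise, is to unmount the filesystem on both nodes and run fsck.gfs2 from a single node before mounting again; a sketch (the mount point and device path are placeholders for the logical volume on top of /dev/drbd0):

umount /mnt/gfs2              # on both nodes; the filesystem must be mounted nowhere
fsck.gfs2 -y /dev/vg_gfs/lv_gfs2
# then mount again on both nodes and watch the kernel log for further withdrawals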


--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Marek "marx" Grac | 30 Oct 16:40 2014

Building upstream fence agents on RHEL/CentOS 6

Hi,

After a small investigation on RHEL 6.6 with fence agents from upstream
(latest git):

Summary: Yes, it should work.

Details:
* it is required to fix the auto* stuff, as Alan found
     fix - it will very likely be in the next release:
         change ACLOCAL_AMFLAGS from -I m4 to -I make
         change AC_CONFIG_MACRO_DIR from m4 to make

*  either a) fence_vmware_soap requires the python-requests package (+deps),
available only in EPEL,
     or b) ignore fence_vmware_soap
         fix) remove AC_PYTHON_MODULE(requests, 1) from configure.ac

* in lib/fencing.py.py replace 'stream=sys.stderr' with 'sys.stderr'
(one occurrence)

* standard ./autogen.sh; ./configure; make
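
Putting it together, the whole sequence on a stock RHEL/CentOS 6.6 box looks
roughly like this (package list taken from Alan's mail in this thread; EPEL
is only needed for fence_vmware_soap's python-requests dependency):

yum install -y autoconf automake libtool gcc nss-devel
yum install -y epel-release && yum install -y python-requests    # or skip and drop fence_vmware_soap
./autogen.sh && ./configure && make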

m,

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

Lax Kota (lkota) | 29 Oct 22:38 2014

daemon cpg_join error retrying

Hi All,

 

In one of my setups, I keep getting 'gfs_controld[10744]: daemon cpg_join error retrying'. I have a 2-node setup with Pacemaker and Corosync.

 

Even after I force-kill the Pacemaker processes, reboot the server and bring Pacemaker back up, it keeps giving the cpg_join error. Is there any way to fix this issue?
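
For anyone willing to help, here is a rough sketch of the state I can collect the next time it happens (corosync 1.x tooling assumed):

corosync-cfgtool -s    # ring status on each node
corosync-cpgtool       # list the current CPG groups and their members
ps -ef | egrep 'dlm_controld|gfs_controld'    # look for stale or duplicate controld processes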

 

 

Thanks

Lax

 

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Marek "marx" Grac | 29 Oct 09:37 2014

How we conform to OCF in fence agents and what to do about it

Hi,

I took a look at the OCF specification for resource agents from
https://github.com/ClusterLabs/OCF-spec

I rewrote it from DTD to Relax NG (XML form) and attempted to modify it
until it accepted the current resource agents. These changes are put up
for discussion, and I will mark those that are important for fence agents
with an asterisk.

<resource-agent> is the root element

1*) new actions required: on, off, reboot, monitor, list, metadata

2) "timeout" for service should be only optional?

3) I don't understand the element "version" directly under <resource-agent>,
as it also has an attribute "version"

4) we have added the elements "vendor-url" and "longdesc" directly under
<resource-agent>. This is inconsistent with "shortdesc", which is an
attribute, but a long description really should not be an attribute.

5) we have added attribute "automatic" to <actions> (e.g. fence_scsi)

6) our parameters use only "shortdesc", so perhaps "longdesc" can be
optional

7*)  an element <getopt> for parameters, describing how they can be passed
on the command line (used for man page generation)

8) add "required" attribute for each parameter

9) add "default" value for <content> element

10) make the element <special> optional. What should be inside?

11) <resource-agent> has not only longdesc but also shortdesc
(single-line)
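
For experimenting with the schema, the metadata that the agents actually
emit can be dumped and validated locally; a sketch, with fence.rng standing
in for whatever the rewritten Relax NG file ends up being called:

fence_ipmilan -o metadata > metadata.xml
xmllint --relaxng fence.rng metadata.xml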

m,

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

Alan Evangelista | 23 Oct 15:49 2014

Problems building fence-agents from source

Hi.

I'm trying to build fence-agents from source (master branch) on CentOS 6.5.
I already installed the following rpm packages (dependencies): autoconf,
automake, gcc, libtool, nss, nss-devel. When I tried to run ./autogen.sh,
I got:

configure.ac:162: error: possibly undefined macro: AC_PYTHON_MODULE
       If this token and others are legitimate, please use m4_pattern_allow.
       See the Autoconf documentation.

I then ran

$ autoreconf --install

and autogen worked. Then I had a problem running ./configure:

./configure: line 18284: syntax error near unexpected token `suds,'
./configure: line 18284: `AC_PYTHON_MODULE(suds, 1)'

I never had this problem with earlier fence-agents versions.
Am I missing something, or is there an issue with the upstream code?

RPM dependencies versions:
  autoconf-2.63-5.1.el6.noarch
  automake-1.11.1-4.el6.noarch
  libtool-2.2.6-15.5.el6.x86_64
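
One thing I have not tried yet is pointing aclocal at the macro directory
explicitly, in case the macros live under make/ rather than m4/ in the
current tree (only a guess on my part):

aclocal -I make && autoreconf --install && ./configure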

Regards,
Alan Evangelista

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

Sunhux G | 22 Oct 10:44 2014

RHEL Boot Loader, Single-user Mode Password & Interactive Boot in a Cloud environment

We run a cloud service and our vCenter is not accessible to our tenants
or their IT support, so I would say console access is not feasible
unless the tenant's/customer's IT staff come to our DC.

If the following 3 hardening items are applied to a tenant's/customer's
RHEL Linux VM, what is the impact on the tenant's sysadmins and IT operations?


a) CIS 1.5.3 Set Boot Loader Password:
    if this password is set, will the tenant be prompted for the
    boot loader password at the console every time they reboot
    (shutdown -r) their VM?  If so, is there any way the tenant
    could still get their VM booted up if they have no access
    to vCenter's console?

b) CIS 1.5.4 Require Authentication for Single-User Mode:
    does Linux allow ssh access while in single-user mode, and
    can this 'single-user mode password' be entered via an
    ssh session (without access to the console), assuming some
    'terminal' service is started up and running while in
    single-user mode?

c) CIS 1.5.5 Disable Interactive Boot:
    what's the general consensus on this - disable or enable?
    Our corporate hardening guide does not mention this item.
    So if the tenant wishes to boot up step by step (i.e. pausing
    at each startup script), they can't do it?
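
For concreteness, my understanding is that on RHEL 6 the three items come
down to roughly the following settings (paths per RHEL 6; worth
double-checking against the CIS benchmark text):

grub-crypt --sha-512               # generate a hash, then add "password --encrypted <hash>" to /boot/grub/grub.conf;
                                   # GRUB only prompts when someone tries to edit a boot entry, not on a normal boot
grep SINGLE /etc/sysconfig/init    # CIS 1.5.4 expects SINGLE=/sbin/sulogin
grep PROMPT /etc/sysconfig/init    # CIS 1.5.5 expects PROMPT=no (interactive boot disabled)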

Feel free to add any other impacts that anyone can think of.

Lastly, how do people out there grant console access to their
tenants in a cloud environment without a security compromise
(I mean without granting vCenter access)?  I have heard that we
can customize vCenter to grant tenants limited vCenter access -
is this so?


Sun

--

-- 
Linux-cluster mailing list
Linux-cluster <at> redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
