Joe Acquisto | 1 Jul 2010 17:07

Cluster services, OES1 Linux

Doing red-carpet updates on clustered machines.  All has gone well, until . . .
When moving resources, in this case GroupWise PO's, to the updated
node, they fail to load.  Volumes seem to mount, secondary IP gets
bound, but, no poa is seen, or is running.   Worked on the previous
clusters that were updated.

In an attempt to get a grip on things, I tried, on a working, updated
test box, to load the resources manually by entering the commands from
the cluster startup scripts, as seen in /var/opt/novell/ncs/.
This does not work as anticipated.  The evms, mount, and IP steps all
work from the command line, but the poa will not load.

/var/log/messages tells me 'not loading in cluster services'

If I comment out the last line (poa load) from the cluster script
(iManager) and offline and online the resource, then manually load the
poa, it works.

I am puzzled.

joe a.

Joe Acquisto | 1 Jul 2010 20:00

Re: Cluster services, OES1 Linux

This may be of no value to anyone, but, it appears this issue has to
do with some custom load scripts, which need caressing.  The one who
wrote the particular one I have isolated it to, is no longer here.
Oh, the joys of uncommented scripts.

joe a.

On Thu, Jul 1, 2010 at 11:07 AM, Joe Acquisto <joe.acquisto <at> gmail.com> wrote:
> Doing red-carpet updates on clustered machines.  All has gone well, until . . .
> When moving resources, in this case GroupWise PO's, to the updated
> node, they fail to load.  Volumes seem to mount, secondary IP gets
> bound, but, no poa is seen, or is running.   Worked on the previous
> clusters that were updated.
>
> In an attempt to get a grip on things, I tried, on a working, updated
> test box, to load the resources manually by entering the commands from
> the cluster startup scripts, as seen in /var/opt/novell/ncs/.
> This does not work as anticipated.  The evms, mount, and IP steps all
> work from the command line, but the poa will not load.
>
> /var/log/messages tells me 'not loading in cluster services'
>
> If I comment out the last line (poa load) from the cluster script
> (iManager) and offline and online the resource, then manually load the
> poa, it works.
>
> I am puzzled.
>
> joe a.

Joe Acquisto | 6 Jul 2010 16:08

Cluster services, OES1 Linux - scripts

Anyone knowing cluster services - I'd like to go over some basics -

While cluster resource online and offline would seem to be only bash
scripts (in linux), I find the results are different, if run manually,
as a script, or line by line, than if run via the cluster command.

While this may be a "well, duh" moment, illumination may be in order.

For, in fact,  I have placed such things as "echo" and "logger" in
bits of script that normally run (but are not running normally).  I
see these little tags when running manually, but not when via cluster
command.   I tend to think I should see them in either case.
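
A minimal sketch of one way to leave markers that do not depend on a terminal: append each one to a file as well as sending it to syslog. It assumes, without confirmation from this thread, that the cluster software runs the script with no visible stdout, so a plain "echo" has nowhere to show up. The function name mark, the MARK_LOG variable, and the log path are hypothetical and only for illustration.

```shell
#!/bin/bash
# Hypothetical marker helper; MARK_LOG and mark() are illustrative names.
MARK_LOG="${MARK_LOG:-/tmp/cluster-script-marks.log}"

mark() {
    # Always append to a file, so the marker survives even if stdout is discarded.
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$*" >> "$MARK_LOG"
    # Also send the marker to syslog when logger is available; ignore failures.
    if command -v logger >/dev/null 2>&1; then
        logger -t cluster-script-mark "$*" 2>/dev/null || true
    fi
}

mark "reached the poa load step"
```

If the file gets a line but syslog does not, or the other way round, that difference is itself a hint about the environment the script runs in.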

joe a.

Joe Acquisto | 7 Jul 2010 15:34

Re: Cluster services, OES1 Linux - scripts

OK, so some of the cluster stuff is python, apparently.  What of that.

One sticking point, for me, is that setting debug (set -x) on shell
(/bin/bash) scripts, works ok when a script is called from command
line, or other bash scripts, but not when called from cluster command.
Can only assume the output is being redirected.

How to find where it is going, or how to direct it where I can read it?
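
One possible approach, sketched under the assumption that the trace is simply going to a stderr that nobody reads: have the script redirect its own stdout and stderr to a file before turning on "set -x". Bash writes the trace to stderr, so it then lands in that file regardless of what the caller did. TRACE_LOG and its path are hypothetical names. The subshell is only there so the example can be tried from a terminal without redirecting the terminal itself; in a real script the exec and set lines would sit at the top.

```shell
#!/bin/bash
# Hypothetical trace file; any path the script's user can write to will do.
TRACE_LOG="${TRACE_LOG:-/tmp/cluster-script-trace.log}"

(
    # Send stdout and stderr of everything that follows to the trace file.
    exec >>"$TRACE_LOG" 2>&1
    # The xtrace output goes to stderr, which now points at the file.
    set -x
    echo "starting load steps"
)
```

This does not answer where the cluster software sends the output by default; it only sidesteps the question.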

joe a.

On Tue, Jul 6, 2010 at 10:08 AM, Joe Acquisto <joe.acquisto <at> gmail.com> wrote:
> Anyone knowing cluster services - I'd like to go over some basics -
>
> While cluster resource online and offline would seem to be only bash
> scripts (in linux), I find the results are different, if run manually,
> as a script, or line by line, than if run via the cluster command.
>
> While this may be a "well, duh" moment, illumination may be in order.
>
> For, in fact,  I have placed such things as "echo" and "logger" in
> bits of script that normally run (but are not running normally).  I
> see these little tags when running manually, but not when via cluster
> command.   I tend to think I should see them in either case.
>
> joe a.
>

James Taylor | 7 Jul 2010 15:40

Re: Cluster services, OES1 Linux - scripts

Are there more than 255 characters in any of your cluster scripts?
I don't believe the cluster software reads more than 255 characters of a
script.  If it is longer, you need to reference commands in an external
file to keep it below the maximum.
-jt
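
A quick way to test this theory is to count the bytes in a script file. The sketch below is only a check, not a statement about what the cluster software actually does: the default of 255 is the figure recalled above, not a verified limit, and script_within_limit is a made-up helper name.

```shell
#!/bin/bash
# Report whether a script file is within a byte limit.
# The default of 255 is the figure recalled in this thread, not a verified limit.
script_within_limit() {
    local file="$1" limit="${2:-255}" size
    size=$(wc -c < "$file") || return 2
    size=$((size))   # normalise: some wc versions pad the number with spaces
    echo "$file: $size bytes (limit $limit)"
    [ "$size" -le "$limit" ]
}
```

Pass it the path of a load or unload script under /var/opt/novell/ncs/; a non-zero return means the file is over the limit given. If a script is over, the workaround described above is to move the bulk of the commands into a separate file and call that file from the cluster script.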

James Taylor
The East Cobb Group, Inc.
678-697-9420
james.taylor <at> eastcobbgroup.com
http://www.eastcobbgroup.com

>>> Joe Acquisto <joe.acquisto <at> gmail.com> 7/7/2010   08:34 AM >>> 
OK, so some of the cluster stuff is python, apparently.  What of that.

One sticking point, for me, is that setting debug (set -x) on shell
(/bin/bash) scripts, works
ok when a script is called from command line, or other bash scripts,
but not when called
from cluster command.  Can only assume the output is being redirected.

How to find where it is going, or how to direct it where I can read it?

joe a.

On Tue, Jul 6, 2010 at 10:08 AM, Joe Acquisto <joe.acquisto <at> gmail.com> wrote:
> Anyone knowing cluster services - I'd like to go over some basics -
>
> While cluster resource online and offline would seem to be only bash
> scripts (in linux), I find the results are different, if run manually,
> as a script, or line by line, than if run via the cluster command.

Tim Heywood | 7 Jul 2010 15:43

Re: Cluster services, OES1 Linux - scripts

The old NW problem was 924 chars within the load/unload script. With
OES1 SP2 that limit was removed (at the second attempt). 

T 
-----Original Message-----
From: "James Taylor" <James.Taylor <at> eastcobbgroup.com>
To: Novell LAN Interest Group <novell <at> netlab1.oucs.ox.ac.uk>

Sent: 07/07/2010 14:40:57
Subject: Re: Cluster services, OES1 Linux - scripts

Are there more than 255 characters in any of your cluster scripts?
I don't believe the cluster software reads more than 255 characters of a
script.  If it is longer, you need to reference commands in an external
file to keep it below the maximum.
-jt

James Taylor
The East Cobb Group, Inc.
678-697-9420
james.taylor <at> eastcobbgroup.com
http://www.eastcobbgroup.com

>>> Joe Acquisto <joe.acquisto <at> gmail.com> 7/7/2010   08:34 AM >>> 
OK, so some of the cluster stuff is python, apparently.  What of that.

One sticking point, for me, is that setting debug (set -x) on shell
(/bin/bash) scripts, works
ok when a script is called from command line, or other bash scripts,
but not when called

Joe Acquisto | 7 Jul 2010 15:45

Re: Cluster services, OES1 Linux - scripts

Do you mean the cluster scripts, as created in ConsoleOne and seen in
/var/opt/novell/ncs/?   Or the various things that might be called by
them?

joe a.

On Wed, Jul 7, 2010 at 9:40 AM, James Taylor
<James.Taylor <at> eastcobbgroup.com> wrote:
> Are there more than 255 characters in any of your cluster scripts?
> I don't believe the cluster software reads more than 255 characters of a
> script.  If it is longer, you need to reference commands in an external
> file to keep it below the maximum.
> -jt
>
>
>
> James Taylor
> The East Cobb Group, Inc.
> 678-697-9420
> james.taylor <at> eastcobbgroup.com
> http://www.eastcobbgroup.com
>
>
>
>
>>>> Joe Acquisto <joe.acquisto <at> gmail.com> 7/7/2010   08:34 AM >>>
> OK, so some of the cluster stuff is python, apparently.  What of that.
>
> One sticking point, for me, is that setting debug (set -x) on shell
> (/bin/bash) scripts, works
> ok when a script is called from command line, or other bash scripts,

Joe Acquisto | 7 Jul 2010 15:53

Re: Cluster services, OES1 Linux - scripts

My original problem, of the resource not coming online (or even being
functional) may have been due to an outdated "check" script (home brew)
on the reluctant cluster.   I have replaced it with the current version
and will test today.  I expect it will perform as it does on all the
other clusters.

The other "problem", where a resource does not come "up" when the load
script is run manually, remains.  That, I believe, is also due to this
bit of script, which is why I attempted to exploit "set -x" in it.
(This bit, fyi, checks the resource, node and status, in an attempt to
keep things orderly)

That's when I discovered there is no debug output (to screen) when
using the cluster software to online and offline the resources.   And
went down the rabbit hole, again.

joe a.

On Wed, Jul 7, 2010 at 9:43 AM, Tim Heywood <Tim <at> nds8.co.uk> wrote:
> The old NW problem was 924 chars within the load/unload script. With
> OES1 SP2 that limit was removed (at the second attempt).
>
> T
> -----Original Message-----
> From: "James Taylor" <James.Taylor <at> eastcobbgroup.com>
> To: Novell LAN Interest Group <novell <at> netlab1.oucs.ox.ac.uk>
>
> Sent: 07/07/2010 14:40:57

James Taylor | 7 Jul 2010 15:56

Re: Cluster services, OES1 Linux - scripts

The script itself has been the limitation in my experience. What it calls has no limits.
-jt 

>>> Joe Acquisto <joe.acquisto <at> gmail.com> 7/7/2010   08:45 AM >>> 
Do you mean the cluster scripts, as created in ConsoleOne and seen in
/var/opt/novell/ncs/?   Or the various things that might be called by
them?

joe a.

On Wed, Jul 7, 2010 at 9:40 AM, James Taylor
<James.Taylor <at> eastcobbgroup.com> wrote:
> Are there more than 255 characters in any of your cluster scripts?
> I don't believe the cluster software reads more than 255 characters of a
> script.  If it is longer, you need to reference commands in an external
> file to keep it below the maximum.
> -jt
>
>
>
> James Taylor
> The East Cobb Group, Inc.
> 678-697-9420
> james.taylor <at> eastcobbgroup.com
> http://www.eastcobbgroup.com
>
>
>
>
>>>> Joe Acquisto <joe.acquisto <at> gmail.com> 7/7/2010   08:34 AM >>>
> OK, so some of the cluster stuff is python, apparently.  What of that.

Joe Acquisto | 8 Jul 2010 19:33

evms error message

When starting evmsgui, get messages for all disks "alternate CSM
header is missing or corrupt."  "Marking blah dirty  . . . ".   I have
tried saving when exiting evmsgui.  Still get the message starting
evmsgui.

This is on a two node oes1 linux cluster.  It may have started after
doing a red-carpet update.  But other updated nodes do not do this.  I
did successfully migrate a non critical resource to the other node, so
whatever is going on, does not seem to be an immediate show stopper,
but it is troubling.  Ahem,

Discovered this when looking for the reasons for this in /var/log/messages:

Jul  8 13:30:13 mynode multipathd: eva_blah: failed in domap for
addition of new path sdw
Jul  8 13:30:14 mynode kernel: device-mapper: dm-multipath: unknown
path selector type
Jul  8 13:30:14 mynode kernel: device-mapper: error adding target to table

These messages started the day of the rc updates.
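
To pin down when a message first appeared, a small helper like the one below can print the first matching line of a log file. It is only a sketch: first_seen is an illustrative name, not part of any Novell or multipath tool, and it looks at a single file, so older rotated logs would need to be checked separately.

```shell
#!/bin/bash
# Print the first line of a log file that matches a pattern.
# first_seen() is an illustrative helper name, not part of any existing tool.
first_seen() {
    local pattern="$1" file="$2"
    # -m 1 stops at the first match; -e protects patterns that start with a dash.
    grep -m 1 -e "$pattern" "$file"
}
```

For example, first_seen 'unknown path selector type' /var/log/messages prints the earliest such entry in the current log, and its timestamp can be compared with the time of the update.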

joe a.
