Sahil Tikale | 27 Apr 2012 11:44

Yume script broken on admin node of cluster

Hi, I am running an ICE cluster which uses OSCAR.

I need to update various nodes, and the update is failing because the metadata for the RPMs is out of date.

I tried running yume --prepare --repos, which usually refreshes the metadata.

But now it just gives this 'dumb' output:


***********************************************************************************************

pepper:/var/lock/systemimager # yume
Unexpected error detected (255) while distro-query. Exiting.

*****************************************************************************************************************************

I did some debugging using the Perl debugger and by reading through the Perl script.

I could isolate the problem code snippet. I am new to both Perl and to clusters on Linux,
so correct me if you find anything wrong.

The code snippet is here:
*****************************************************************************************************************************
sub get_default_repos {
    # if OSCAR_HOME is defined, we're on an OSCAR cluster
    if (&on_oscar_master()) {
        my $dquery = "$ENV{OSCAR_HOME}/scripts/distro-query";
        if (-x $dquery) {
            if ($installroot && (-x "$installroot/bin/bash")) {
                $dquery = $dquery . " --image $installroot";
            }
            print STDERR "Executing: $dquery\n" if ($verbose);
            local *CMD;
            open CMD, "$dquery |" or die "Could not run $dquery: $!";
            while (<CMD>) {
                chomp;
                if (/Distro package url : (\S+)$/) {
                    push @repos, split(",",$1);
                } elsif (/OSCAR package pool : (\S+)$/) {
                    push @repos, split(",",$1);
                }
            }

I think that highlighted IF condition is failing

*****************************************************************************************************************************
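For readers new to Perl, the read loop in the excerpt can be tried on its own, without a cluster. The sketch below reuses the two patterns from the excerpt; the helper name `parse_repos` and the sample lines are made up for illustration (only /tftpboot/distro/sles11sp1 appears elsewhere in this thread), so the real distro-query output may well look different.

```perl
use strict;
use warnings;

# Hypothetical helper: collect repository locations from lines shaped like
# the ones the yume excerpt matches. The patterns are copied from the excerpt.
sub parse_repos {
    my @lines = @_;
    my @repos;
    for my $line (@lines) {
        chomp $line;
        if ($line =~ /Distro package url : (\S+)$/) {
            push @repos, split(/,/, $1);    # one entry per comma-separated path
        } elsif ($line =~ /OSCAR package pool : (\S+)$/) {
            push @repos, split(/,/, $1);
        }
    }
    return @repos;
}

# Invented sample input, for illustration only
my @sample = (
    "Distro package url : /tftpboot/distro/sles11sp1,/tftpboot/example/updates\n",
    "OSCAR package pool : /tftpboot/example/oscar\n",
    "some unrelated line\n",
);
print "$_\n" for parse_repos(@sample);
```

If the command prints nothing that matches either pattern, the repository list simply stays empty, so a failure of the command itself has to be detected separately.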

I could not establish the values of $installroot and $dquery. When I tried to print them, Perl complained that they were not initialized yet.
I checked the syntax of the script and it looks correct:
I ran perl -c yume and it came out fine.
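An "uninitialized" complaint usually just means the variable holds undef at the point where it is printed, for instance because the assignment has not run yet or because the corresponding option was never given. That is an assumption about what happened here, not something the thread confirms. A small sketch of printing such values safely follows; the helper `show` and the example path are hypothetical and not part of yume.

```perl
use strict;
use warnings;

# Hypothetical helper: format a scalar for debug output without triggering
# "use of uninitialized value" warnings when it is undef.
sub show {
    my ($name, $value) = @_;
    return sprintf('%s = %s', $name, defined($value) ? "'$value'" : '<undef>');
}

my $installroot;                                   # never assigned, so undef
my $dquery = '/opt/oscar/scripts/distro-query';    # example path, an assumption

print STDERR show('$installroot', $installroot), "\n";
print STDERR show('$dquery', $dquery), "\n";
```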

So I am stuck and need to get this script working. I believe the problem is with the call it is trying to make.

In short, I want to get
yume --prepare --repos working.

With Regards,
Sahil Tikale



--SAHIL TIKALE
Systems Consultant
“You are as old as your doubt, your fear, your despair.
The way to keep young is to keep your faith young; keep your hope young.”
-- Dr. L.F. Phelan




------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Oscar-devel mailing list
Oscar-devel@...
https://lists.sourceforge.net/lists/listinfo/oscar-devel
laxman Singh Rathore | 27 Apr 2012 17:35

Re: Yume script broken on admin node of cluster

Hello,

First of all, let me know which distribution you are using for the head node. Also, please attach the log file from the log location so that we can work out a resolution for the error.

 /var/log/oscar/ log <log file name>


Thanks

Laxman Singh

Sahil Tikale | 27 Apr 2012 19:50

Re: Yume script broken on admin node of cluster

Hi,

For the head node (and all other nodes: service and compute nodes) I am using SLES 11 SP1.
There is no log file available for OSCAR, and nothing shows up in /var/log/messages either.

All I get is the output I mentioned in the last email:

******************************************************************************
#yume
Unexpected error detected (255) while distro-query. Exiting.
******************************************************************************

If I add the --verbose argument to the command, all I get is

******************************************************************************
Executing: ssh oscar_server bash -l -c \"\$OSCAR_HOME/scripts/distro-query --node pepper\"
Unexpected error detected (255) while distro-query. Exiting.

******************************************************************************
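Two hedged observations on this output. First, the "Executing:" line shows an ssh call, not the local distro-query path from the excerpt in the first mail, which suggests a different branch of the script ran. Second, the ssh manual documents that ssh exits with the status of the remote command, or with 255 if an error occurred, so a 255 here may point at the ssh step itself (for example, reaching or logging in to oscar_server) and not at distro-query; a remote command could also return 255 on its own, so this is only a lead. The sketch below shows how an exit code is read back in Perl after a piped command; `run_and_status` is a hypothetical helper, and a child Perl process stands in for ssh so that the example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list (so no shell is
# involved), collect its output, and return the exit code plus the lines.
sub run_and_status {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "Could not run $cmd[0]: $!";
    my @out = <$fh>;
    close($fh);                # for a pipe, close() waits for the child and sets $?
    return ($? >> 8, @out);    # the high byte of $? holds the child's exit code
}

# Stand-in for the failing ssh call: a child process that exits with 255.
my ($code, @out) = run_and_status($^X, '-e', 'print "hello\n"; exit 255');
print "exit code: $code\n";
```

On the admin node, calling the same helper as `run_and_status('ssh', 'oscar_server', 'true')` would be one way to test the ssh step in isolation, assuming the host name from the verbose output is the one the script really uses.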

When I run yume without any arguments under the Perl debugger:
******************************************************************************

 DB<2> n
main::(yume-forstudy:84):        &get_default_repos();
  DB<2> n
Unexpected error detected (255) while distro-query. Exiting.
Debugged program terminated.  Use q to quit or R to restart,
  use o inhibit_exit to avoid stopping after program termination,
  h q, h R or h o to get additional info. 
  DB<2>

******************************************************************************

I am really stuck with the upgrade and need to fix this issue ASAP.


Sahil Tikale | 29 Apr 2012 20:05

Re: Yume script broken on admin node of cluster

Hi all,

So I have figured out 99% of the problem.

The script is failing because of the following line of code:

my $cmd = "ssh oscar_server bash -l -c \\\"\\\$OSCAR_HOME/scripts/distro-query --node $node\\\"";

The equivalent command line is

ssh oscar_server bash -l -c $OSCAR_HOME/scripts/distro-query --node <admin node hostname>

where <admin node hostname> stands for the hostname of the admin node
and $OSCAR_HOME for the directory where the OSCAR server is installed.

When I run this from the command line, it produces the correct output.
I hard-coded that command into the script in place of the line above, and it worked.

So I guess the only issue left to resolve is: what is wrong with that particular line of code?

I am new to Perl, so I might not be able to resolve it quickly.
I need your help, guys.
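As a starting point for that question, it helps to see the exact string the Perl literal produces. The sketch below wraps the line from the script in a hypothetical helper, `build_cmd`, and prints the result for the node name pepper; the comments on how the shells treat it afterwards describe general shell and ssh behaviour and are not a confirmed diagnosis for this cluster.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the line quoted from yume, so that the
# resulting string can be inspected without running ssh.
sub build_cmd {
    my ($node) = @_;
    return "ssh oscar_server bash -l -c \\\"\\\$OSCAR_HOME/scripts/distro-query --node $node\\\"";
}

my $cmd = build_cmd('pepper');
print "$cmd\n";
# In the Perl literal, \\ becomes one backslash, \" a quote and \$ a dollar
# sign, so the string still contains \" and \$ when it reaches the local shell.
# The local shell removes those backslashes and passes a literal quote and a
# literal $OSCAR_HOME on to ssh, which sends them to the remote side.
# $OSCAR_HOME is therefore meant to be expanded on oscar_server, not locally.
```

The printed string is the same as the "Executing:" line in the --verbose output, so the literal itself behaves as written. The hand-typed command differs in two ways: $OSCAR_HOME is expanded by the local shell before ssh starts, and without the quotes `bash -c` receives only the script path as its command, with `--node` and the hostname treated as extra arguments to bash. The two are therefore not strictly equivalent, which may be why one works and the other does not.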


On Sun, Apr 29, 2012 at 10:56 PM, Sahil Tikale <sahil.tikale-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Hi Laxman,

I am glad you replied. So here is more info on the background

The cluster in question is the ICE cluster from SGI.

I did a clean install on all the nodes last week. All the nodes run SLES.
I configured the Subscription Management Tool (SMT) to download the updates from Novell.

There is a script bundled by SGI called sync-repo-updates. As you mentioned, it downloads the required updates to a local location on the admin node.
That script downloads the updates from both SGI and Novell.
To download from Novell, it uses SMT in the background.
After the download is complete, sync-repo-updates can also update the metadata.
To update the metadata, it invokes yume.

I do not know exactly when yume stopped working as intended, so I am not sure whether the script was able to call yume, or whether all the metadata was brought up to date.

Up until the installation script, yume was working fine.
I realized it was broken when I attempted to upgrade the admin node using the command

cinstallman --update --node admin

I somehow got through the upgrade process. Towards the end, it was throwing a lot of 'rpmdb: cannot write to the database' errors.
I checked the integrity of the RPM database and all looks fine; it is not corrupted.

Problems occurred while trying to update the lead node:

***********************************************************************************************************************************************************

--> Running transaction check
--> Processing Dependency: dhcp-server <= 3.1.3.ESV-0.3.38 for package: sgi-lead-node
---> Package kernel-xen-base.x86_64 0:2.6.32.54-0.3.1 set to be updated
--> Finished Dependency Resolution
sgi-lead-node-2.5-sgi705rp11.sles11.noarch from installed has depsolving problems
  --> Missing Dependency: dhcp-server <= 3.1.3.ESV-0.3.38 is needed by package sgi-lead-node-2.5-sgi705rp11.sles11.noarch (installed)
Error: Missing Dependency: dhcp-server <= 3.1.3.ESV-0.3.38 is needed by package sgi-lead-node-2.5-sgi705rp11.sles11.noarch (installed)
yume exited with an error for node r1lead.
pepper:~ # cd /tftpboot/distro/sles11sp1/
pepper:/tftpboot/distro/sles11sp1 # ls |grep dhcp-server
dhcp-server-3.1.3.ESV-0.11.1.x86_64.rpm
dhcp-server-3.1.3.ESV-0.13.1.x86_64.rpm
dhcp-server-3.1.3.ESV-0.15.1.x86_64.rpm
dhcp-server-3.1.3.ESV-0.3.38.x86_64.rpm
dhcp-server-3.1.3.ESV-0.9.1.x86_64.rpm
yast2-dhcp-server-2.17.3-1.48.noarch.rpm


***********************************************************************************************************************************************************

When I try to update the r1lead node, it exits with a Missing Dependency error (shown above).
The strange thing is that
I looked into /tftpboot/distro/sles11sp1/ and it does have the required RPM.
All the versions are listed.

Still, it is not able to find it. Now that is weird, isn't it?
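One possible reading of the error, offered as an assumption and not as a confirmed diagnosis: the installed sgi-lead-node package requires dhcp-server <= 3.1.3.ESV-0.3.38, so the newer dhcp-server builds in the directory would violate that requirement if the update tried to install one of them. The sketch below checks which of the listed releases stay within the limit. `cmp_release` is a hypothetical helper that compares dotted release strings field by field, which is a simplification of RPM's real version comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two dotted release strings numerically,
# field by field. Returns -1, 0 or 1 like the <=> operator.
sub cmp_release {
    my @x = split /\./, $_[0];
    my @y = split /\./, $_[1];
    while (@x && @y) {
        my $c = shift(@x) <=> shift(@y);
        return $c if $c;
    }
    return @x <=> @y;    # all shared fields equal: the longer string is newer
}

# Release parts of the dhcp-server-3.1.3.ESV-* files from the listing above
my @available = qw(0.11.1 0.13.1 0.15.1 0.3.38 0.9.1);
my $limit     = '0.3.38';    # from "dhcp-server <= 3.1.3.ESV-0.3.38"

my @allowed = grep { cmp_release($_, $limit) <= 0 } @available;
print "within limit: @allowed\n";
```

Under this simplified comparison only 0.3.38 passes, so the RPM being present in the directory does not by itself mean the dependency can be satisfied after an update.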

Another observation: when I copy the same yume script to the login node and run it there, it behaves exactly as expected.
That led me to deduce that something this script is trying to call is broken.
I called the 'distro-query' script manually, and it works fine.
Something is going wrong while trying to detect the repositories registered under the admin node (by Crepo).

I hope all the info above will help.





On Sun, Apr 29, 2012 at 9:29 PM, laxman Singh Rathore <laxmansinghm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Hello,

In situations where the software distributions (distros) present do not match the distro installed on the admin node, you have to arrange to download the updates on your own.

You first need to set up a Novell SMT server somewhere on your network for the admin node, then download the updates to that server. Once the RPMs are staged on that server, you can copy them to the admin node using rsync or a similar transport method.

Remember to update the repository metadata after you update the packages. For example:

# yume --prepare --repo /tftpboot/distro/sles11sp1

Thanks and Regards
Laxman Singh
Linux Administrator

