Neil Casey | 1 Nov 2006 02:08

Re: Cluster question;


Hi Doug,

Suppose there are 4 queue managers in a cluster (QM1, QM2, QM3 and QM4), and all 4 host a cluster queue (say CLUS.Q1).

If an application running on one of those 4 queue managers (say QM1) issues MQPUTs or MQPUT1s to queue CLUS.Q1, then all of the messages will show up on the QM1 instance of CLUS.Q1. There is no load balancing across the cluster.

At version 6, there is a new queue attribute (CLWLUSEQ) which allows you to specify that the local instance should be treated just like any other instance in the cluster, but at version 5, you would have to use a CLWLEXIT to get that behaviour. With CLWLUSEQ(ANY) in version 6 the messages put by the app on QM1 to CLUS.Q1 could show up on any of the 4 instances, and with DEFBIND(NOTFIXED) they would be load balanced across them.
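The difference between the two behaviours can be sketched with a toy model. This is only an illustration, not MQ's workload algorithm: plain round-robin stands in for the real balancing, the `use_local_only` flag is a made-up name standing in for the v5 behaviour versus CLWLUSEQ(ANY), QM5 is a hypothetical queue manager with no instance of the queue, and per-message routing (as with DEFBIND(NOTFIXED)) is assumed.

```python
from collections import Counter
from itertools import cycle


def route_messages(putting_qmgr, hosts, count, use_local_only=True):
    """Return a Counter of how many messages land on each queue instance.

    Toy model only: round-robin stands in for the real workload
    algorithm, and use_local_only stands in for the v5 behaviour (True)
    versus CLWLUSEQ(ANY) at v6 (False).
    """
    if use_local_only and putting_qmgr in hosts:
        # A local instance exists, so every message stays on it.
        return Counter({putting_qmgr: count})
    # No local instance, or the local one is treated like any other:
    # spread the messages across all hosting queue managers.
    targets = cycle(hosts)
    return Counter(next(targets) for _ in range(count))


hosts = ["QM1", "QM2", "QM3", "QM4"]
print(route_messages("QM1", hosts, 8))                        # all 8 on QM1
print(route_messages("QM1", hosts, 8, use_local_only=False))  # 2 on each instance
print(route_messages("QM5", hosts, 8))                        # QM5 hosts nothing: 2 on each
```

In this model the flag only matters when the putting queue manager hosts an instance itself, which matches the point that a put from a queue manager without a copy of the queue behaves the same either way.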

If your PUTs are originating on a server which doesn't have a copy of the queue then it doesn't make any difference.

Regards,
Neil Casey
Lead Technical Specialist, Middleware MQ Support
Technology Operations
National Australia Bank

Level 1, 122 Lewis Rd  (Tue Thu)
Tel: +61 (0) 3 9886 2375  |  Fax: +61 (0) 3 9886 2700  |  Mob: +61 (0) 408 356 208
Pier 4, Level 8, 800 Bourke St  (Mon Wed Fri)
Tel: +61 (0) 3 8634 2205  |  Fax: +61 (0) 3 8634 3788  |  Mob: +61 (0) 408 356 208
Email: Neil.Casey-yjaDgy+nFNoQrrorzV6ljw@public.gmane.org




"Clark, Douglas" <dclark-43w+wAGA5HO7pU93UQLDAHzNABE0Ld/Y@public.gmane.org>
Sent by: MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>

01/11/2006 08:15

Please respond to
MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>

To
MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
cc
Subject
Re: Cluster question;





Neil,
 
Thank you for your reply.  Can you please clarify one point you made about "messages will not leave a queue manager which hosts a target queue unless you are running version 6." I am sorry but I do not understand what you mean.  As an aside, all of these machines will be running MQSeries 5.3 CSD 10.  
 
Doug

From: MQSeries List [mailto:MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org] On Behalf Of Neil Casey
Sent: Tuesday, October 31, 2006 1:05 PM
To: MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
Subject: Re: Cluster question;


Hi Douglas,

I would recommend that the new machines are defined as partial repositories.

The manuals show that more than 2 full repositories are supported, and it always sounds like a good idea. However, in my experience the actual implementation of repository update strategies by the MQ software means that having more than 2 full repositories doesn't really help, and can make MQ behave in unexpected ways.

The core of the issue is that when a queue manager sends a repository update or inquiry, it sends it to exactly 2 full repositories. It will always send it to the FR for which there is a static sender channel defined, and it will send to 1 other, determined in some way by the software.

The problem comes when FRs are down (either planned or unplanned). One might expect (I certainly did) that as long as at least 1 FR is up, your cluster should be healthy. Unfortunately, the repository update strategy means that if you have, say, 3 FRs, and the 2 which your PR sends updates to are down, then no updates or requests from that PR can be handled. This leads to things like issuing a suspend command, which gets processed, but the FRs and the rest of the cluster don't get to see it unless one of the 2 update FRs is up. Messages keep arriving at the 'suspended' queue manager.


So, the bottom line is that no matter how many FRs are in your cluster, the cluster is healthy only while at most 1 FR is unavailable. Statistically, the more FRs there are, the more likely it is that more than one will be unavailable at the same time, and therefore the more likely it is that your cluster could have a problem.
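A rough way to see the statistical point is a binomial calculation. The sketch below assumes each FR is unavailable independently with the same probability; both that independence and the 1% figure are assumptions chosen purely for illustration, not measurements from any real cluster.

```python
from math import comb


def prob_more_than_one_down(n_frs, p_down):
    """P(at least 2 of n_frs are down), assuming independent outages."""
    p_none = (1 - p_down) ** n_frs
    p_exactly_one = comb(n_frs, 1) * p_down * (1 - p_down) ** (n_frs - 1)
    return 1 - p_none - p_exactly_one


# Hypothetical 1% unavailability per FR; compare different FR counts.
for n in (2, 3, 4, 6):
    print(n, prob_more_than_one_down(n, 0.01))
```

With 2 FRs, "more than one down" means both are down, which under these assumptions is simply the per-FR probability squared; each additional FR adds more pairs that can be down together, so the figure rises with the FR count.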

So my belief is that 2 FRs is a special number which is always the right number to use.

FRs and PRs which host application queues participate equally in load balancing, although messages will not leave a queue manager which hosts a target queue unless you are running version 6, where there is an option available to load balance away from the current machine. At v5, if one of the target queues in the cluster is on the current queue manager, the messages will always go there.

My last piece of advice on clusters is that if you have a requirement for timely performance in a request/response model, with messages flowing via a cluster, don't put the FRs on queue managers which have application queues. Use dedicated queue managers (even if you run them on the same machines). I am chasing some performance issues at my site where messages get delayed when the FR amqrrmfa process gets busy managing cluster stuff, and it stops responding to application requests in a timely manner.

Regards,

Neil Casey




"Clark, Douglas" <dclark-43w+wAGA5HO7pU93UQLDAHzNABE0Ld/Y@public.gmane.org>
Sent by: MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>

01/11/2006 07:08

Please respond to
MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>


To
MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
cc
Subject
Re: Cluster question;







Due to a significant increase in the number of messages we will process
for Christmas, I have to prepare for the addition of machines to my
current environment.

Currently, I have four client machines and four queue managers (WC1,
WC2, WC3, WC4) sending data to a clustered queue (PROD_Q1) defined on
two server machines with queue managers WS1 and WS2, each hosting a full
repository.

I am planning to add four more client machines and queue managers (WC5,
WC6, WC7, WC8) as well as two more server machines and queue managers
(WS3 and WS4).

The queue managers WS3 and WS4 will also have defined the same clustered
queue (PROD_Q1) and must be able to receive data from all eight client
machines (the original 4 and the 4 new servers recently added).

Should I make queue managers WS3 and WS4 full repositories or just
partial repositories?  Will the PROD_Q1 on the two new server machines
participate equally in the load balancing done for WS1 and WS2?

To unsubscribe, write to LISTSERV-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org and,
in the message body (not the subject), write: SIGNOFF MQSERIES
Instructions for managing your mailing list subscription are provided in
the Listserv General Users Guide available at http://www.lsoft.com
Archive: http://listserv.meduniwien.ac.at/archives/mqser-l.html
National Australia Bank Ltd - ABN 12 004 044 937
This email may contain confidential information. If you are not the intended recipient, please immediately notify us at postmaster-yjaDgy+nFNoQrrorzV6ljw@public.gmane.org or by replying to the sender, and then destroy all copies of this email. Except where this email indicates otherwise, views expressed in this email are those of the sender and not of National Australia Bank Ltd. Advice in this email does not take account of your objectives, financial situation, or needs. It is important for you to consider these matters and, if the e-mail refers to a product(s), you should read the relevant Product Disclosure Statement(s)/other disclosure document(s) before making any decisions. If you do not want email marketing from us in future, forward this email with "unsubscribe" in the subject line to Unsubscriptions-yjaDgy+nFNoQrrorzV6ljw@public.gmane.org in order to stop marketing emails from this sender. National Australia Bank Ltd does not represent that this email is free of errors, viruses or interference.




Clark, Douglas | 1 Nov 2006 02:10

Re: Cluster question;

Ok - my eyes have been opened.  Thanks for your time and explanation.


Vaughan Phillips | 1 Nov 2006 09:24

Re: Help!

Todd, Jay, Emile, Roger, Scott and all,
Thank you for your help with this problem.
 
We are now running WMQ v6.0 with SLES v9.0 and everything is fine.
 
 
Cheers
Vaughan



-------------------------------------------------------
QAS Ltd.
Registered in England: No 2582055
Registered in Australia: No 082 851 474
-------------------------------------------------------

Disclaimer: The information contained within this e-mail is confidential and may be privileged. This email is intended solely for the named recipient only; if you are not authorised you must not disclose, copy, distribute, or retain this message or any part of it. If you have received this message in error please contact the sender at once so that we may take the appropriate action and avoid troubling you further. Any views expressed in this message are those of the individual sender. QAS Limited has the right lawfully to record, monitor and inspect messages between its employees and any third party. Your messages shall be subject to such lawful supervision as QAS Limited deems to be necessary in order to protect its information, its interests and its reputation.

Whilst all efforts are made to safeguard Inbound and Outbound emails, QAS Limited cannot guarantee that attachments are virus free or compatible with your systems and does not accept any liability in respect of viruses or computer problems experienced.



Russell Finn | 1 Nov 2006 10:20

Re: MQ v6 z/os remote administration of MQ v5.3 distributed


If the message is non-persistent, the z/OS queue manager has no dead letter queue defined, the channel is defined with NPMSPEED(FAST), and the target/reply queue on z/OS cannot be put to (full, put-disabled, authority failure, etc.), then I think the message can't go anywhere and so is discarded.  A report message would be generated if requested in the MQMD.Report field, though.

Might you be in this situation?  If so, configure a DEADQ on the z/OS QMgr and/or use NPMSPEED(NORMAL).
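The combination of conditions can be written out as a small check. This is only a restatement of the conditions listed above in code form, not how the queue manager or channel is implemented; the function and parameter names are made up for the illustration.

```python
def reply_is_discarded(persistent, has_dlq, npmspeed, target_can_accept):
    """Return True when, per the conditions above, the message is dropped.

    Hypothetical helper: it only encodes the four conditions from the
    description, nothing more.
    """
    if target_can_accept:
        return False  # the put to the target/reply queue succeeds
    if has_dlq:
        return False  # the message goes to the dead letter queue instead
    # Only a non-persistent message on an NPMSPEED(FAST) channel is dropped.
    return (not persistent) and npmspeed == "FAST"


# Non-persistent reply, no DEADQ, fast channel, reply queue not usable.
print(reply_is_discarded(False, False, "FAST", False))    # True
# Either suggested change alone takes the message out of the discard case.
print(reply_is_discarded(False, True, "FAST", False))     # False
print(reply_is_discarded(False, False, "NORMAL", False))  # False
```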

Russell

Russell Finn                    
MQSeries System Test    
russell_finn-ygUJEDcBm8rQT0dZR+AlfA@public.gmane.org




"Heggie, Peter" <Peter.Heggie-IPhzozbAnH1+cjeuK/JdrQ@public.gmane.org>
Sent by: MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>

31/10/2006 18:41

Please respond to
MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>

To
MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
cc
Subject
MQ v6 z/os remote administration of MQ v5.3 distributed





 
I'm trying to use the z/os CSQUTIL batch utility to issue commands to a
remote queue manager. The remote qmgr is a Windows v5.3 qmgr. I ran into
some security problems and resolved them, to the point where I can issue
a Display Qlocal command to a remote qmgr without the request ending up
in the dead letter queue of the remote qmgr. But I never get the reply.

I can see that the channel status of the reply route channel increases
by one, indicating (I think) that a message is sent back to the (z/os)
requestor, but I don't see the message anywhere. There is nothing in the
z/os DLQ. There are no messages in the z/os started tasks and no
messages in the system (SDSF) log. The CSQUTIL batch job reports a
timeout after waiting the specified RESPTIME.

As I know that MQ does not lose messages, I am puzzled as to why the
channel status of the 'reply' channel shows that a message was sent, yet
there is no sign of that message.

A further clue is that, before I fixed the last security problem, a
message showed up on the remote qmgr DLQ with a 'NOT AUTHORIZED' error
on the (temporary) reply queue on z/os. This queue name is
SYSTEM.CSQUTIL.BFA2F16F88E9DF82. This queue did exist, for 30 seconds
(RESPTIME), on the z/os qmgr, while the CSQUTIL utility waited for a
reply. Perhaps I incorrectly read into this error that there was an
actual reply to the remote admin request, and it was on its way back to
z/os, but was stopped by security. I gave the userid of the request the
ability to Put to the transmit queue pointing back to the requestor
(z/os) and after that got no more messages on the remote DLQ.

I also noticed that this 'reply' message actually had no content in it,
aside from the original request. There was no SYSTEM.COMMAND.INPUT queue
on the Windows qmgr, so I created an alias by that name, pointing to the
SYSTEM.ADMIN.COMMAND.QUEUE. I can see that the Windows command server
process has this queue open exclusively. And the previously mentioned
'reply' message was sent by this command server.

My question to all is - what happened to the reply message?

Perhaps a better question is - is there a better way to display the
properties of a remote queue located on a distributed v5.3 qmgr, from a
z/os v6 qmgr, without writing a program for the distributed qmgr?

**** For your information: the Rhode Island Operations of New England Gas Company have been acquired by National Grid and are now doing business under that name.  ****


This e-mail and any files transmitted with it, are confidential to National Grid and are intended solely for the use of the individual or entity to whom they are addressed.  If you have received this e-mail in error, please reply to this message and let the sender know.


Francois Van der Merwe1 | 1 Nov 2006 11:28

Re: Cluster size

Bridgette
Thank you so much - will monitor for that as well.

Francois van der Merwe
Senior IT specialist, WMQ certified
IBM, Cape Town, South Africa
+27(0)21 402 5597 or +27(0)82 556 9467
Be an oceans Defender - oceans.greenpeace.org

                                                                           
"Beardsley, Bridgette" <Bridgette.Beardsley@...>
Sent by: MQSeries List <MQSERIES <at> LISTSERV.MEDUNIWIEN.AC.AT>

31/10/2006 20:01

Please respond to
MQSeries List <MQSERIES <at> LISTSERV.MEDUNIWIEN.AC.AT>

To
MQSERIES@...
cc
Subject
Re: Cluster size

Francois,

I've had 1200+ queue managers in a cluster, and it had its pluses and minuses.
Overall it was good architecture that served its purpose.  We used two
separate servers for the full repositories, configured for failover.
Although we never had a production issue which forced a failover, we did
find that the full repositories were not in synch, in other words,
clustered queue managers and objects were in one full repos and not
another.  Cluster refreshes did not correct the issue and in an
environment that large, the refreshes were lengthy and heavily loaded
the cluster.  The issue turned out to be that the cluster repository
process, amqrrmfa, had died on partial repository queue managers, which
prevented a response to the refresh cluster command.  So, my advice would
be to stay close to the latest patch (the issue was at MQ 5.3), and
proactively monitor your environment.

2nd question.  Just try to plan for average and peak volumes, and
connection frequencies.  You'll probably find that you have to tune
these as you add in queue managers.

-Bridgette Beardsley

-----Original Message-----
From: MQSeries List
[mailto:MQSERIES@...] On
Behalf Of Francois Van der Merwe1
Sent: Tuesday, October 24, 2006 2:34 AM
To: MQSERIES@...
Subject: Cluster size

I plan to add 650 MQ servers to the same MQ cluster ..... is this good or
not?

Anybody with some experience in big clusters?

Also, how do I calculate resources needed on the two PR queue managers?

Thanks

Francois van der Merwe
Senior IT specialist, WMQ certified
IBM, Cape Town, South Africa
+27(0)21 402 5597 or +27(0)82 556 9467
Be an oceans Defender - oceans.greenpeace.org

Heinz Klein | 1 Nov 2006 01:54

Re: CF Recovery in a QSG Environment

Paul.

Thank you very much for the explanation. However, I will grab the opportunity and, if you allow me, squeeze a little further:
1) Can you please elaborate on what you mean by "persistent messages over a certain age"?
2) Assuming the CF structure was completely wiped out (failure or something equivalent to a power-on-reset), do the people responsible for the CF need to do something (define lists and/or structures) before MQ recovers the queues, or can the whole recovery process be left to MQ?
3) My understanding was that the recovery process for a failed CF would need to be initiated by a RECOVER CFSTRUCT command. Somehow I got the impression from your text that even without this command being issued, MQ would recover the shared queues upon restart. Is this correct?

Thank you again.

Heinz

Paul S Dennis wrote:

Hubert, that is not quite right! The contents of the shared queues are rebuilt when you do a RECOVER CFSTRUCT.

When the BACKUP CFSTRUCT command is issued, the contents of the CF (persistent messages over a certain age) are copied from the CF structure being backed up into the log of the qmgr doing the backup. At a later stage, if recovery is needed, a qmgr will read this backup, and then the logs of all the qmgrs in the QSG from that point on, to rebuild the message content of the queues that were on the failed structure. You don't need all of the qmgrs to be started to do recovery; you just need their logs to be available and readable by the one queue manager that is doing the recovery (which is why you need SHAREOPTIONS(2 3) on the active logs; they wouldn't be readable by another qmgr otherwise). If you decide that recovering the CF structure would take too long (you forgot to back up the structure on a regular basis!), you can use the RECOVER CFSTRUCT TYPE(PURGE) command to make the queues/structure available, but they won't contain any messages. There is no standard way of recovering a CF structure to a local queue.

The original logic behind having a manual recovery of the structures is the assumption here that if you had a failure of one structure, you may well have the failure of multiple structures close together. The RECOVER CFSTRUCT command allows you to specify multiple structures to recover at the same time, it can then recover all of the structures with a single read of the logs, rather than having to read for each structure.
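
To make the recovery flow above concrete, here is a minimal Python sketch of the idea only: start from each structure's backup, then replay the logs of every qmgr in the QSG once, in time order, applying only records newer than that structure's backup. It is a toy model, not how MQ implements recovery. The LogRecord type, the integer timestamps, and the "put"/"get" operation names are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    time: int        # hypothetical logical timestamp
    structure: str   # name of the CF structure the record belongs to
    op: str          # "put" or "get" (illustrative operation names)
    msg_id: str


def recover(structures, backups, qmgr_logs):
    """Rebuild message content for several structures in one pass over the logs.

    backups maps structure name -> (backup_time, set of message ids in the backup).
    qmgr_logs maps qmgr name -> list of LogRecord. Only the logs are needed,
    so the queue managers that wrote them need not be running.
    """
    # Start from the contents captured by each structure's backup.
    contents = {s: set(backups[s][1]) for s in structures}

    # Merge the logs of all queue managers and replay them once, in time order.
    merged = sorted(
        (rec for log in qmgr_logs.values() for rec in log),
        key=lambda rec: rec.time,
    )
    for rec in merged:
        if rec.structure not in contents:
            continue  # record belongs to a structure we are not recovering
        if rec.time <= backups[rec.structure][0]:
            continue  # assumed to be reflected in the backup already
        if rec.op == "put":
            contents[rec.structure].add(rec.msg_id)
        elif rec.op == "get":
            contents[rec.structure].discard(rec.msg_id)
    return contents


backups = {"APP1": (10, {"m1", "m2"}), "APP2": (12, set())}
qmgr_logs = {
    "QM1": [LogRecord(11, "APP1", "get", "m1"), LogRecord(15, "APP2", "put", "m5")],
    "QM2": [LogRecord(9, "APP1", "put", "m0"), LogRecord(13, "APP1", "put", "m3")],
}
recovered = recover(["APP1", "APP2"], backups, qmgr_logs)
```

Both structures are rebuilt from the same single sorted pass, which is the point Paul makes about recovering several structures with one read of the logs.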

Thanks
Paul


Paul Dennis
WebSphere MQ for z/OS Development
IBM Hursley



Hubert Kleinmanns <Hubert.Kleinmanns-aEaqbU0zMOo@public.gmane.org>
Sent by: MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>
31/10/2006 07:07
To: MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
Subject: Re: CF Recovery in a QSG Environment

Hello,

There are two MQSC commands, BACKUP CFSTRUCT and RECOVER CFSTRUCT, which allow a manual rebuild of the CF structures. As far as I know, these commands rebuild only the structure, not the contents. Rebuilding the shared queues will be done by the members of the QSG. Unlike local queues, the contents of shared queues are spread over the active logs of several QMgrs in the QSG. So you need not only one but all QMgrs of a QSG to rebuild shared queues.

Regards
Hubert


> -----Original Message-----
> From: hklein-eo0T4D/ZA8YIdKJ7tpkyPg@public.gmane.org
> Sent: 30.10.06 22:23:22
> To: MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
> Subject: CF Recovery in a QSG Environment


> Hello MQers.
>
> I have two questions regarding disaster recovery using QSG:
>
> 1) I was told that other products (DB2 was explicitly mentioned) rebuild
> the Coupling Facility lists and structures themselves if the CF is lost,
> without need of manual intervention. Does MQ do something similar?
> 2) Assuming a CF is really lost, is there any way to rebuild the shared
> queues as local queues from the MQ log?
>
> Thanks in advance for any help.
>
> Heinz Klein
> OLTP Tecnologia & Solucoes Ltda.
> Sao Paulo/SP - Brasil
>

-- 
Hubert Kleinmanns
Beratung / Schulung / Projektleitung

Tel.: +49 (0) 60 78 / 7 12 21
Fax: +49 (0) 60 78 / 7 12 25
Mobil: +49 (0) 178 / 6 97 22 54


Todd R Wyatt | 1 Nov 2006 13:16

Re: Cluster question;

Neil,

First let me state that I always recommend no more than two full 
repositories unless there is a good business reason to have more such as 
having one or two on each of three continents.  So I agree with most of 
what you said below.  However I've worked with many clusters in which 
we've had more than one FR down and the cluster remained healthy.  The key 
is in defining one and only one explicit CLUSSDR channel to a repository 
and making sure the FRs are set up to properly distribute that 
information.

If the PR cluster members have only one explicit CLUSSDR they will attempt 
to publish cluster activity on that channel as you say.  Publications to 
the secondary FRs are based on cluster workload balancing so if you have 
more than one possible destination the routing is resolved dynamically. If 
two of three FRs are down, the third one should get the publication.  The 
most frequent problem I see in this regard is that there is more than one 
explicit CLUSSDR defined.  If there are three explicit CLUSSDR channels in 
the example above, the QMgr would publish on only two regardless of which 
FRs were available.  Not sure how it determines which two but for the sake 
of this example let's assume they are the two with the earliest definition 
date and time.  If the latest FR is available and the earliest two are 
not, the publications will queue up for the two downed FRs, assuming all 
have an explicit CLUSSDR.
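
The behaviour described above can be sketched as a small Python model. It encodes only what this post claims: each publication goes to two FRs, explicitly defined CLUSSDR targets are used first whether or not they are up, and any remaining copy is resolved dynamically among the FRs that are available. The function name, the definition-order rule (which the post itself states only as an assumption), and the FR names are illustrative and are not taken from MQ.

```python
def publication_targets(explicit, all_frs, up, copies=2):
    """Return (delivered, queued) lists of FRs for one publication.

    explicit: FRs with an explicit CLUSSDR, in definition order (assumed rule).
    all_frs:  every full repository in the cluster.
    up:       set of FRs that are currently available.
    """
    # Explicit CLUSSDR targets are fixed, regardless of availability.
    targets = list(explicit[:copies])

    # Any remaining copies are routed dynamically to an available FR.
    for fr in all_frs:
        if len(targets) >= copies:
            break
        if fr not in targets and fr in up:
            targets.append(fr)

    delivered = [t for t in targets if t in up]
    queued = [t for t in targets if t not in up]
    return delivered, queued


frs = ["FR1", "FR2", "FR3"]
only_fr3_up = {"FR3"}

# One explicit CLUSSDR: the second copy finds the surviving FR.
one_explicit = publication_targets(["FR1"], frs, only_fr3_up)

# Three explicit CLUSSDRs: both copies are pinned to the downed FRs.
three_explicit = publication_targets(["FR1", "FR2", "FR3"], frs, only_fr3_up)
```

In this model the single explicit channel still gets one copy through to FR3, while the fully explicit configuration queues both copies for FR1 and FR2, which is the failure mode described above.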

The next step is to make sure that your repositories ALSO will publish 
successfully to one another.  Repository publications also prefer explicit 
CLUSSDR channels and also will send a duplicate publication.  The trick is 
to design a publication topology that ensures all repositories are updated 
and again the most common mistake is to define too many explicit CLUSSDR 
channels.  Frequently I see cases where a client has defined an explicit 
channel pair between every possible repository.  This locks each FR into a 
specific two publication destinations and if both are down the 
publications are queued up.  The preferred solution is to define only one 
explicit CLUSSDR on a FR to another FR and to join groups of three or more 
in a specific topology that routes messages around the cluster.  For 
example, a topology that works well with three FRs is to arrange them in a 
triangle with an explicit channel on each leg.  So FR1 points to FR2 
points to FR3 points to FR1.  In the example with two FRs on each 
continent, each FR points to its mate on that continent to ensure 
publications prefer a local destination first and then look to publish to 
a remote destination second.
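
As a rough check on the two topologies just described, the Python sketch below treats each explicit CLUSSDR as a directed edge and asks which FRs can be reached by following explicit channels alone. The FR names and the reachable helper are hypothetical. The model ignores the dynamically resolved second publication entirely, so it shows only what the explicit definitions guarantee by themselves.

```python
from collections import deque


def reachable(channels, start):
    """Return the set of FRs reachable from start via explicit channels only."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in channels.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Triangle: FR1 points to FR2 points to FR3 points to FR1.
triangle = {"FR1": ["FR2"], "FR2": ["FR3"], "FR3": ["FR1"]}

# Two FRs per continent, each pointing at its local mate (names are made up).
pairs = {
    "EU1": ["EU2"], "EU2": ["EU1"],
    "US1": ["US2"], "US2": ["US1"],
    "AP1": ["AP2"], "AP2": ["AP1"],
}

triangle_ok = all(reachable(triangle, fr) == set(triangle) for fr in triangle)
eu_reach = reachable(pairs, "EU1")
```

With one explicit channel per FR, the triangle reaches every repository from any starting point. In the paired layout the explicit channels cover only the local mate, so the remote continents depend on the second, dynamically routed publication.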

If the explicit CLUSSDRs are defined with care, I have had no problems 
with topologies supporting more than two FRs.  But again, I don't 
recommend it without a valid reason.

One final note - ALWAYS host your FR on a dedicated QMgr which does not 
also host application queues.  There are a couple of reasons for this:

1) You want to be able to take the FR down and apply maintenance to it 
without having to get on some application's release schedule.  If there is 
a problem with clustering that IBM releases a fix for, in theory you 
should be able to apply it during the day with no downtime by taking down 
one FR at a time.  You are limited here only by your change control 
procedures IF there is no application on the same QMgr.

2) If you are troubleshooting a cluster problem on an application node, 
you may need to issue a REFRESH CLUSTER command to pick up a new or 
missing queue definition.  If you are troubleshooting on a FR you might 
want to issue a RESET CLUSTER command to drop and rejoin a PR.  These 
commands are, in practice, mutually exclusive.  If you issue the REFRESH 
on a FR, you may generate a LOT of cluster traffic and find this 
disruptive to the cluster but if you host an app on an FR you may have no 
choice.  If you are trying to synch up a couple of repositories and issue 
a RESET, your app hosted on the same FR QMgr may lose visibility to 
cluster queues.

-- T.Rob

T.Robert Wyatt, Consulting IT Specialist 
IBM Software Services for Websphere
email: t.rob.wyatt@...
704-719-2107 Access Line

MQSeries List <MQSERIES@...> wrote on 10/31/2006 04:05:16 PM:

> 
> Hi Douglas, 
> 
> I would recommend that the new machines are defined as partial
> repositories.
> 
> The manuals show that more than 2 full repositories are supported, 
> and it always sounds like a good idea. However, in my experience the
> actual implementation of repository update strategies by the MQ 
> software means that having more than 2 full repositories doesn't 
> really help, and can make MQ behave in unexpected ways. 
> 
> The core of the issue is that when a queue manager sends a 
> repository update or inquiry, it sends it to exactly 2 full 
> repositories. It will always send it to the FR for which there is a 
> static sender channel defined, and it will send to 1 other, 
> determined in some way by the software. 
> 
> The problem comes when FRs are down (either planned or unplanned). 
> One might expect (I certainly did) that as long as at least 1 FR is 
> up, your cluster should be healthy. Unfortunately, the repos update 
> strategy means that if you have say 3 FRs, and the 2 which your PR 
> sends updates to are down, then no updates or requests from that PR 
> can be handled. This leads to things like issuing a suspend command,
> which gets processed, but the FR and the rest of the cluster don't 
> get to see it unless one of the 2 update FRs is up. Messages keep 
> arriving at the 'suspended' queue manager. 
> 
> So, the bottom line is that no matter how many FRs are in your 
> cluster, the cluster is not healthy unless a maximum of 1 FR is 
> unavailable. Statistical analysis shows that the more FRs there are,
> the more likely it is that more than one will be unavailable, and 
> therefore the more likely it is that your cluster could have a problem. 
> 
> So my belief is that 2 FRs is a special number which is always the 
> right number to use. 
> 
> FRs and PRs which host application queues participate equally in load
> balancing, although messages will not leave a queue manager which 
> hosts a target queue unless you are running version 6, where there 
> is an option available to load balance away from the current 
> machine. At v5, if one of the target queues in the cluster is on the
> current queue manager, the messages will always go there. 
> 
> My last piece of advice on clusters is that if you have a 
> requirement for timely performance in a request/response model, with
> messages flowing via a cluster, don't put the FRs on a queue manager
> which has application queues. Use dedicated queue managers (even if 
> you run them on the same machines). I am chasing some performance 
> issues at my site where messages get delayed when the FR amqrrmfa 
> process gets busy managing cluster stuff, and it stops responding 
> to application requests in a timely manner.
> 
> Regards, 
> 
> Neil Casey 
> Lead Technical Specialist, Middleware MQ Support 
> Technology Operations 
> National Australia Bank 
> 
> Level 1, 122 Lewis Rd  (Tue Thu)
> Tel: +61 (0) 3 9886 2375  |  Fax: +61 (0) 3 9886 2700  |  Mob: +61 (0) 408 356 208
> Pier 4, Level 8, 800 Bourke St  (Mon Wed Fri)
> Tel: +61 (0) 3 8634 2205  |  Fax: +61 (0) 3 8634 3788  |  Mob: +61 (0) 408 356 208
> Email: Neil.Casey@...
> 
> 
> 
> 

> 
> "Clark, Douglas" <dclark@...>
> Sent by: MQSeries List <MQSERIES@...>
> 01/11/2006 07:08
> Please respond to: MQSeries List <MQSERIES@...>
> To: MQSERIES@...
> Subject: Re: Cluster question;
> 
> 
> 
> 
> Due to a significant increase in the number of messages we will process
> for Christmas I have to prepare for the addition of machines to my
> current environment.
> 
> Currently, I have four client machines and four queue managers (WC1,
> WC2, WC3, WC4) sending data to a clustered queue (PROD_Q1) defined on
> two server machines with queue managers WS1 and WS2, each hosting a full
> repository.
> 
> I am planning to add four more client machines and queue managers (WC5,
> WC6, WC7, WC8) as well as two more server machines and queue managers
> (WS3 and WS4).
> 
> The queue managers WS3 and WS4 will also have defined the same clustered
> queue (PROD_Q1) and must be able to receive data from all eight client
> machines (the original 4 and the 4 new clients recently added).
> 
> Should I make queue managers WS3 and WS4 full repositories or just
> partial repositories?  Will the PROD_Q1 on the two new server machines
> participate equally in the load balancing done for WS1 and WS2?
> 

Heggie, Peter | 1 Nov 2006 14:37

Re: MQ v6 z/os remote administration of MQ v5.3 distributed

I'm 99% sure that the message was non-persistent, and I am 100% sure that the z/os qmgr has a dead letter queue.
 
I increased the wait time to 60 seconds. In that time I was able to go into the z/os MQ operational controls ISPF application and copy the name of the CSQUTIL temporary reply queue, and then go to the Windows qmgr and create a remote queue that pointed to that temporary queue on z/os, and then run amqsput to send messages to that temporary queue. I switched back to ISPF and invoked PQ Edit on that temporary queue, and saw my manually submitted messages. The CSQUTIL batch job ended with an error - it was unable to close the temporary queue - I assume it was because I had it open in PQ Edit, and the CSQUTIL utility wanted to close w/delete. The messages remained in that temporary queue.
 
I will assume that the reply messages sent by the remote qmgr in response to the CSQUTIL commands are in fact read by CSQUTIL but are discarded, perhaps because of the invalid PCF format, or maybe because the CorrelID does not match. (Does the CSQUTIL read of reply messages depend on CorrelID? Perhaps not, as the messages are sent to a temporary queue dedicated to that request. Or is the temporary queue dedicated to the session?)
 
I will also assume that what I am doing is not supported. I am curious, but I want to give my users something. I may ask them to use a Windows-based utility to run this process, and then I can use MO72.
 
From: MQSeries List [mailto:MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org] On Behalf Of Russell Finn
Sent: Wednesday, November 01, 2006 4:21 AM
To: MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
Subject: Re: MQ v6 z/os remote administration of MQ v5.3 distributed


If the message is non-persistent and the z/OS queue manager has no dead letter queue defined and the channel is defined using NPMSPEED(FAST) and the target/reply queue on z/OS can not be put to (full, put-disabled, authority failure etc) then I think the message can't go anywhere and so is discarded.  A report message would be generated if requested in the MQMD.Report field though.

Might you be in this situation?  If so, configure a DEADQ on the z/OS QMgr and/or use NPMSPEED(NORMAL)
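
The combination of conditions above can be written out as a small Python decision function. This is only a restatement of the scenario as described in this post (which is itself hedged with "I think"), not a model of real channel behaviour. The Delivery type, its field names, and the outcome strings are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    persistent: bool                  # is the message persistent?
    dead_letter_queue_defined: bool   # does the receiving qmgr have a DEADQ?
    npmspeed: str                     # channel setting: "FAST" or "NORMAL"
    target_queue_putable: bool        # False if full, put-disabled, not authorised, etc.


def outcome(d: Delivery) -> str:
    """Classify what happens to a message arriving over a channel."""
    if d.target_queue_putable:
        return "delivered"
    if d.dead_letter_queue_defined:
        return "dead-lettered"
    if not d.persistent and d.npmspeed == "FAST":
        # All four conditions from the post hold: nowhere for the message to go.
        return "discarded"
    # Anything else is outside the scenario described; this sketch only
    # records that the message is not silently thrown away.
    return "not discarded"


suspected = Delivery(persistent=False, dead_letter_queue_defined=False,
                     npmspeed="FAST", target_queue_putable=False)
with_deadq = Delivery(persistent=False, dead_letter_queue_defined=True,
                      npmspeed="FAST", target_queue_putable=False)
with_normal = Delivery(persistent=False, dead_letter_queue_defined=False,
                       npmspeed="NORMAL", target_queue_putable=False)
```

Either suggested change (defining a DEADQ or switching to NPMSPEED(NORMAL)) moves the message out of the "discarded" branch in this model.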

Russell

Russell Finn                    
MQSeries System Test    
russell_finn-ygUJEDcBm8rQT0dZR+AlfA@public.gmane.org




"Heggie, Peter" <Peter.Heggie-IPhzozbAnH1+cjeuK/JdrQ@public.gmane.org>
Sent by: MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>
31/10/2006 18:41
Please respond to: MQSeries List <MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org>
To: MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
Subject: MQ v6 z/os remote administration of MQ v5.3 distributed

 
I'm trying to use the z/os CSQUTIL batch utility to issue commands to a
remote queue manager. The remote qmgr is a Windows v5.3 qmgr. I ran into
some security problems and resolved them, to the point where I can issue
a Display Qlocal command to a remote qmgr without the request ending up
in the dead letter queue of the remote qmgr. But I never get the reply.

I can see that the channel status of the reply route channel increases
by one, indicating (I think) that a message is sent back to the (z/os)
requestor, but I don't see the message anywhere. There is nothing in the
z/os DLQ. There are no messages in the z/os started tasks and no
messages in the system (SDSF) log. The CSQUTIL batch job reports a
timeout after waiting the specified RESPTIME.

As I know that MQ does not lose messages, I am puzzled as to why the
channel status of the 'reply' channel shows that a message was sent, yet
there is no sign of that message.

A further clue is that, before I fixed the last security problem, a
message showed up on the remote qmgr DLQ with a 'NOT AUTHORIZED' error
on the (temporary) reply queue on z/os. This queue name is
SYSTEM.CSQUTIL.BFA2F16F88E9DF82. This queue did exist, for 30 seconds
(RESPTIME), on the z/os qmgr, while the CSQUTIL utility waited for a
reply. Perhaps I incorrectly read into this error that there was an
actual reply to the remote admin request, and it was on its way back to
z/os, but was stopped by security. I gave the userid of the request the
ability to Put to the transmit queue pointing back to the requestor
(z/os) and after that got no more messages on the remote DLQ.

I also noticed that this 'reply' message actually had no content in it,
aside from the original request. There was no SYSTEM.COMMAND.INPUT queue
on the Windows qmgr, so I created an alias by that name, pointing to the
SYSTEM.ADMIN.COMMAND.QUEUE. I can see that the Windows command server
process has this queue open exclusively. And the previously mentioned
'reply' message was sent by this command server.

My question to all is - what happened to the reply message?

Perhaps a better question is - is there a better way to display the
properties of a remote queue located on a distributed v5.3 qmgr, from a
z/os v6 qmgr, without writing a program for the distributed qmgr?


Paul S Dennis | 1 Nov 2006 14:42

Re: CF Recovery in a QSG Environment


Heinz,

1) This is controlled by the EXCLINT parameter on the BACKUP CFSTRUCT command. It basically says that only data older than this value should be backed up. The default is 30 secs. The assumption is that most data on shared queues is short-lived, so by only backing up data over a certain age we can optimise the backup processing.

2) My understanding is that you just have to issue the RECOVER CFSTRUCT command, although I am not totally up to speed with where the CFRM policy exists etc.! The size and attributes of the structure are defined in the CFRM policy, and this is used by the CF when we need to allocate the structure. Basically MQ will attempt to connect to the structure, and if it doesn't exist then it will get allocated with whatever attributes are defined in the CFRM policy.

3) You need to issue the RECOVER CFSTRUCT command to initiate recovery. I certainly didn't intend to imply anything else. What I did say, though, was that you don't need to have all the qmgrs in the QSG running to be able to recover the structure.
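
The age rule in point 1 can be illustrated with a few lines of Python. The sketch assumes each message is a (msg_id, put_time, persistent) tuple with times in seconds, and treats "older than" as a strict comparison. Both are assumptions for illustration, as is the function name. Only the 30-second default comes from the text above.

```python
def select_for_backup(messages, now, exclint=30):
    """Pick the messages a backup taken at time `now` would include.

    messages: iterable of (msg_id, put_time_seconds, persistent) tuples.
    exclint:  exclusion interval in seconds; younger data is skipped.
    """
    return [
        msg_id
        for (msg_id, put_time, persistent) in messages
        # Only persistent messages older than the exclusion interval qualify.
        if persistent and (now - put_time) > exclint
    ]


messages = [
    ("m1", 0, True),    # persistent, 100 s old at backup time
    ("m2", 80, True),   # persistent, only 20 s old
    ("m3", 0, False),   # old, but non-persistent
]
backed_up = select_for_backup(messages, now=100)
```

Short-lived messages such as m2 are skipped on the expectation that they will be consumed before a backup would ever be needed. If they are still on the queue when a structure fails, they are rebuilt from the log records written after the backup.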

Thanks
Paul


Paul Dennis
WebSphere MQ for z/OS Development
IBM Hursley



Heinz Klein <hklein <at> oltp.com.br>
Sent by: MQSeries List <MQSERIES <at> LISTSERV.MEDUNIWIEN.AC.AT>
01/11/2006 00:54
Please respond to: hklein-eo0T4D/ZA8YIdKJ7tpkyPg@public.gmane.org
To: MQSERIES-0lvw86wZMd9k/bWDasg6f+2wyY2g16FtwPuJ0ROkVbw@public.gmane.org
Subject: Re: CF Recovery in a QSG Environment





Paul.

Thank you very much for the explanation. However, I will grab the opportunity and, if you allow me, press a little further:
1) Can you please elaborate on what you mean by "persistent messages over a certain age"?
2) Assuming the CF structure was completely wiped out (failure or something equivalent to a power-on reset), do the people responsible for the CF need to do something (define lists and/or structures) before MQ recovers the queues, or can the whole recovery process be left to MQ?
3) My understanding was that the recovery process of a failed CF would need to be initiated by a RECOVER CFSTRUCT command. Somehow I got the impression from your text that even without this command being issued MQ would do the recovery of the shared queues upon restart. Is this correct?

Thank you again.

Heinz


Heinz Klein | 1 Nov 2006 16:07

CF recovery in a QSG environment

Yvette.

Thank you for the reply. My customer HAS two CFs, but they still insist on considering the scenario where BOTH fail beyond recovery. A pretty pessimistic approach, in my opinion, but I need to have the appropriate answers...

As to the MAKEDEF, it will definitely exist, but we would like to recover the messages as well.

Cheers.

Heinz

Carroll, Y. (Yvette) wrote:

Hi

We were worried about exactly the same thing (loss of the CF) when we put in queue sharing.  So we put in two CFs to try and cater for the loss of one.

And I THINK, if you set up a makedef for your shared queues, as an object backup method, and then do global changes to set up the defs as local instead of shared, you might be able to use that to rebuild the queues as local after a failure.

Kind regards
Yvette
