cbbrowne | 1 Jul 2006 04:36

Re: why set "forward" flag in SUBSCRIBE SET?

> Hi all,
>
> I'm not sure what the "forward" flag really does in the SUBSCRIBE SET
> slonik function.  Theoretically, you could always set "forward=yes"
> and then never subscribe anything to it--then there would be no
> reason to ever set it to "forward=no"
>
> Any insights?

It controls whether or not the subscriber records changes in its own
sl_log_1/sl_log_2 tables...

If forward=N, then you can't use that node as a source for another node.

In order to have the feeding...

A --> B --> C --> D

B needs to subscribe to A, with forwarding on;
C needs to subscribe to B, again, with forwarding on;
that allows D to subscribe to C; in that case, forwarding is optional.

If B had forwarding turned off, you couldn't set up a subscription from B
to C...
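
The rule for the A --> B --> C --> D chain can be sketched as a small check.
This is only a toy model, not anything Slony-I ships: the dictionary layout
and the function name invalid_providers are made up for illustration, and
nothing here talks to a real cluster.

```python
# Hypothetical model of a cascade: receiver -> (provider, forward flag).
# Node names follow the A --> B --> C --> D example above.
subscriptions = {
    "B": ("A", True),   # B subscribes to A, forwarding on
    "C": ("B", True),   # C subscribes to B, forwarding on
    "D": ("C", False),  # D is the last hop, so forwarding is optional
}


def invalid_providers(subs, origin):
    """Return receivers whose provider is neither the origin nor a forwarding subscriber."""
    bad = []
    for receiver, (provider, _forward) in sorted(subs.items()):
        if provider == origin:
            continue  # the origin can always act as a provider
        provider_sub = subs.get(provider)
        if provider_sub is None or not provider_sub[1]:
            bad.append(receiver)
    return bad


print(invalid_providers(subscriptions, origin="A"))  # []

# Turning forwarding off on B leaves C without a usable provider.
broken = dict(subscriptions, B=("A", False))
print(invalid_providers(broken, origin="A"))  # ['C']
```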

Does that help answer the question?

There's another detail, too, that comes up if you do a failover.  This may
seem obscure...


Christopher Nielsen | 3 Jul 2006 23:14

Re: db_getLocalNodeId() error

> > Shortly after, the slon instance connecting to the provider database began
> > emitting an error I'm not sure how to interpret:
> >
> >    ERROR  remoteListenThread_5: db_getLocalNodeId() returned 4 - wrong database?

> This means that, on the node where you are getting that error, the
> sl_path.pa_conninfo used to connect to node 5 actually contains the
> information for connecting to node 4.

> So you are looking for an sl_path entry that has pa_server = 5 and
> pa_client = <the one that spits the error>. Look twice, I know that it
> is wrong :-)

Oh, I do see a lot of incorrect connection information in sl_path.
Thanks!  Can I fix this by updating sl_path.pa_conninfo directly on my
one provider?

Also, I notice that one of my clients has only one row in sl_path, and in
that row pa_conninfo is set to '<event pending>'.

I know this node hasn't been receiving transactions. Can I fix this
somehow without dropping my nodes and adding them again?

Thanks very much,

Chris
cbbrowne | 4 Jul 2006 00:33

Re: db_getLocalNodeId() error

>> > Shortly after, the slon instance connecting to the provider database
>> > began emitting an error I'm not sure how to interpret:
>> >
>> >    ERROR  remoteListenThread_5: db_getLocalNodeId() returned 4 - wrong database?
>
>> This means that, on the node where you are getting that error, the
>> sl_path.pa_conninfo used to connect to node 5 actually contains the
>> information for connecting to node 4.
>
>> So you are looking for an sl_path entry that has pa_server = 5 and
>> pa_client = <the one that spits the error>. Look twice, I know that it
>> is wrong :-)
>
> Oh, I do see a lot of incorrect connection information in sl_path.
> Thanks!  Can I fix this by updating sl_path.pa_conninfo directly on my
> one provider?
>
> Also, I notice that one of my clients has only one row in sl_path, and in
> that row pa_conninfo is set to '<event pending>'.
>
> I know this node hasn't been receiving transactions. Can I fix this
> somehow without dropping my nodes and adding them again?

I'd be inclined to fix this by doing a fresh set of STORE PATH requests,
and letting them propagate.

Write up one slonik script that has ALL the paths, the way they should be.
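
One way to avoid hand-typing mistakes in such a script is to generate the
STORE PATH statements from a single table of connection strings.  The
following is only a sketch: the node ids, host names and conninfo values are
placeholders, it assumes you want a path between every pair of nodes, and the
output still needs the usual slonik preamble in front of it.

```python
from itertools import permutations

# Hypothetical conninfo strings; substitute the real ones for each node.
conninfo = {
    4: "dbname=mydb host=node4.example.com user=slony",
    5: "dbname=mydb host=node5.example.com user=slony",
}


def store_path_statements(conninfo):
    """Yield one STORE PATH statement per ordered (server, client) pair."""
    for server, client in permutations(sorted(conninfo), 2):
        # pa_conninfo is what the client uses to reach the server,
        # so it must be the *server's* connection string.
        yield (
            f"store path (server = {server}, client = {client}, "
            f"conninfo = '{conninfo[server]}');"
        )


for statement in store_path_statements(conninfo):
    print(statement)
```

Because each conninfo is looked up by the server id, a path for server 5 can
never end up carrying node 4's connection string, which is the mix-up behind
the db_getLocalNodeId() error.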


Victoria Parsons | 4 Jul 2006 12:45

problems after move set

Hi All,
 
I have a 4 node test setup. It is very simple, with one set, containing one table. The origin is node 1. The view of sl_listen is as below. All nodes listen for events directly from the origin node of an event. (i.e. provider of events from node 3 is always node 3). Great so far..
 
 li_origin | li_provider | li_receiver
-----------+-------------+-------------
         1 |           1 |           2
         1 |           1 |           3
         1 |           1 |           4
         2 |           2 |           1
         2 |           2 |           3
         2 |           2 |           4
         3 |           3 |           1
         3 |           3 |           2
         3 |           3 |           4
         4 |           4 |           1
         4 |           4 |           2
         4 |           4 |           3
After issuing a move set from node 1 to node 2 (which was successful) the listen table now looks like this:
 
 li_origin | li_provider | li_receiver
-----------+-------------+-------------
         1 |           1 |           2
         1 |           1 |           3
         1 |           1 |           4
         2 |           2 |           1
         2 |           1 |           3
         2 |           1 |           4
         3 |           3 |           1
         3 |           3 |           2
         3 |           3 |           4
         4 |           4 |           1
         4 |           4 |           2
         4 |           4 |           3
All db changes now happen in node 2, but instead of listening to node 2 directly, nodes 3 and 4 are getting the changes via provider node 1. This is not what I had expected, as presumably the point of moving a set is quite often to then take down the old origin. Anyway, I thought no worries, I'll just send a couple of store path commands to get nodes 3 and 4 listening to node 2 directly.
 
I did this using
<slon preamble>
store listen (origin=2, provider=2, receiver=3);
store listen (origin=2, provider=2, receiver=4);
This seemed to have no effect at all and I can't see any evidence of the store listen commands being processed.
 
Is what I am doing correct? Is it possible to now get nodes 3 and 4 listening directly to node 2, so I can drop node 1 from the replication completely?
 
Thanks for your help,
Vicki
 
 
 
 


This message should be regarded as confidential. If you have received this
email in error please notify the sender and destroy it immediately.
Statements of intent shall only become binding when confirmed in hard copy
by an authorized signatory.
_______________________________________________
Slony1-general mailing list
Slony1-general@...
http://gborg.postgresql.org/mailman/listinfo/slony1-general
Brad Nicholson | 4 Jul 2006 20:01

Re: problems after move set

Victoria Parsons wrote:

> All db changes now happen in node 2, but instead of listening to node 2
> directly, nodes 3 and 4 are getting the changes via provider node 1. This
> is not what I had expected, as presumably the point of moving a set is
> quite often to then take down the old origin. Anyway, I thought no
> worries, I'll just send a couple of store path commands to get nodes 3
> and 4 listening to node 2 directly.
>  
> I did this using
> <slon preamble>
> store listen (origin=2, provider=2, receiver=3);
> store listen (origin=2, provider=2, receiver=4);
> 
> This seemed to have no effect at all and I can't see any evidence of the store listen commands being processed.

In some versions of Slony, store listen is a no-op (it will work again
as of 1.2).  It looks like you have one of those versions.  The store path
command automatically generates the listen paths for you; if you re-run
it, it should generate the paths.

Alternatively, you can insert the rows directly in sl_listen on all
nodes and restart the slons.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.
David Gagnon | 4 Jul 2006 22:13

Is slony suitable to replicate a dbServer with 250+ db with 150 tables each?

Hi all,

  I think the subject of this mail tells everything.  I searched several
lists and haven't found an answer to my question.  If Slony-I cannot do
the job, can Mammoth do it?  Or is there any other solution to replicate
data on a slave server?

Thanks for your help!
Best Regards
/David
cbbrowne | 5 Jul 2006 14:15

Re: problems after move set


> I did this using
> <slon preamble>
> store listen (origin=2, provider=2, receiver=3);
> store listen (origin=2, provider=2, receiver=4);
>
> This seemed to have no effect at all and I can't see any evidence of the
> store listen commands being processed.

STORE LISTEN was turned into a NOOP in version 1.1, which is probably a
mistake, albeit well-intentioned.  The paths are calculated based
primarily on what subscriptions you have in place.

The algorithm for calculating the paths in sl_listen changes in 1.2, for
the better, but it seems to me that STORE LISTEN should still be available
even though you'd usually prefer not to use it...

> Is what I am doing correct? Is it possible to now get nodes 3 and 4
> listening directly to node 2, so I can drop node 1 from the replication
> completely?

Well, do you have paths in sl_path for that?  If you do, then they can do so.

When you drop node 1, the paths in sl_listen will get reshaped based on
the way things look after the node is gone.

The two followup questions would be:

1.  Are there entries in sl_path for nodes 3+4 to get directly to node 2?

2.  What is the shape of the subscriptions in sl_subscribe?  The bad
case would be if any node were still being provided data by node 1.

You can't drop node 1 until any subscribers that go to node 1 are
redirected to some other provider.
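
The check behind question 2 can be sketched as follows.  The rows are
illustrative values in the shape of sl_subscribe (set, provider, receiver),
and the function name receivers_fed_by is made up for this example; in
practice the rows would come from querying sl_subscribe on one of the nodes.

```python
# Illustrative rows in the shape of sl_subscribe:
# (sub_set, sub_provider, sub_receiver)
subscriptions = [
    (1, 1, 3),
    (1, 1, 4),
    (1, 2, 1),
]


def receivers_fed_by(node_id, rows):
    """Return (set, receiver) pairs that still use node_id as their provider."""
    return [(s, r) for (s, p, r) in rows if p == node_id]


blocking = receivers_fed_by(1, subscriptions)
print(blocking)  # [(1, 3), (1, 4)]
```

As long as that list is non-empty for node 1, the node is still acting as a
provider and should not be dropped.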
cbbrowne | 5 Jul 2006 14:27

Re: Is slony suitable to replicate a dbServer with 250+ db with 150 tables each?

> Hi all,
>
>   I think the subject of this mail tells everything.  I searched several
> lists and haven't found an answer to my question.  If Slony-I cannot do
> the job, can Mammoth do it?  Or is there any other solution to replicate
> data on a slave server?

You mean that you have a PostgreSQL postmaster that has 250 databases
(e.g. - you ran createdb 250 times) and each of those has ~150 tables?

That seems like a case that will not likely turn out well with Slony-I.

The trouble is that Slony-I replicates from *a database* at a time.

If you have 250 databases on that PG cluster, then you need to have 250
slon processes, and, more than likely, 250 Slony-I clusters to manage.

Further, supposing you want to have two backup servers, that'll mean...

- 750 slon processes
- For each one, probably 4 connections, hence 3000 database connections,
1000 per postmaster

You'll find memory consumption to be hurtful, and this represents a pretty
big "thundering herd" of replication apparatus.
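
The arithmetic behind those figures, as a small sketch.  The function name
replication_load is made up, "servers = 3" assumes one origin plus the two
backup servers, and 4 connections per slon is the rough estimate given above
rather than a measured value.

```python
def replication_load(databases, servers, connections_per_slon):
    """Return (slon processes, total connections, connections per postmaster)."""
    # One slon per replicated database on every server in the cluster.
    slon_processes = databases * servers
    total_connections = slon_processes * connections_per_slon
    # Assumes the connections spread evenly across the postmasters.
    per_postmaster = total_connections // servers
    return slon_processes, total_connections, per_postmaster


# 250 databases, origin plus two backups, roughly 4 connections per slon.
print(replication_load(databases=250, servers=3, connections_per_slon=4))
# (750, 3000, 1000)
```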

I'm not sure if Mammoth would provide you a better or worse experience, in
this scenario.  I think you'll need to contact the vendor about that.

The other possibility that I'd point to for situations like this is the
new-in-version 8 Point In Time Recovery system for PostgreSQL.  It will
NOT allow you to use the standby server(s) to respond to queries, alas. 
But it would inexpensively address your "explosion" of schemas.
Victoria Parsons | 5 Jul 2006 15:33

Re: problems after move set

Hi Chris

Thanks for the advice. I have recently moved from using 1.1.0 to 1.1.5,
and hadn't realised STORE LISTEN was dead (I am using a link to 1.1.0
docs). In answer to your follow up questions:

I have a complete set of cross paths for all nodes to every other node.
I set these up as each node is added to the cluster for the purpose of
having an easy time if I need to move set / failover in the future. 

I also have a complete set of listens, whereby every node listens to
every other node directly (origin==provider)

Before the move (originally all nodes receive from node 1) the subscribe
table is as below

 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            1 |            2 | t           | t
       1 |            1 |            3 | t           | t
       1 |            1 |            4 | t           | t

After the move set, node 2 accepts data (move commands below)

<preamble>
Lock set (id=1, origin=1);
Wait for event (origin=1, confirmed=all);
Move set (id=1, old origin=1, new origin=2);
Wait for event (origin=1, confirmed=all);

I am now left with a set that has origin on node 2, but nodes 3 and 4
are still getting their data via node 1.

 sub_set | sub_provider | sub_receiver | sub_forward | sub_active
---------+--------------+--------------+-------------+------------
       1 |            1 |            3 | t           | t
       1 |            1 |            4 | t           | t
       1 |            2 |            1 | t           | t

The other change I see is that in sl_listen, two entries have changed. I
now have: 
origin=2, provider=1, receiver=3
origin=2, provider=1, receiver=4

Before the move these lines had provider=2, but were unused I suppose as
no data came from node 2 at the time.

So I have all the right paths set up, but I cannot change the listens
directly. What can I do to get nodes 3 and 4 listening/subscribing
directly to node 2?

My other question is, what do you expect to happen to the subscribe
table after a move set? From your question 2 it sounds like you don't
expect node 1 to be a provider any more. Is this a bug?

Thanks,
Vicki


Richard Yen | 5 Jul 2006 18:35

Re: why set "forward" flag in SUBSCRIBE SET?

Chris,

Thanks for the detailed response and descriptions, but I think you
addressed the ramifications of setting "forward=yes" while my
question addressed why anyone would set "forward=no".

Based on your descriptions, it seems like there is absolutely no
reason to set "forward=no," except for better performance.  Is the
increase in performance significant if I set "forward=no"?  Are there
other justifications for setting "forward=no"?

Thanks for the help!
--Richard

On Jun 30, 2006, at 7:36 PM, cbbrowne@... wrote:

>> Hi all,
>>
>> I'm not sure what the "forward" flag really does in the SUBSCRIBE SET
>> slonik function.  Theoretically, you could always set "forward=yes"
>> and then never subscribe anything to it--then there would be no
>> reason to ever set it to "forward=no"
>>
>> Any insights?
>
> It controls whether or not the subscriber records changes in its own
> sl_log_1/sl_log_2 tables...
>
> If forward=N, then you can't use that node as a source for another node.
>
> In order to have the feeding...
>
> A --> B --> C --> D
>
> B needs to subscribe to A, with forwarding on;
> C needs to subscribe to B, again, with forwarding on;
> that allows D to subscribe to C; in that case, forwarding is optional.
>
> If B had forwarding turned off, you couldn't set up a subscription from B
> to C...
>
> Does that help answer the question?
>
> There's another detail, too, that comes up if you do a failover.  This may
> seem obscure...
>
> Suppose you have nodes A, B, C.
>
> A begins as the origin.
>
> B subscribes to A; forwarding turned on.
>
> C subscribes to A; forwarding turned OFF.
>
> Suppose A falls over.  The only failover option is to go to node B; you
> can only failover to a forwarding node.
>
> There is a risk to having forwarding turned off on C; suppose B was only
> up to date to event # 8901, whereas C was a bit ahead of that; it was up
> to event 8904.  (Apparently node B wasn't replicating for a few seconds,
> or something...)
>
> Alas, you cannot keep node C.  There is no way to bring B up to event
> #8904, as no remaining node has the data for SYNCs 8902, 8903, and 8904.
>
> Resulting "rule of thumb":  Never have a node that is directly connected
> to the origin have forwarding turned off...
>
> If you have a single node at a remote site, and you know you'd never
> consider failing over there, that node would be good to have forwarding
> turned off on, as it saves a bunch of work populating sl_log_1.
>
>> Also, if I'm setting up a master->relay->offsite_backup architecture,
>> would I subscribe them all to the same set, having the relay node
>> forward to the offsite_backup node?  That's the way I got it to work,
>> but I'm not sure if there's another way, like, say, building a second
>> set on the relay machine, and have the offsite_backup subscribe to
>> that set (I'm under the impression that such a setup isn't possible).
>
> I'm not sure quite what you mean by the "another way"; what you've done
> seems appropriate as how to handle cascaded subscribers...
>
>
>
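
The failover scenario quoted above (nodes A, B, C with events 8901 and 8904)
can be sketched as a toy model.  This is a simplification for illustration,
not Slony-I's actual failover logic: the data layout and the function name
failover_outcome are made up, and it assumes the backup node can catch up
from any surviving forwarding node but never from a non-forwarding one.

```python
# Hypothetical state at the moment origin A fails.
# forward:    whether the node keeps replication data in sl_log_1/sl_log_2
# last_event: last SYNC from A that the node has applied
nodes = {
    "B": {"forward": True, "last_event": 8901},
    "C": {"forward": False, "last_event": 8904},
}


def failover_outcome(nodes, backup):
    """Return (event the backup can reach, nodes that cannot be kept)."""
    if not nodes[backup]["forward"]:
        raise ValueError("can only fail over to a forwarding node")
    # Only forwarding nodes hold log data that the backup could replay.
    reachable = max(s["last_event"] for s in nodes.values() if s["forward"])
    # Any node ahead of that point has data nobody else can reproduce.
    lost = sorted(n for n, s in nodes.items() if s["last_event"] > reachable)
    return reachable, lost


print(failover_outcome(nodes, backup="B"))  # (8901, ['C'])
```

With forwarding turned on for C, the same call would report event 8904 as
reachable and no nodes lost, which is the point of the rule of thumb.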
