Fernando Schapachnik | 11 Mar 2005 17:32

Re: ERROR remoteListenThread_2: db_getLocalNodeId() returned 1 - wrong database?

Found the error. My "replicate all sequences" script also included sequences 
from the _CLUSTERNAME schema in the replication set.
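
A minimal sketch of the kind of filter that avoids this mistake, assuming the script already has a list of (schema, sequence) pairs and that the cluster schema is named "_" plus the cluster name. The function name, the cluster name "mycluster" and the sequence names are illustrative, not taken from the actual script:

```python
def replicable_sequences(sequences, cluster_name):
    """Return the (schema, sequence) pairs that are safe to add to a set.

    Anything living in the cluster's own schema ("_" + cluster_name) is
    skipped, since those objects belong to the replication system itself.
    """
    cluster_schema = "_" + cluster_name
    return [(schema, name) for schema, name in sequences
            if schema != cluster_schema]


# Hypothetical input: two application sequences and one from the cluster schema.
found = [
    ("public", "orders_id_seq"),
    ("_mycluster", "sl_event_seq"),
    ("public", "users_id_seq"),
]
print(replicable_sequences(found, "mycluster"))
```

Only the two "public" sequences remain in the result; the one in "_mycluster" is dropped.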

Thanks for the help anyway!

Fernando.

In an earlier message, Fernando Schapachnik wrote:
> In an earlier message, Christopher Browne wrote:
> > My suspicion is that there's some confusion taking place as to which 
> > host is which.
> > 
> > Try logging in using the DSN (e.g. - dbname=mydb host=slavehost 
> > user=slony) and see what you get...
> 
> Same thing...
> 
> > 
> > Is it possible that DNS resolution is messing up somehow, that 
> > "slavehost" is pointing to the wrong server?
> 
> No, DNS is OK. Are the examples OK? I mean, is it right for the slave slon 
> process to point to the DB on the slave host, or should it be the other way 
> around?
> 
> Regards.
> 
> Fernando.
Jan Wieck | 11 Mar 2005 16:48

Re: disabling triggers

On 3/10/2005 3:32 PM, David Parker wrote:

> This is not strictly a slony question, but it has to do with some slony
> procedure code I would like to emulate... I have a data broadcasting
> functionality that is layered on top of a slony installation that I use
> to get data replicated out to nodes from a central location without
> those nodes having to be part of this slony cluster (all nodes are
> actually pairs, Active/Standby, and we are using our trusty slony to
> maintain data redundancy between them).
>  
> When a node subscribes I want to do an initial delete/copy of a set of
> tables the way slony does, but I need to get around FK constraints. My
> understanding of the altertableforreplication procedure is:
>  
>      - for each table that needs to be altered and has
> triggers/constraints
>             -update the pg_triggers record for each trigger to point to
> the table's primary key object (maintained
>             in the sl_table table)
>             - update the record for the current table in pg_class to
> decrement the reltriggers column
>  
> And then the restore reverses this process.
>  
> The procedure also disables rules - are both of these steps necessary to
> "disable" constraints, or is it sufficient to just point the triggers
> away?
>  
> I'm not sure I can implement this the same way, because I don't
> necessarily have a primary key to substitute with on all of the target

Jan Wieck | 11 Mar 2005 16:42

Re: RE: slow replication since upgrade to 1.0.5

On 3/10/2005 2:04 PM, Michel.Diotte@... wrote:
> The vacuum has been done. But to make sure, I did it again, and the slave is
> still several minutes behind ...
> This is the log file of the slave if it can help ...
> 
> [...]
> DEBUG2 remoteWorkerThread_1: SYNC 248131 done in 0.087 seconds
> DEBUG2 remoteListenThread_1: queue event 1,248132 SYNC

It is done with processing event 248131 before event 248132 becomes 
visible ... I would call that "perfectly caught up".
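
The same check can be sketched in a few lines, assuming slon log lines shaped exactly like the two quoted above and assuming the first number in "queue event 1,248132" is the origin node that the matching remoteWorkerThread_N handles. The function name is made up for illustration:

```python
import re

# Patterns modelled on the two log lines quoted in this thread.
DONE = re.compile(r"remoteWorkerThread_(\d+): SYNC (\d+) done in")
QUEUED = re.compile(r"remoteListenThread_\d+: queue event (\d+),(\d+) SYNC")


def sync_backlog(lines):
    """Per origin node: newest queued SYNC minus newest completed SYNC."""
    last_done, last_queued = {}, {}
    for line in lines:
        m = DONE.search(line)
        if m:
            node, seq = int(m.group(1)), int(m.group(2))
            last_done[node] = max(seq, last_done.get(node, 0))
            continue
        m = QUEUED.search(line)
        if m:
            node, seq = int(m.group(1)), int(m.group(2))
            last_queued[node] = max(seq, last_queued.get(node, 0))
    # Only report nodes for which both kinds of line were seen.
    return {node: last_queued[node] - last_done[node]
            for node in last_queued if node in last_done}


log = [
    "DEBUG2 remoteWorkerThread_1: SYNC 248131 done in 0.087 seconds",
    "DEBUG2 remoteListenThread_1: queue event 1,248132 SYNC",
]
print(sync_backlog(log))
```

For the excerpt above the difference for node 1 is a single event, the one that has only just been queued.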

What makes you think that it is several minutes behind?

Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@... #
Michel.Diotte | 11 Mar 2005 19:32

RE: slow replication since upgrade to 1.0.5


The slave is behind because when I modify or insert a record in the master
database and then run the same query on both databases, the change appears on
the slave only several minutes later.

======================================================================================================
The following command was run on March 11th on the slave.
The master (136.100.100.247) seems to be waiting for a reply from the slave.

[root@webapp-2 root]# ps -auxww | grep slon

slony    16455  0.0  0.1 61548 1748 ?        S    Mar10   0:00 /usr/local/slony/src/slon/slon -d 4 -s 10000 -g 10 replication dbname=lims
postgres 16460  0.5  0.3 10312 3496 ?        S    Mar10   5:50 postgres: slony lims [local] async_notify waiting
postgres 16468  0.5  0.3 10572 3740 ?        S    Mar10   5:52 postgres: slony lims 136.100.100.247 async_notify waiting
postgres 16471  0.3  0.3 11244 4120 ?        S    Mar10   3:24 postgres: slony lims [local] INSERT waiting
postgres 16473  0.1  0.4 11168 4184 ?        S    Mar10   2:09 postgres: slony lims [local] SELECT waiting
postgres 16475  0.2  0.3 10452 3704 ?        S    Mar10   2:52 postgres: slony lims [local] COMMIT waiting
postgres 16736  0.5  0.3 10572 3748 ?        S    Mar10   5:29 postgres: slony lims 136.100.100.247 async_notify waiting
postgres 16773  0.4  0.3 10572 3788 ?        S    Mar10   5:11 postgres: slony lims 136.100.100.247 async_notify waiting
postgres 16791  0.4  0.3 10572 3784 ?        S    Mar10   4:57 postgres: slony lims 136.100.100.247 async_notify waiting

Jan Wieck | 11 Mar 2005 19:50

Re: RE: slow replication since upgrade to 1.0.5

I see ... your slony installation apparently suffers from pg_listener 
bloat. Please "vacuum full analyze" the pg_listener table on all nodes.

Jan

On 3/11/2005 1:32 PM, Michel.Diotte@... wrote:
> The slave is behind because when I modify or insert a record in the master
> database and then run the same query on both databases, the change appears on
> the slave only several minutes later.
> 
> 
> ======================================================================================================
> The following command was run on March 11th on the slave.
> The master (136.100.100.247) seems to be waiting for a reply from the slave.
> 
> [root@webapp-2 root]# ps -auxww | grep slon
> 
> slony    16455  0.0  0.1 61548 1748 ?        S    Mar10   0:00 /usr/local/slony/src/slon/slon -d 4 -s 10000 -g 10 replication dbname=lims
> postgres 16460  0.5  0.3 10312 3496 ?        S    Mar10   5:50 postgres: slony lims [local] async_notify waiting
> postgres 16468  0.5  0.3 10572 3740 ?        S    Mar10   5:52 postgres: slony lims 136.100.100.247 async_notify waiting
> postgres 16471  0.3  0.3 11244 4120 ?        S    Mar10   3:24 postgres: slony lims [local] INSERT waiting
> postgres 16473  0.1  0.4 11168 4184 ?        S    Mar10   2:09 postgres: slony lims [local] SELECT waiting
> postgres 16475  0.2  0.3 10452 3704 ?        S    Mar10   2:52 postgres: slony lims [local] COMMIT waiting

Vivek Khera | 11 Mar 2005 19:54

Re: Moving Towards Slony-I 1.1


On Mar 11, 2005, at 11:03 AM, Jan Wieck wrote:

> We have recently discovered that MOVE_SET has a race condition. Suppose, 
> for example, that one has nodes 1, 2 and 3, with 1 being the origin and 
> 2 and 3 being subscribers. If one now does a MOVE_SET to transfer the 
> origin to node 2, there is a possibility that node 2 processes that 
> MOVE_SET, opens up for business and generates SYNCs (so far this is 
> what we want).
>

Is there a possibility that a similar race exists when you move a 
node's origin?

When I brought on a new server, I set it up so that origin node 2 fed 
replica node 1, which fed replica node 3:

2 -> 1 -> 3

Then I moved it so that 3 was fed directly from 2:

2 -> 1
2 -> 3

I waited a while to make sure all replicas were "up to date" (i.e., no 
events backlogged according to sl_status).  Then I went to remove node 
1, and it seems that everything stopped because the listen paths were 
reconfigured in seemingly the wrong order (i.e., old listens removed 
before new listens were added).

This is when I had to go into the event table on the origin and remove 

Jan Wieck | 11 Mar 2005 20:36

Re: Moving Towards Slony-I 1.1

On 3/11/2005 1:54 PM, Vivek Khera wrote:

> On Mar 11, 2005, at 11:03 AM, Jan Wieck wrote:
> 
>> We have recently discovered that MOVE_SET has a race condition. Suppose, 
>> for example, that one has nodes 1, 2 and 3, with 1 being the origin and 
>> 2 and 3 being subscribers. If one now does a MOVE_SET to transfer the 
>> origin to node 2, there is a possibility that node 2 processes that 
>> MOVE_SET, opens up for business and generates SYNCs (so far this is 
>> what we want).
>>
> 
> Is there a possibility that a similar race exists when you move a 
> node's origin?

If you mean the node's "data provider", then no, there is no such race 
condition there (at least none I know of).

Jan

> 
> When I brought on a new server, I set it up so that origin node 2 fed 
> replica node 1, which fed replica node 3:
> 
> 2 -> 1 -> 3
> 
> Then I moved it so that 3 was fed directly from 2:
> 
> 2 -> 1
> 2 -> 3
> 

Andreas Pflug | 11 Mar 2005 20:42

Re: Moving Towards Slony-I 1.1

Christopher Browne wrote:
> I have checked in a fair "boatload" of patches supporting this and that; 
> from my perspective, I think it starts making sense to look towards a 
> 1.1 release in the next couple of weeks.

IMHO at least partial win32 support would be nice.
The issues I'm aware of are:
- slony1_funcs works ok if linked using changed makefiles, patch was posted.
- slon needs minimal tweaking because a (portable) function needs to be 
used to compare pthread_t datatypes. It doesn't link, which is a 
makefile problem. Somebody else who's more experienced with makefiles 
should do this.
- slonik would need much more work; pgAdmin can replace it. Using as few 
SQL scripts as possible to initially create a cluster would be helpful 
here (to join an existing cluster, pgAdmin extracts schema and functions 
from an existing node).

Regards,
Andreas
Jan Wieck | 11 Mar 2005 17:03

Re: Moving Towards Slony-I 1.1

On 3/10/2005 11:41 PM, cbbrowne@... wrote:

>> I have checked in a fair "boatload" of patches supporting this and that;
>> from my perspective, I think it starts making sense to look towards a
>> 1.1 release in the next couple of weeks.

You certainly want to add the "ACCEPT_SET" feature to the open items for 
1.1.

We have recently discovered that MOVE_SET has a race condition. Suppose, 
for example, that one has nodes 1, 2 and 3, with 1 being the origin and 2 
and 3 being subscribers. If one now does a MOVE_SET to transfer the origin 
to node 2, there is a possibility that node 2 processes that MOVE_SET, 
opens up for business and generates SYNCs (so far this is what we want).

If node 3 is now behind in replicating from 1, but keeps up well with 
events from node 2, it will confirm SYNC events coming from node 2 
(assuming "I am not subscribed to anything from there, so nothing to 
do") until it actually has caught up with 1 up to the MOVE_SET event.

The cure for this is a new event type ACCEPT_SET that is generated by 
the new origin when it processes the MOVE_SET event. The payload 
information of ACCEPT_SET is the node id and the event id of the 
MOVE_SET event. When processing an ACCEPT_SET event, the worker thread 
will check if the local node has processed that MOVE_SET from the other 
node. If not, the worker thread will error out and retry in 10 seconds.
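
The check-and-retry behaviour described above can be modelled roughly as follows. This is a toy sketch under stated assumptions, not Slony-I code: the Node class, its methods, handle_accept_set and the event id 500 are all invented for illustration, and the real worker thread is written in C inside slon.

```python
import time


class Node:
    """Toy model of a subscriber; not Slony-I's real data structures."""

    def __init__(self):
        # (origin node id, event id) pairs this node has already processed
        self.processed = set()

    def process_move_set(self, origin, event_id):
        self.processed.add((origin, event_id))

    def accept_set_ready(self, move_origin, move_event_id):
        # ACCEPT_SET carries the node id and event id of the MOVE_SET.
        return (move_origin, move_event_id) in self.processed


def handle_accept_set(node, move_origin, move_event_id,
                      retry_delay=10, sleep=time.sleep):
    """Wait until the referenced MOVE_SET has been processed locally.

    Returns the number of retries that were needed.
    """
    retries = 0
    while not node.accept_set_ready(move_origin, move_event_id):
        sleep(retry_delay)  # "error out and retry in 10 seconds"
        retries += 1
    return retries


# Node 3 receives ACCEPT_SET from node 2 before it has processed the
# MOVE_SET (hypothetical event 500) that originated on node 1.
node3 = Node()


def fake_sleep(seconds):
    # Stand-in for the wait: the other worker thread catches up meanwhile.
    node3.process_move_set(origin=1, event_id=500)


print(handle_accept_set(node3, 1, 500, sleep=fake_sleep))  # prints 1
```

The sleep function is passed in as a parameter only so the example finishes immediately; with the default time.sleep the loop would pause for the full retry delay on every pass.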

In the example above, node 3's worker thread 2 will receive the 
ACCEPT_SET, notice that worker thread 1 hasn't processed the MOVE_SET 
yet, suspend event processing for 10 seconds and retry. Since all 

Christopher Browne | 11 Mar 2005 21:10

Re: Moving Towards Slony-I 1.1

Andreas Pflug wrote:

> Christopher Browne wrote:
>
>> I have checked in a fair "boatload" of patches supporting this and 
>> that; from my perspective, I think it starts making sense to look 
>> towards a 1.1 release in the next couple of weeks.
>
>
> IMHO at least partial win32 support would be nice.
> The issues I'm aware of are:
> - slony1_funcs works ok if linked using changed makefiles, patch was 
> posted.
> - slon needs minimal tweaking because a (portable) function needs to 
> be used to compare pthread_t datatypes. It doesn't link, which is a 
> makefile problem. Somebody else who's more experienced with makefiles 
> should do this.
> - slonik would need much more work; pgAdmin can replace it. Using as 
> few SQL scripts as possible to initially create a cluster would be 
> helpful here (to join an existing cluster, pgAdmin extracts schema and 
> functions from an existing node).

I have taken the material that Andreas posted back in February and added 
it to a "Win32" directory; that looks like all that can be offered as 
support at this point.

If someone can provide more meaningful Win32 support, then that would be 
nice.  This would require that a volunteer emerge who is prepared to 
take responsibility for making Slony-I components work on Win32.