Rob Brucks | 26 Feb 21:51 2016

Replication Lag?

I have a fairly simple test cluster, three PG instances running on different ports:  a master and two slave
DBs both subscribed to the master.  Everything is running on the same server (it's a playground), so I have
three PG instances (ports 5432, 5433, 5434; connecting via sockets in /tmp) and a slony daemon for each instance.

I'm running Postgres 9.3.9 and slony1 2.2.4 on Centos 6.7 x86_64.  Both postgres and slony were installed
via yum using the PGDG repo.

I used the config below to initialize a very simple replication setup of only one table and one sequence.

My problem is that if I shut down just one slave postgres instance then sl_status on the master instance
shows replication stalling on both slave DBs, instead of just one.

But, if I insert some test data into the master DB, I see the data show up on the remaining active slave.  So
replication to the remaining slave DB is obviously working.

We use sl_status to monitor replication, so we need it to accurately report lag if there's an issue. The Slony
1.2 version we used before did not behave this way; it accurately reported which slave was not replicating.

Why does sl_status report lag on the active slave even though replication appears to be working fine?

Do I have a misconfiguration somewhere?
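
In case it's relevant, this is roughly what our monitoring runs against the master (a sketch only; the column names are from the 2.2 sl_status view and the schema name is taken from the cluster name in the config below):

-- Per-subscriber lag as the origin sees it ("_slony" matches CLUSTER NAME = slony).
select st_origin, st_received, st_lag_num_events, st_lag_time
  from _slony.sl_status
 order by st_received;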

Thanks,
Rob

Here's my slony config:

      CLUSTER NAME = slony;
      NODE 1 ADMIN CONNINFO = 'dbname=test_db host=/tmp port=5432 user=slony';
      NODE 2 ADMIN CONNINFO = 'dbname=test_db host=/tmp port=5433 user=slony';
(Continue reading)

Asad Shah | 9 Feb 21:46 2016

Is slony.info down?

I cannot get to the slony site right now?

Is it down?

Regards,
Asad



Pending synch size?

Is there some way to actually see how much data is still pending to be synced between nodes?
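
The only rough proxy I know of is counting the rows still queued in the log tables on the provider; this is just a sketch, and "_slony" below is a placeholder for whatever your cluster schema is actually called:

-- Rows not yet cleaned up on the origin; they only disappear once every
-- subscriber has confirmed past them, so this overstates any single node's backlog.
select 'sl_log_1' as log_table, count(*) as pending_rows from _slony.sl_log_1
union all
select 'sl_log_2', count(*) from _slony.sl_log_2;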

Atentamente / Regards,

Gonzalo Vásquez Sáez
+56 (2) 2963 4180
Director I+D / R&D Director
Waypoint Telecomunicaciones S.A.

Alfredo Barros Errázuriz 1953 Of. 1004
7500550
Providencia, Santiago, Chile
Mapcode: R3.BR
Sung Hsin Lei | 4 Feb 05:37 2016

Cannot fully drop slony node

Hey guys,

I have a cluster with 3 nodes. On the main db, I run the following script:


cluster name = slony_cluster;

node 1 admin conninfo = 'dbname = dbNAME host = localhost user = slonyuser password = slonPASS port = 5432';
node 3 admin conninfo = 'dbname = dbNAME host = 172.16.10.4 user = slonyuser password = slonPASS port = 5432';

DROP NODE ( ID = 3, EVENT NODE = 1 );



I open pgadmin on the main db and I don't see node 3 anymore. However, when I open pgadmin on the replicated db, I still see node 3. The replicated db is the one associated with node 3. I run the above script again on the replicated db but get the following error:


C:\Program Files\PostgreSQL\9.3\bin>slonik drop.txt
debug: waiting for 3,5000000004 on 1
drop.txt:4: PGRES_FATAL_ERROR lock table "_slony_securithor2".sl_event_lock, "_s
lony_cluster".sl_config_lock;select "_slony_securithor2".dropNode(ARRAY[3]);
  - ERROR:  Slony-I: DROP_NODE cannot initiate on the dropped node


Now I need to set up another node which must have id=3. I run a script on the main db (the one where pgadmin does not show node 3). The following is the script that I used to set up the node, along with the error that I get:


cluster name = slony_cluster;

node 1 admin conninfo = 'dbname = dbNAME host = localhost user = slonyuser password = slonPASS port = 5432';
node 3 admin conninfo = 'dbname = dbNAME host = 172.16.10.4 user = slonyuser password = slonPASS port = 5432';

store node (id=3, comment = 'Slave node 3', event node=1);
store path (server = 1, client = 3, conninfo='dbname=dbNAME host=172.16.10.3 user=slonyuser password = slonPASS port = 5432');
store path (server = 3, client = 1, conninfo='dbname=dbNAME host=172.16.10.4 user=slonyuser password = slonPASS port = 5432');

subscribe set ( id = 1, provider = 1, receiver = 3, forward = no);





C:\Program Files\PostgreSQL\9.3\bin>slonik create.txt
drop.txt:6: Error: namespace "_slony_cluster" already exists in database of
node 3



Is there another way to drop nodes? Can I recover from this without dropping the cluster and restarting from scratch?
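
The only manual workaround I can think of (and I'm not sure it is the sanctioned one) is to clear the leftover schema directly on the node 3 database and then re-run the STORE NODE script, along the lines of:

-- Run on the node 3 database ONLY, with any slon daemon for that node stopped.
-- This removes the orphaned Slony schema named in the error above so that
-- STORE NODE can recreate it from scratch.
DROP SCHEMA "_slony_cluster" CASCADE;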


Thanks.
Sung Hsin Lei | 29 Jan 04:57 2016

Replication inexplicably stops

Hello guys,

So I have this setup that has already stopped on me 3 times in the last 6 months. Each time it would replicate properly for 2-3 months and then it would just stop. It has currently been stopped since January 11, 2016. The only way I can get replication back is to set everything up from scratch. I'm wondering if anyone has an idea on the issue causing the stoppage. I'm running 64-bit slony 2.2.4.

Currently, when I run slon on the replicated machine, I get the following:



C:\Program Files\PostgreSQL\9.3\bin>slon slony_Securithor2 "dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234"
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: slon version 2.2.4 starting up
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option vac_frequency = 3
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option log_level = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option sync_interval = 2000
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option sync_interval_timeout = 10000
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option sync_group_maxsize = 20
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option quit_sync_provider = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option remote_listen_timeout = 300
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option monitor_interval = 500
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option explain_interval = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option tcp_keepalive_idle = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option tcp_keepalive_interval = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option tcp_keepalive_count = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Integer option apply_cache_size = 100
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Boolean option log_pid = 0
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Boolean option log_timestamp = 1
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Boolean option tcp_keepalive = 1
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Boolean option monitor_threads = 1
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: Real option real_placeholder = 0.000000
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option cluster_name = slony_Securithor2
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option conn_info = dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option pid_file = [NULL]
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option log_timestamp_format = %Y-%m-%d %H:%M:%S %Z
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option archive_dir = [NULL]
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option sql_on_connection = [NULL]
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option lag_interval = [NULL]
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option command_on_logarchive = [NULL]
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: String option cleanup_interval = 10 minutes
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: local node id = 2
2016-01-28 17:41:00 Amér. du Sud occid. INFO   main: main process started
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: launching sched_start_mainloop
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: loading current cluster configuration
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG storeNode: no_id=1 no_comment='Master Node'
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG storePath: pa_server=1 pa_client=2 pa_conninfo="dbname=Securithor2 host=192.168.1.50 user=slonyuser password = securiTHOR971  port = 6234" pa_connretry=10
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG storeListen: li_origin=1 li_receiver=2 li_provider=1
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG storeSet: set_id=1 set_origin=1 set_comment='All tables and sequences'
2016-01-28 17:41:00 Amér. du Sud occid. WARN   remoteWorker_wakeup: node 1 - no worker thread
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG storeSubscribe: sub_set=1 sub_provider=1 sub_forward='f'
2016-01-28 17:41:00 Amér. du Sud occid. WARN   remoteWorker_wakeup: node 1 - no worker thread
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG enableSubscription: sub_set=1
2016-01-28 17:41:00 Amér. du Sud occid. WARN   remoteWorker_wakeup: node 1 - no worker thread
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: last local event sequence = 5000462590
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG main: configuration complete - starting threads
2016-01-28 17:41:00 Amér. du Sud occid. INFO   localListenThread: thread starts
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234" is 90310
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=5188
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG enableNode: no_id=1
2016-01-28 17:41:00 Amér. du Sud occid. INFO   remoteWorkerThread_1: thread starts
2016-01-28 17:41:00 Amér. du Sud occid. INFO   remoteListenThread_1: thread starts
2016-01-28 17:41:00 Amér. du Sud occid. INFO   main: running scheduler mainloop
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG cleanupThread: thread starts
2016-01-28 17:41:00 Amér. du Sud occid. INFO   syncThread: thread starts
2016-01-28 17:41:00 Amér. du Sud occid. INFO   monitorThread: thread starts
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234" is 90310
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG remoteWorkerThread_1: update provider configuration
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG remoteWorkerThread_1: added active set 1 to provider 1
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname=Securithor2 host=192.168.1.50 user=slonyuser password = securiTHOR971  port = 6234" is 90306
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234" is 90310
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG cleanupThread: bias = 60
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234" is 90310
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname = Securithor2  user = slonyuser password = securiTHOR971 port = 6234" is 90310
2016-01-28 17:41:00 Amér. du Sud occid. CONFIG version for "dbname=Securithor2 host=192.168.1.50 user=slonyuser password = securiTHOR971  port = 6234" is 90306
2016-01-28 17:41:00 Amér. du Sud occid. INFO   remoteWorkerThread_1: syncing set 1 with 59 table(s) from provider 1




It gets stuck at "syncing set 1 with 59 table(s) from provider 1" (the last line) forever, with the occasional message that says something about cleaning (the cleanup thread, I think).


Checking the postgres logs, I see lots of:

2016-01-28 17:33:07 AST LOG:  n'a pas pu recevoir les données du client : unrecognized winsock error 10061

Which translates to: 

2016-01-28 17:33:07 AST LOG:  was not able to receive the data from the client : unrecognized winsock error 10061

I'm able to connect to the main db from the replicated machine with no problem. I have no idea what is causing this error 10061.
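
In case it helps, a sanity check I can run on the subscriber (assuming the cluster schema here is _slony_securithor2; adjust to whatever \dn shows) is whether new events from the master are still arriving at all. This is only a sketch:

-- On the subscriber: the newest event received from the master (origin 1).
-- If this number stops advancing, events are not arriving at all.
select ev_origin, max(ev_seqno) as last_event_seen, max(ev_timestamp) as last_event_time
  from _slony_securithor2.sl_event
 where ev_origin = 1
 group by ev_origin;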

Any ideas?

Appreciate the help.



Tignor, Tom | 28 Jan 14:30 2016

Slony-I: log switch to sl_log_2 still in progress - sl_log_1 not truncated


Hello slony folks,
From my reading I’m guessing (hoping) this isn’t a new problem. I have a simple cluster with one provider replicating to three subscribers. The provider’s changelog tables (sl_log_[1|2]) are fine, but the subscribers (with forwarding enabled) are all showing runaway growth. Looked through the FAQ and I don’t see the node I dropped or any idle transactions as viable culprits. Are there other thoughts on the cause? Can I safely manually delete/truncate some/all of the changelog tables? These replicas are all leaf nodes. I only have forwarding turned on to allow for failover, and my replication rate is the 2 sec default.
Thanks in advance for any insights.
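
One more thing I looked at, since (as I understand it) the cleanup thread only truncates sl_log up to the oldest confirmation it can see across all receivers. This is just the rough query I used, with the schema name as in the output below:

-- How far each receiver has confirmed each origin; a pair that is far behind
-- here would explain sl_log_2 never being truncated.
select con_origin, con_received, max(con_seqno) as last_confirmed
  from _ams_cluster.sl_confirm
 group by con_origin, con_received
 order by con_origin, con_received;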

ams=# select pg_size_pretty(pg_total_relation_size('_ams_cluster.sl_log_1'));
 pg_size_pretty
----------------
 75 MB
(1 row)

ams=# select pg_size_pretty(pg_total_relation_size('_ams_cluster.sl_log_2'));
 pg_size_pretty
----------------
 34 GB
(1 row)

ams=# select * from _ams_cluster.sl_confirm where con_origin not in (select no_id from _ams_cluster.sl_node) or con_received not in (select no_id from _ams_cluster.sl_node);
 con_origin | con_received | con_seqno | con_timestamp
------------+--------------+-----------+---------------
(0 rows)

ams=# select * from pg_stat_activity where current_query like '%IDLE%';
 datid | datname | procpid | usesysid |  usename   |     application_name      |  client_addr   | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          | waiting |                          current_query
-------+---------+---------+----------+------------+---------------------------+----------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+---------+-------------------------------------------------------------------
 16393 | ams     |    2611 |   212995 | ams_viewer |                           | 88.221.209.10  |                 |       43328 | 2016-01-28 12:24:49.706389+00 |                               | 2016-01-28 13:18:02.427848+00 | f       | <IDLE>
 16393 | ams     |   12894 |   212995 | ams_viewer |                           | 88.221.209.10  |                 |       60112 | 2016-01-28 12:47:26.230681+00 |                               | 2016-01-28 13:15:27.744242+00 | f       | <IDLE>
 16393 | ams     |   12884 |   212995 | ams_viewer |                           | 88.221.209.10  |                 |       44302 | 2016-01-28 12:47:25.100006+00 |                               | 2016-01-28 13:15:27.936059+00 | f       | <IDLE>
 16393 | ams     |   23466 |   213867 | ams_slony  | psql                      |                |                 |          -1 | 2016-01-28 13:11:32.030343+00 | 2016-01-28 13:18:37.283992+00 | 2016-01-28 13:18:37.283992+00 | f       | select * from pg_stat_activity where current_query like '%IDLE%';
 16393 | ams     |    6719 |   213867 | ams_slony  | slon.origin_2_provider_2  | 60.254.150.133 |                 |       61806 | 2016-01-22 01:59:14.800129+00 |                               | 2016-01-28 13:18:25.935111+00 | f       | <IDLE>
 16393 | ams     |    6718 |   213867 | ams_slony  | slon.origin_3_provider_2  | 60.254.150.133 |                 |       61805 | 2016-01-22 01:59:14.797655+00 |                               | 2016-01-28 13:18:34.304475+00 | f       | <IDLE>
 16393 | ams     |    5505 |   213867 | ams_slony  | slon.origin_4_provider_2  | 80.67.75.105   |                 |       36477 | 2016-01-22 01:56:25.637046+00 |                               | 2016-01-28 13:18:36.1348+00   | f       | <IDLE>
 16393 | ams     |    5504 |   213867 | ams_slony  | slon.origin_3_provider_2  | 72.246.50.22   |                 |       51813 | 2016-01-22 01:56:25.240798+00 |                               | 2016-01-28 13:18:28.961629+00 | f       | <IDLE>
 16393 | ams     |    5487 |   213867 | ams_slony  | slon.origin_4_provider_2  | 72.246.50.22   |                 |       51803 | 2016-01-22 01:56:22.896388+00 |                               | 2016-01-28 13:18:35.858913+00 | f       | <IDLE>
 16393 | ams     |    5047 |   213867 | ams_slony  | slon.origin_2_provider_2  | 72.246.50.22   |                 |       51564 | 2016-01-22 01:55:23.600296+00 |                               | 2016-01-28 13:18:34.487192+00 | f       | <IDLE>
 16393 | ams     |    5041 |   213867 | ams_slony  | slon.origin_2_provider_2  | 80.67.75.105   |                 |       36402 | 2016-01-22 01:55:22.964462+00 |                               | 2016-01-28 13:18:34.519066+00 | f       | <IDLE>
 16393 | ams     |    6694 |   213867 | ams_slony  | slon.node_2_listen        | 60.254.150.133 |                 |       61795 | 2016-01-22 01:59:12.095052+00 |                               | 2016-01-28 13:18:27.928384+00 | f       | <IDLE>
 16393 | ams     |    4456 |   213867 | ams_slony  | slon.node_2_listen        | 72.246.50.22   |                 |       51238 | 2016-01-22 01:54:21.481355+00 |                               | 2016-01-28 13:18:36.766973+00 | f       | <IDLE>
 16393 | ams     |    4457 |   213867 | ams_slony  | slon.node_2_listen        | 80.67.75.105   |                 |       36333 | 2016-01-22 01:54:21.500456+00 |                               | 2016-01-28 13:18:36.204482+00 | f       | <IDLE>
 16393 | ams     |    4428 |   213867 | ams_slony  | slon.local_monitor        |                |                 |          -1 | 2016-01-22 01:54:18.977015+00 |                               | 2016-01-28 13:18:36.652567+00 | f       | <IDLE>
 16393 | ams     |    4427 |   213867 | ams_slony  | slon.local_sync           |                |                 |          -1 | 2016-01-22 01:54:18.976932+00 |                               | 2016-01-28 13:18:36.151998+00 | f       | <IDLE>
 16393 | ams     |    4426 |   213867 | ams_slony  | slon.local_cleanup        |                |                 |          -1 | 2016-01-22 01:54:18.976842+00 |                               | 2016-01-28 13:12:12.582921+00 | f       | <IDLE>
 16393 | ams     |    4425 |   213867 | ams_slony  | slon.remoteWorkerThread_4 |                |                 |          -1 | 2016-01-22 01:54:18.976783+00 |                               | 2016-01-28 13:18:33.99715+00  | f       | <IDLE>
 16393 | ams     |    4420 |   213867 | ams_slony  | slon.remoteWorkerThread_1 |                |                 |          -1 | 2016-01-22 01:54:18.976548+00 |                               | 2016-01-28 13:18:33.561531+00 | f       | <IDLE>
 16393 | ams     |    4419 |   213867 | ams_slony  | slon.remoteWorkerThread_3 |                |                 |          -1 | 2016-01-22 01:54:18.97647+00  |                               | 2016-01-28 13:18:34.808907+00 | f       | <IDLE>
 16393 | ams     |    4413 |   213867 | ams_slony  | slon.local_listen         |                |                 |          -1 | 2016-01-22 01:54:18.965568+00 |                               | 2016-01-28 13:18:37.096159+00 | f       | <IDLE>
(21 rows)

ams=#



Tom    :-)


Frank McGeough | 21 Jan 16:42 2016

duplicate key value violates unique constraint?

I'm having intermittent issues that result in the message "duplicate key value violates unique
constraint" in the slony logs. Here is my configuration:

1 master
2 slaves
postgresql 9.3.5 on all servers
slony 2.2.2

All the servers are in the same network and running the same Linux OS with the same file system. The issues
just seem to happen randomly. They happen primarily on one table, but have occurred on two different tables
up to this point. The tables are logically related in the database: one is populated by a trigger on the other,
and the table carrying that trigger (the one that populates the other) is the one with most of the issues.
When the error occurs it may occur on one slave or the other; once it occurred on both.

These incidents have happened 6 times in the past week. Prior to this issue we’ve been running the same
setup for many months without incident.

What I do to correct this is to actually remove the entry that is causing the duplicate key issue from the
sl_log table. Replication picks back up again. When I compare the rows on all servers I see that they are the
same. In all cases when the servers are back in sync the row is actually gone from all servers. 

There is no obvious corruption issue on the slave boxes. I'm not sure what else to check at this point. Are
there suggestions from the slony developers on how to investigate what is actually causing these
issues? Any help is appreciated.

sample error :

2016-01-21 01:19:01 UTC ERROR  remoteWorkerThread_1_1: error at end of COPY IN: ERROR:  duplicate key value violates unique constraint "device_properties_pkey"
DETAIL:  Key (device_guid, property_guid, data_index)=(26464008, 39, 0) already exists.
CONTEXT:  SQL statement "INSERT INTO "device"."device_properties" ("device_guid", "property_guid", "data_index", "property_value", "tran_id", "dt_last_updated", "last_updated_by_userid", "version_id") VALUES ($1, $2, $3, $4, $5, $6, $7, $8);"
COPY sl_log_1, line 36: "1      6117236903      298     7797340846      device  device_properties       I       0       {device_guid,26464008,property_guid,39,data…"
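
For reference, the way I have been clearing these by hand is roughly the following; "_replication" is a placeholder for our actual cluster schema, and the key values are the ones from the COPY line above:

-- Run on the origin; removes just the one queued row that keeps failing to apply.
-- log_origin and log_actionseq are the first and fourth fields of the COPY line.
DELETE FROM _replication.sl_log_1
 WHERE log_origin    = 1
   AND log_actionseq = 7797340846;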

thanks, 
Frank

Tory M Blue | 14 Jan 02:23 2016

Slon blocked by an index on a non replicated table?? 2.2.3

Postgres 9.4.5 slon 2.2.3 CentOS 6.6

Afternoon

Based on some searching this is not new, but I have failed to find an answer for it. I had an index build running on a non-replicated table, but slon would not initialize due to that index; it said something about set 1 having failed due to a previous PID.

Now I'm trying to reindex 2 tables that are not in the slon relationship, but slon is backing up; once I stop the reindex, slon replicates just fine (immediately). Something is wonky (it's a technical term, you can look it up! :) )

So after killing the index I was able to add my host, but now I'm trying to add another 2 indexes and it's backing slon up, and/or one index is holding up the other.

So now I'm trying to create 2 indexes concurrently...

These are the 2 I am creating.

One is waiting on the other. (Waiting is true in pg_stat_activity.)

If I kill the 1st the second starts. Nothing is querying this table.


CREATE INDEX CONCURRENTLY idx_impsarchive_psttstamp
  ON torque.impressions_archive
  USING btree
  (timezone('US/Pacific'::text, timezone('UTC'::text, log_tstamp)))
TABLESPACE torquespace;

-- Index: torque.idx_impsarchive_system
-- DROP INDEX torque.idx_impsarchive_system;

CREATE INDEX CONCURRENTLY idx_impsarchive_system
  ON torque.impressions_archive
  USING btree
  (system COLLATE pg_catalog."default")
TABLESPACE torquespace;
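
For what it's worth, this is roughly how I have been checking who holds or wants locks on that table while the builds sit there (9.4, so no pg_blocking_pids() yet; only a sketch):

-- All backends holding or waiting for locks on the table being indexed.
SELECT a.pid, a.waiting, l.mode, l.granted, a.query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.pid = l.pid
 WHERE l.relation = 'torque.impressions_archive'::regclass
 ORDER BY l.granted DESC, a.pid;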


Also, when I had them running non-concurrently they both ran at the same time, but slony got backed up. It didn't explicitly say that it was waiting, but after about 45 minutes data had not replicated. I killed the indices and it replicated immediately. This table is not being replicated.


I also queried sl_table on the master and this table is not there


 tab_id | tab_reloid |     tab_relname     | tab_nspname | tab_set |       tab_idxname        | tab_altered |        tab_comment
--------+------------+---------------------+-------------+---------+--------------------------+-------------+---------------------------
    225 |      17959 | variables           | torque      |       2 | pk_variables_name        | f           | replicatedtable
    226 |      17935 | impsbulkimporteven  | torque      |       2 | pk_impsbulkimporteven_id | f           | torque.impsbulkimporteven
    228 |      17947 | impsbulkimportodd   | torque      |       2 | pk_impsbulkimportodd_id  | f           | torque.impsbulkimportodd
    233 |      17919 | impressions_daily   | torque      |       2 | pk_impsdaily_id          | f           | torque.impressions_daily
    235 |      17907 | impressions         | torque      |       2 | pk_impressions_id        | f           | torque.impressions
(5 rows)


So I want to create these indexes; they are not being replicated, but they completely freeze replication. The same thing happened when I added a node: slon would not initialize until we killed the same index creation on this same non-replicated table. I'm not sure what is going on, and my apologies that I deleted the "set 1 failed due to a previous pid" message, but if this table is not in replication, why is it causing slon to be blocked?

Thanks
Tory

David Fetter | 13 Jan 19:43 2016

Make all slon options actually settable from the command line?

Folks,

Is there a reason other than lack of tuits why $Subject is not already
the case?  If not, I'd like to prepare a patch for the next minor
release to address it.

Cheers,
David.
-- 
David Fetter <david@...> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@...

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Krzysztof Jakowczyk | 12 Jan 17:21 2016

log insert/update query executed on subscriber

Hello,

Is it possible to log the query, with parameters, that is executed on a subscriber node?
I've tried to add an __audit trigger before insert or update, but when the
denyaccess trigger is enabled, nothing is logged. Below is my attempt for
table emp:

CREATE SCHEMA audit;

CREATE TABLE audit.slon_audit (
    operation         char(1)   NOT NULL,
    stamp             timestamp NOT NULL,
    userid            text      NOT NULL,
    query             text      NOT NULL
);

CREATE OR REPLACE FUNCTION save_query() RETURNS TRIGGER AS $$
    BEGIN
        IF (TG_OP = 'DELETE') THEN
            -- Log the deleting query; OLD.* would not fit the 4-column audit table.
            INSERT INTO audit.slon_audit SELECT 'D', now(), user, current_query();
            RETURN OLD;
        ELSIF (TG_OP = 'UPDATE') THEN
            INSERT INTO audit.slon_audit SELECT 'U', now(), user, current_query();
            RETURN NEW;
        ELSIF (TG_OP = 'INSERT') THEN
            INSERT INTO audit.slon_audit SELECT 'I', now(), user, current_query();
            RETURN NEW;
        END IF;
        RETURN NULL;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER __audit
BEFORE INSERT OR UPDATE OR DELETE ON emp
    FOR EACH ROW EXECUTE PROCEDURE save_query();
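
One thing I still want to try, if I understand the 2.x trigger handling correctly: slon applies changes with session_replication_role set to 'replica', so an ordinary trigger never fires on the subscriber; marking it ALWAYS should change that (untested on my side):

-- On the subscriber: make the audit trigger fire even while slon is applying
-- changes in replica mode.
ALTER TABLE emp ENABLE ALWAYS TRIGGER __audit;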

-- 
Regards,

Krzysztof Jakowczyk
Unix Systems Administrator

Grupa Unity | ul. Przedmiejska 6-10, 54-201 Wrocław
ul. Conrada 55B, 31-357 Kraków | ul. Złota 59, 00-120 Warszawa

Tory M Blue | 7 Jan 23:26 2016

Slony-I: log switch to sl_log_2 still in progress - sl_log_1 not truncated


So I'm backing up in a big way. I know what started it: adding a new insert slave, which took 13 hours to complete (indexes etc.). But now it doesn't appear I am able to catch up. I see the slave doing what it's supposed to: get a bunch of data, truncate the sl_log files, move on. But the master is having a hard time.

Postgres 9.4.5 and Slony 2.2.3

All other nodes don't have any errors or issues.

This is node 1 (the master).
Node 2 is a slave.
Nodes 3-5 are query slaves, with only 1 of the 3 sets being replicated to them.

I have interval at 5 minutes and sync_group_maxsize=50

Any suggestions on where to thump it? At some point this will cause issues on my master, and when I see that starting I'll have to drop node 2 again; when I add it back it will take 13+ hours and I'll be back in the same position :)
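
For what it's worth, the old transaction the monitor flags below looks like a plain autovacuum; this is roughly how I have been spotting the oldest open transactions (9.4):

-- Oldest open transactions; long-running transactions are one of the things
-- that can hold up the sl_log switch/truncate.
SELECT pid, usename, xact_start, now() - xact_start AS xact_age, query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
 ORDER BY xact_start
 LIMIT 5;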

Thanks
Tory



Node:  Old Transactions Kept Open
================================================
Old Transaction still running with age 01:48:00 > 01:30:00

Query: autovacuum: VACUUM


Node: 0 threads seem stuck
================================================
Slony-I components have not reported into sl_components in interval 00:05:00

Perhaps slon is not running properly?

Query:
     select co_actor, co_pid, co_node, co_connection_pid, co_activity, co_starttime, now() - co_starttime, co_event, co_eventtype
     from "_admissioncls".sl_components
     where  (now() - co_starttime) > '00:05:00'::interval
     order by co_starttime;
  


Node: 1 sl_log_1 tuples = 219700 > 200000
================================================
Number of tuples in Slony-I table sl_log_1 is 219700 which
exceeds 200000.

You may wish to investigate whether or not a node is down, or perhaps
if sl_confirm entries have not been propagating properly.


Node: 1 sl_log_2 tuples = 1.74558e+07 > 200000
================================================
Number of tuples in Slony-I table sl_log_2 is 1.74558e+07 which
exceeds 200000.

You may wish to investigate whether or not a node is down, or perhaps
if sl_confirm entries have not been propagating properly.


Node: 2 sl_log_2 tuples = 440152 > 200000
================================================
Number of tuples in Slony-I table sl_log_2 is 440152 which
exceeds 200000.

You may wish to investigate whether or not a node is down, or perhaps
if sl_confirm entries have not been propagating properly.
