Scott Marlowe | 18 Dec 05:35 2014

TRUNCATE ONLY propagates as a plain TRUNCATE. PG 9.2.9/Slony 2.2.1

So as the title states, we're running pg 9.2.9 and slony 2.2.1

If you create a parent / child table pair with INHERITS and then truncate only
the parent table, it works properly on the source but
propagates as a regular truncate on the destinations.

create table parent ( yada );
create table child ( yada ) inherits (parent);
insert rows into parent, and rows into the child table

subscribe parent and child to another node...

On the source, run "truncate only parent" and note that the child table there
still has its own rows. Check the subscriber: the child tables there are
now empty.
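
For completeness, a minimal repro sketch (table and column names here are made up for illustration; the child gets its own primary key since Slony needs one):

create table parent (id int primary key, payload text);
create table child () inherits (parent);
alter table child add primary key (id);

insert into parent values (1, 'parent row');
insert into child  values (2, 'child row');

-- subscribe both tables to another node, then on the origin:
truncate only parent;
-- origin:     parent is empty, child keeps its row
-- subscriber: parent AND child end up empty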

1: Is this a known bug?
2: Is it planned to fix it?

We were bitten by it quite by surprise and lost a lot of data in child
tables. Luckily we noticed before switching over from the master node
to another and resubscribed the child tables.

-- 
To understand recursion, one must first understand recursion.
Carlos Henrique Reimer | 17 Dec 14:48 2014

WARNING: out of shared memory

Hi,

We are trying to upgrade our PG from 8.3 to 9.3 with slony 2.2.3.

The complete database (21,000 tables) is now being synchronized with Slony, and I would like to terminate the replication processing so that another team can test the application with the new 9.3 replicated database. They will need several hours to test the application. Once I get their green light, I will repeat the process and replicate the database from scratch.

I'm trying to unsubscribe a receiver from the master set but I'm getting an out of shared memory message:

cat unsubscribe2.sl
cluster name = slcluster;
node 1 admin conninfo = 'dbname=FiscalWeb host=192.168.23.10 user=slonyo';
node 2 admin conninfo = 'dbname=FiscalWeb host=192.168.23.11 user=slonyn';
unsubscribe set ( id = 1 , receiver = 2);

slonik < unsubscribe2.sl
<stdin>:4: WARNING:  out of shared memory
CONTEXT:  SQL statement "drop trigger "_slcluster_logtrigger" on "8359_wsn"."tbpgdas01502""
PL/pgSQL function altertabledroptriggers(integer) line 47 at EXECUTE statement
SQL statement "SELECT "_slcluster".alterTableDropTriggers(v_tab_row.tab_id)"
PL/pgSQL function unsubscribeset(integer,integer,boolean) line 49 at PERFORM
<stdin>:4: PGRES_FATAL_ERROR lock table "_slcluster".sl_event_lock, "_slcluster".sl_config_lock;select "_slcluster".unsubscribeSet(1, 2,false);  - ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.
CONTEXT:  SQL statement "drop trigger "_slcluster_logtrigger" on "8359_wsn"."tbpgdas01502""
PL/pgSQL function altertabledroptriggers(integer) line 47 at EXECUTE statement
SQL statement "SELECT "_slcluster".alterTableDropTriggers(v_tab_row.tab_id)"
PL/pgSQL function unsubscribeset(integer,integer,boolean) line 49 at PERFORM
[postgres <at> 00002-NfseNet-SGDB reimer]$ ^C

max_locks_per_transaction is set to 255. I changed it to 4096 and will restart the database during the night change window.
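
For reference, my rough lock-table arithmetic (max_connections = 100 is just an assumed example value, substitute your own): PostgreSQL sizes the shared lock table at about max_locks_per_transaction * (max_connections + max_prepared_transactions) entries, and one unsubscribeSet() drops the triggers on all ~21,000 tables in a single transaction, taking an ACCESS EXCLUSIVE lock on each.

SHOW max_locks_per_transaction;   -- 255 at the moment
SHOW max_connections;             -- assume 100 for the arithmetic below
SHOW max_prepared_transactions;   -- assume 0 (the default)

--  255 * (100 + 0) =  25,500 shared lock slots -> can be exhausted by ~21,000
--                                                 table locks plus everything else running
-- 4096 * (100 + 0) = 409,600 shared lock slots -> should be ample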

I tried to repeat the process again and now I always get the same error:
[postgres <at> 00002-NfseNet-SGDB reimer]$ slonik < unsubscribe2.sl
<stdin>:4: PGRES_FATAL_ERROR lock table "_slcluster".sl_event_lock, "_slcluster".sl_config_lock;select "_slcluster".unsubscribeSet(1, 2,false);  - ERROR:  deadlock detected
DETAIL:  Process 15366 waits for AccessExclusiveLock on relation 29564 of database 16384; blocked by process 14994.
Process 14994 waits for RowExclusiveLock on relation 84222 of database 16384; blocked by process 15366.
HINT:  See server log for query details.
CONTEXT:  SQL statement "lock table "7481_spunico"."sincdc" in access exclusive mode"
PL/pgSQL function altertabledroptriggers(integer) line 42 at EXECUTE statement
SQL statement "SELECT "_slcluster".alterTableDropTriggers(v_tab_row.tab_id)"
PL/pgSQL function unsubscribeset(integer,integer,boolean) line 49 at PERFORM
[postgres <at> 00002-NfseNet-SGDB reimer]$

The two pids reported by the deadlock message are probably temporary processes created by Slony, as I did not find them in the system.
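
In case it happens again, the relations named by OID in the DETAIL line can be resolved with a plain catalog query (run in the FiscalWeb database on the node where the deadlock was reported; the OIDs below are copied from the message above):

SELECT c.oid, n.nspname AS schema, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid IN (29564, 84222);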

Will increasing max_locks_per_transaction and restarting the server fix this issue?

Thank you!

--
Reimer
47-3347-1724 47-9183-0547 msn: carlos.reimer-AIy3apnC4nVwFqzsTH5u/w@public.gmane.org
Carlos Henrique Reimer | 16 Dec 11:32 2014

Duplicate key while merging temporary to main set

Hi,

I'm trying to replicate a PostgreSQL 8.3 database with 1TB of data and about 20,000 tables using Slony 2.2.3, but I'm getting the following error in the middle of the process:

I created a main set with id 1, and then small temporary sets were created and merged into the main set.

Currently about 15,000 tables are being replicated in the main set, and while I was adding another small set I got this error during the MERGE command:

2014-12-16 07:08:25 BRST CONFIG remoteWorkerThread_1: Begin COPY of table "8357_isarq"."wm_bco_pla"
NOTICE:  truncate of "8357_isarq"."wm_bco_pla" succeeded
2014-12-16 07:08:25 BRST CONFIG remoteWorkerThread_1: 74168 bytes copied for table "8357_isarq"."wm_bco_pla"
NOTICE:  Slony-I: Logswitch to sl_log_1 initiated
CONTEXT:  SQL statement "SELECT "_slcluster".logswitch_start()"
PL/pgSQL function cleanupevent(interval) line 97 at PERFORM
2014-12-16 07:08:25 BRST INFO   cleanupThread:   89.011 seconds for cleanupEvent()
2014-12-16 07:08:25 BRST ERROR  remoteWorkerThread_1: "select "_slcluster".finishTableAfterCopy(15564); analyze "8357_isarq"."wm_bco_pla"; " PGRES_FATAL_ERROR ERROR:  could not create unique index "pk_wm_bco_pla"
DETAIL:  Key (bco_cod, pla_cod)=(399, 5056110) is duplicated.
CONTEXT:  SQL statement "reindex table "8357_isarq"."wm_bco_pla""
PL/pgSQL function finishtableaftercopy(integer) line 26 at EXECUTE statement
2014-12-16 07:08:25 BRST WARN   remoteWorkerThread_1: data copy for set 999 failed 108 times - sleep 60 seconds

The error indicates "8357_isarq"."wm_bco_pla" has duplicate rows for the primary key (bco_cod, pla_cod)=(399, 5056110), but when I query the table using this PK I get only one row:

select * from "8357_isarq"."wm_bco_pla" where bco_cod=399 and pla_cod = '5056110';
 bco_cod | pla_cod |                pla_des                | ativo | conta_pai | pla_itlistserv | pla_ctacosif
---------+---------+---------------------------------------+-------+-----------+----------------+--------------
     399 | 5056110 | RENDAS TRANSACOES VISA ELETRON - HBBR | S     | 5056004   |           1501 | 71799003
(1 row)
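
One check still worth running on the 8.3 origin is to look for the duplicate with index scans disabled, in case a corrupt index there is hiding a second row (just a sketch, run against the origin database):

SET enable_indexscan  = off;
SET enable_bitmapscan = off;

SELECT bco_cod, pla_cod, count(*)
FROM "8357_isarq"."wm_bco_pla"
GROUP BY bco_cod, pla_cod
HAVING count(*) > 1;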

Slony is trying to merge the temporary set into the main set every 60 seconds and keeps getting this error. I do not want to lose three days of replication processing.

Is there anything I can do to fix this error, or can I at least remove this table from the temporary set and work on the issue afterwards?

Thank you!


Tory M Blue | 16 Dec 07:35 2014

PGRES_FATAL_ERROR ERROR: could not access file "$libdir/slony1_funcs.2.2.3": No such file or directory

Trying a production upgrade after upgrading 3 different environments, and I'm getting this error.

PGRES_FATAL_ERROR ERROR:  could not access file "$libdir/slony1_funcs.2.2.3": No such file or directory

The files are there; is there any way to force this? I even tried exporting a libdir variable, with no success.
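
A hand check that reproduces the failure outside slonik (the LOAD below is the same statement slonik issues; note that $libdir resolves to the pkglibdir of the postgres binary the cluster is actually running, which may not be the install the files were copied into):

SHOW dynamic_library_path;           -- normally '$libdir'
LOAD '$libdir/slony1_funcs.2.2.3';   -- fails the same way if the .so is not
                                     -- in the running server's own lib directory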

Thanks
Tory

Tory M Blue | 11 Dec 20:36 2014

Slonik lock set, verifying


Just want to verify that a slonik LOCK SET during the slon upgrade only causes locking for updates, and that my reads and selects are unaffected?

Just not 100% sure; probably a dumb question.

Thanks
Tory
dustin kempter | 5 Dec 22:36 2014

slony 2.2 with postgres 9.3 and 8.4?

Hi all, we're trying to get Slony 2.2.3 working between 2 nodes, one with 
PostgreSQL 9.3 installed and the other with 8.4. But we keep getting this error:

<stdin>:506: PGRES_FATAL_ERROR load '$libdir/slony1_funcs.2.2.3'; -
ERROR: could not access file "$libdir/slony1_funcs.2.2.3": No such file or directory
<stdin>:506: Error: the extension for the Slony-I C functions cannot be loaded in database 'dbname=tigris port=5432 host=192.168.2.59 user=postgres'

I've added the slon bin dir to my path, but it still seems to have issues.
Any help would be great, thanks!
Waldo Nell | 5 Dec 21:33 2014

Slony-I 2.2.3 fails on execute script

I have a working Slony-I setup with one master and one slave.  I tried to run a simple ALTER TABLE ADD COLUMN
statement via EXECUTE SCRIPT. This has always worked for me; however, today it failed.

I got this error (truncated) on the slave:

2014-12-05 21:51:14 SAST ERROR  remoteWorkerThread_1_1: error at end of COPY IN: ERROR:  schema
"_appcluster" does not exist
CONTEXT:  SQL statement "select _AppCluster.sequenceSetValue($1,$2,NULL,$3,true); "
COPY sl_log_1, line 1: "1       4065328 \N      13346355        \N      \N      S       \N      {"alter table future_details add note_adm text;"..."
2014-12-05 21:51:14 SAST ERROR  remoteWorkerThread_1_1: failed SYNC's log selection query was 'COPY (
select log_origin, log_txid, NULL::integer, log_actionseq, NULL::text, NULL::text, log_cmdtype, NULL::integer, log_cmdargs
from "_AppCluster".sl_log_script
where log_origin = 1 and log_txid >= "pg_catalog".txid_snapshot_xmax('4065328:4065330:4065328')
and log_txid < '4065334' and "pg_catalog".txid_visible_in_snapshot(log_txid, '4065334:4065334:')
union all select log_origin, log_txid, NULL::integer, log_actionseq, NULL::
...

Why am I getting this error that the schema does not exist?  If I insert data into a replicated table on the
master, it appears correctly on the slave.  

On the slave:

oasis=# \dn
    List of schemas
    Name     |  Owner
-------------+----------
 _AppCluster | postgres
 public      | postgres
(2 rows)
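
One thing I notice: the failing statement spells the schema without quotes, and unquoted identifiers are folded to lower case, so the two spellings refer to different schemas. A quick sketch of the difference:

SELECT nspname FROM pg_namespace WHERE nspname = '_appcluster';  -- 0 rows here
SELECT nspname FROM pg_namespace WHERE nspname = '_AppCluster';  -- 1 row, as \dn shows
-- select  _AppCluster.sequenceSetValue(...)  -> looks for schema _appcluster
-- select "_AppCluster".sequenceSetValue(...) -> looks for schema _AppCluster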

Can anyone provide advice?

Thanks
Waldo
David Fetter | 2 Dec 19:18 2014

RDS PostgreSQL?

Folks,

Is there any way to make an Amazon RDS PostgreSQL database a Slony node?

Cheers,
David.
-- 
David Fetter <david@...> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@...

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
上原 一樹 | 28 Nov 01:36 2014

Why does the helper-thread log message remain?

Hi,

Please tell me:
I understand that the helper threads were removed between 2.1.4 and 2.2.0.
Why does this log message remain here?

slony1-2.2.0/src/slon/remote_worker.c

1695         if (provider->set_head == NULL)
1696         {
1697             /*
1698              * Tell this helper thread to exit, join him and destroy thread
1699              * related data.
1700              */
1701             slon_log(SLON_CONFIG, "remoteWorkerThread_%d: "
1702                      "helper thread for provider %d terminated\n",
1703                      node->no_id, provider->no_id);
...

Does this slon_log call only inform the user that the provider is empty?

regards,
uehara

-- 
上原 一樹
NTT OSS Center, DBMS group
Mail : uehara.kazuki@...
Phone: 03-5860-5115
Granthana Biswas | 30 Oct 09:59 2014

Repeated Slony-I: cleanup stale sl_nodelock entry for pid on subscribing new set from node 3

Hi All,

My replication setup is db1 -> db2 -> db3.

On adding a new set to the cluster, the merge goes into a waiting state, waiting for node 3 to subscribe. Because of this, node 3 is lagging behind.

These are the slonik commands that I used to add new set, subscribe and merge:
---------------------------------------------------------------------------------------------------------
create set ( id = 2, origin = 1, comment = 'replication set for surcharge table');

set add table (set id = 2, origin = 1, id = 1744, fully qualified name = 'public.t2', comment = 't2 table');
set add sequence (set id = 2, origin = 1, id = 1756, fully qualified name = 'public.s2', comment = 's2 sequence');

subscribe set ( id = 2, provider = 1, receiver = 2, forward = yes );
subscribe set ( id = 2, provider = 2, receiver = 3, forward = yes );

merge set ( id = 1, add id = 2, origin = 1);
-----------------------------------------------------------------------------------------------------------

Even though it goes into waiting mode, the sl_subscribe table shows the following:

 sub_set | sub_provider | sub_receiver | sub_forward | sub_active 
---------+--------------+--------------+-------------+------------
       1 |            1 |            2 | t           | t
       1 |            2 |            3 | t           | t
       2 |            1 |            2 | t           | t
       2 |            2 |            3 | t           | t


But the slony log on node 3 shows the following repeatedly:

NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=29117
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=30115
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=30116
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=30414
NOTICE:  Slony-I: Logswitch to sl_log_2 initiated
CONTEXT:  SQL statement "SELECT  "_cluster".logswitch_start()"
PL/pgSQL function "cleanupevent" line 96 at PERFORM
NOTICE:  truncate of <NULL> failed - doing delete
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=31364
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=31369
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=31368
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=32300
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=1117
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=1149
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=1186
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=1247
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=1270
NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=1294


It is continuously trying to clean up the stale nodelock entry for nl_nodeid=3 and nl_conncnt=0.
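
The rows being cleaned up can be inspected directly on node 3 (a sketch; "_cluster" is the schema name shown in the log above):

SELECT * FROM "_cluster".sl_nodelock WHERE nl_nodeid = 3;
-- the pid column of any row returned should match the pids in the NOTICEs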

I tried stopping and starting the slon process for node 3, which didn't help. I don't see any errors in the other slony log files.

Do I have to stop all slon processes of all nodes and start again?

Regards,
Granthana
Tory M Blue | 29 Oct 13:50 2014

finishTableAfterCopy(143); long running "index?"

So I've been watching paint dry since before midnight, and it's now 5:44am PST. I have no idea whether slon is actually doing anything at this point. My slave node (I did an add) is quite large; the table it's working on was 52GB when it completed the transfer and is now 77GB, but has been that size for over 2 hours (on-disk du).

Just not sure if it's stuck or actually doing anything. I see this in my stats table, and I have a single CPU pegged at 99-100% that has been running for quite a while:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                                         

 54600 postgres  20   0 4617m 4.4g 2.0g R 99.7  1.7 399:57.68 postmaster                                                                                                                                                                       



16398 | clsdb | 54600 |       10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 |                 |       54260 | 
2014-10-28 22:19:29.277022-07 | 2014-10-29 00:05:40.884649-07 | 2014-10-29 01:17:22.102619-07 | 2014-10-29 01:17:22.10262-07  | f   
    | active | select "_cls".finishTableAfterCopy(143); analyze "torque"."iimpressions"; 

I now have 5 million rows backed up on the master. I'm watching to see if anything negative happens where I have to drop this node, just so the master can truncate all that data and then start again. I'd rather not do that if I come to believe this is still working and may finish in the next couple of hours.

I've set maintenance_work_mem to 10GB to try to give this some room, but no change. This is a single process, but it's an index creation; one would think it could use the 256GB of RAM to its advantage.

Anyway, is there somewhere I can look in slon/postgres to see if it's doing anything more than sticking a CPU at 100% and making me crave sleep?
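
For what it's worth, one thing I can already check is whether that backend is blocked on a lock rather than working (a sketch; 54600 is the pid from the top output above):

SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE pid = 54600
  AND NOT granted;
-- no rows: not blocked; the REINDEX/ANALYZE behind finishTableAfterCopy is still grinding
-- rows:    it is stuck waiting behind some other lock holder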

Thanks
Tory
