Jeff Davis | 2 Jul 2007 01:05

Re: Soliciting ideas for v2.0

On Fri, 2007-06-29 at 23:57 -0700, Andrew Hammond wrote:
> A really interesting win would be in detecting cases where you can go from
> 
> WHERE id IN ( a list )
> 
> to
> 
> WHERE a < id AND id < b
> 
> However I think this is only possible at the time the transaction
> happens (how else will you know if your sequence is contigious. And
> that suggests to me that it's not reasonable to do at this time.
> 

If we move the data from the provider to a temp table on the receiver,
we could also use an IN query rather than a range. I don't know when
this would be a win, but it seems like it would be useful in some cases.

A range is much nicer, but like you say, it's harder to detect in a
deterministic way.

> Also, ISTM that the big reason we don't like statement based
> replication is that SQL has many non-deterministic aspects. However,
> there is probably a pretty darn big subset of SQL which is provably
> non-deterministic. And for that subset, would it be any less rigorous
> to transmit those statements than to transmit the per-row change
> statments like we currently do?
> 

The pgpool guys have done a lot of research on statement replication
(Continue reading)

David Fetter | 2 Jul 2007 17:38
Gravatar

Re: Soliciting ideas for v2.0

On Thu, Jun 28, 2007 at 06:17:25PM -0400, Christopher Browne wrote:
> The v2.0 branch has already taken on the following fairly significant
> changes (listed below).  Jan and I had a chat this afternoon about
> some items still to do:

[snip]

> - Clone Node - to allow using pg_dump/PITR to populate a new
>   subscriber node.
> 
> Are there more items we should try to add?

It would be really great to be able to promote a PITR node to a
subscriber, or better still, to be able to use pg_dump + some kind of
ancillary information to do same.  This would help a lot with the
initial sync problem that makes replicating large databases so
painful.

Cheers,
D
--

-- 
David Fetter <david@...> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Jan Wieck | 2 Jul 2007 19:32
Picon
Favicon

Re: Soliciting ideas for v2.0

On 7/2/2007 11:38 AM, David Fetter wrote:
> On Thu, Jun 28, 2007 at 06:17:25PM -0400, Christopher Browne wrote:
>> The v2.0 branch has already taken on the following fairly significant
>> changes (listed below).  Jan and I had a chat this afternoon about
>> some items still to do:
> 
> [snip]
> 
>> - Clone Node - to allow using pg_dump/PITR to populate a new
>>   subscriber node.
>> 
>> Are there more items we should try to add?
> 
> It would be really great to be able to promote a PITR node to a
> subscriber, or better still, to be able to use pg_dump + some kind of
> ancillary information to do same.  This would help a lot with the
> initial sync problem that makes replicating large databases so
> painful.

You are aware that only PITR will avoid a long running transaction?

The stuff I am currently (very slowly) working on is that very problem. 
Any long running transaction causes that the minxid in the SYNC's is 
stuck at that very xid during the entire runtime of the LRT. The problem 
with that is that the log selection in the slon worker uses an index 
scan who's only index scankey candidates are the minxid of one and the 
maxxid of another snapshot. That is the range of rows returned by the 
scan itself. Since the minxid is stuck, it will select larger and larger 
groups of log tuples only to filter out most of them on a higher level 
in the query via xxid_le_snapshot().
(Continue reading)

Marko Kreen | 2 Jul 2007 19:45
Picon

Re: Soliciting ideas for v2.0

On 7/2/07, Jan Wieck <JanWieck@...> wrote:
> The stuff I am currently (very slowly) working on is that very problem.
> Any long running transaction causes that the minxid in the SYNC's is
> stuck at that very xid during the entire runtime of the LRT. The problem
> with that is that the log selection in the slon worker uses an index
> scan who's only index scankey candidates are the minxid of one and the
> maxxid of another snapshot. That is the range of rows returned by the
> scan itself. Since the minxid is stuck, it will select larger and larger
> groups of log tuples only to filter out most of them on a higher level
> in the query via xxid_le_snapshot().

How the LRT problem is avoided in PGQ:

http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/skytools/skytools/sql/pgq/functions/pgq.batch_event_sql.sql?rev=1.2&content-type=text/x-cvsweb-markup

Basic idea is that there are only few LRT's, so its reasonable
to pick up bottom half of range by event txid, one-by-one.

--

-- 
marko
Jan Wieck | 2 Jul 2007 20:03
Picon
Favicon

Re: Soliciting ideas for v2.0

On 7/2/2007 1:45 PM, Marko Kreen wrote:
> On 7/2/07, Jan Wieck <JanWieck@...> wrote:
>> The stuff I am currently (very slowly) working on is that very problem.
>> Any long running transaction causes that the minxid in the SYNC's is
>> stuck at that very xid during the entire runtime of the LRT. The problem
>> with that is that the log selection in the slon worker uses an index
>> scan who's only index scankey candidates are the minxid of one and the
>> maxxid of another snapshot. That is the range of rows returned by the
>> scan itself. Since the minxid is stuck, it will select larger and larger
>> groups of log tuples only to filter out most of them on a higher level
>> in the query via xxid_le_snapshot().
> 
> How the LRT problem is avoided in PGQ:
> 
> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/skytools/skytools/sql/pgq/functions/pgq.batch_event_sql.sql?rev=1.2&content-type=text/x-cvsweb-markup
> 
> Basic idea is that there are only few LRT's, so its reasonable
> to pick up bottom half of range by event txid, one-by-one.

Hmmm, that is an interesting idea. And it is (in contrast to what I've 
been playing with) node insensitive, since it doesn't need info only 
available on the event origin, like CLOG. Thanks.

Jan

--

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#==================================================
(Continue reading)

Jan Wieck | 2 Jul 2007 21:49
Picon
Favicon

Re: Soliciting ideas for v2.0

On 7/2/2007 2:03 PM, Jan Wieck wrote:
> On 7/2/2007 1:45 PM, Marko Kreen wrote:
>> On 7/2/07, Jan Wieck <JanWieck@...> wrote:
>>> The stuff I am currently (very slowly) working on is that very problem.
>>> Any long running transaction causes that the minxid in the SYNC's is
>>> stuck at that very xid during the entire runtime of the LRT. The problem
>>> with that is that the log selection in the slon worker uses an index
>>> scan who's only index scankey candidates are the minxid of one and the
>>> maxxid of another snapshot. That is the range of rows returned by the
>>> scan itself. Since the minxid is stuck, it will select larger and larger
>>> groups of log tuples only to filter out most of them on a higher level
>>> in the query via xxid_le_snapshot().
>> 
>> How the LRT problem is avoided in PGQ:
>> 
>> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/skytools/skytools/sql/pgq/functions/pgq.batch_event_sql.sql?rev=1.2&content-type=text/x-cvsweb-markup
>> 
>> Basic idea is that there are only few LRT's, so its reasonable
>> to pick up bottom half of range by event txid, one-by-one.
> 
> Hmmm, that is an interesting idea. And it is (in contrast to what I've 
> been playing with) node insensitive, since it doesn't need info only 
> available on the event origin, like CLOG. Thanks.

Not only is it interesting, but it is astonishing simple to adopt into 
our code. I want to do some more testing before I commit this change, 
but the really interesting thing here is that it is only a 3 line change 
in the remote_worker.c file, which could easily be backported into 1.2.

I had created a really pathetic test case here by SIGSTOP'ing the slon 
(Continue reading)

Christopher Browne | 2 Jul 2007 21:53

Re: Soliciting ideas for v2.0

Jan Wieck <JanWieck@...> writes:
> On 7/2/2007 2:03 PM, Jan Wieck wrote:
>> On 7/2/2007 1:45 PM, Marko Kreen wrote:
>>> On 7/2/07, Jan Wieck <JanWieck@...> wrote:
>>>> The stuff I am currently (very slowly) working on is that very problem.
>>>> Any long running transaction causes that the minxid in the SYNC's is
>>>> stuck at that very xid during the entire runtime of the LRT. The problem
>>>> with that is that the log selection in the slon worker uses an index
>>>> scan who's only index scankey candidates are the minxid of one and the
>>>> maxxid of another snapshot. That is the range of rows returned by the
>>>> scan itself. Since the minxid is stuck, it will select larger and larger
>>>> groups of log tuples only to filter out most of them on a higher level
>>>> in the query via xxid_le_snapshot().
>>> How the LRT problem is avoided in PGQ:
>>> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/skytools/skytools/sql/pgq/functions/pgq.batch_event_sql.sql?rev=1.2&content-type=text/x-cvsweb-markup
>>> Basic idea is that there are only few LRT's, so its reasonable
>>> to pick up bottom half of range by event txid, one-by-one.
>> Hmmm, that is an interesting idea. And it is (in contrast to what
>> I've been playing with) node insensitive, since it doesn't need info
>> only available on the event origin, like CLOG. Thanks.
>
> Not only is it interesting, but it is astonishing simple to adopt into
> our code. I want to do some more testing before I commit this change,
> but the really interesting thing here is that it is only a 3 line
> change in the remote_worker.c file, which could easily be backported
> into 1.2.
>
> I had created a really pathetic test case here by SIGSTOP'ing the slon
> while doing the copy_set() for a day, so it had some 90000 events
> backlog. About a third into that backlog, it was down to 60+ seconds
(Continue reading)

Andrew Hammond | 3 Jul 2007 01:50
Picon
Gravatar

Re: Soliciting ideas for v2.0

> Drew Hammond's keen on having some BSD-oriented scripts put into the
> 1.2 branch that I had only put into HEAD; this might be an excuse for
> a 1.2.11.

s/BSD/djb daemontools/

Andrew
Bill Moran | 3 Jul 2007 04:54
Favicon

timestamp with time zone insanity


Postgresl 8.1.9 replicating to 8.1.8 using Slony 1.2.6.

I have a number of "timestamp with time zone" columns that simply will
not replicate correctly and I'm stumped to the point of insanity as
to why.

On both systems, the database is set to est5edt timezone, yet some
rows (not all, mind you) change the timezone from edt to est when
the data is replicated.

It's always the same rows, but not all rows in the table.

This has obviously got something to do with the data, but I'm befuddled
as to what to do about it, or even how to determine the exact nature
of the problem.  New rows are being added to this table all the time
and they are replicated correctly, it seems as if it's always the same
rows that magically switch timezones during replication.

I've gone so far as to completely deinstall Slony from this database and
rebuild the Slony config from scratch, and it still does the exact same
thing.

Any help will be met with great appreciation.

--

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

(Continue reading)

Jan Wieck | 3 Jul 2007 05:15
Picon
Favicon

Re: Soliciting ideas for v2.0

On 7/2/2007 3:53 PM, Christopher Browne wrote:
> Jan Wieck <JanWieck@...> writes:
>> On 7/2/2007 2:03 PM, Jan Wieck wrote:
>>> On 7/2/2007 1:45 PM, Marko Kreen wrote:
>>>> On 7/2/07, Jan Wieck <JanWieck@...> wrote:
>>>>> The stuff I am currently (very slowly) working on is that very problem.
>>>>> Any long running transaction causes that the minxid in the SYNC's is
>>>>> stuck at that very xid during the entire runtime of the LRT. The problem
>>>>> with that is that the log selection in the slon worker uses an index
>>>>> scan who's only index scankey candidates are the minxid of one and the
>>>>> maxxid of another snapshot. That is the range of rows returned by the
>>>>> scan itself. Since the minxid is stuck, it will select larger and larger
>>>>> groups of log tuples only to filter out most of them on a higher level
>>>>> in the query via xxid_le_snapshot().
>>>> How the LRT problem is avoided in PGQ:
>>>> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/skytools/skytools/sql/pgq/functions/pgq.batch_event_sql.sql?rev=1.2&content-type=text/x-cvsweb-markup
>>>> Basic idea is that there are only few LRT's, so its reasonable
>>>> to pick up bottom half of range by event txid, one-by-one.
>>> Hmmm, that is an interesting idea. And it is (in contrast to what
>>> I've been playing with) node insensitive, since it doesn't need info
>>> only available on the event origin, like CLOG. Thanks.
>>
>> Not only is it interesting, but it is astonishing simple to adopt into
>> our code. I want to do some more testing before I commit this change,
>> but the really interesting thing here is that it is only a 3 line
>> change in the remote_worker.c file, which could easily be backported
>> into 1.2.
>>
>> I had created a really pathetic test case here by SIGSTOP'ing the slon
>> while doing the copy_set() for a day, so it had some 90000 events
(Continue reading)


Gmane