Toshihiro Kitagawa | 1 Jun 13:38 2011

pgpool-II 3.0.4 released

Pgpool Global Development Group is pleased to announce the availability
of pgpool-II 3.0.4, the latest stable version of pgpool-II 3.0 series.

This version fixes various bugs since 3.0.3.

Incompatible changes
====================

- In streaming replication, if delay_threshold is 0 or health
  checking is disabled, delay checking is not performed.
  This matches the behaviour described in the pgpool-II manual;
  previously, delay checking was performed even when health
  checking was disabled. (Guillaume Lelarge)

Bug fixes
=========

- Fix pgpool-regclass() so that it compiles on PostgreSQL 8.0 or later.
  7.4 still produces errors. (Tatsuo)

- Fix possible hangup when using the /*NO LOAD BALANCE*/ comment in
  streaming replication. (Kitagawa)

- Fix hangup when a Flush(H) message or CloseComplete(C) message is
  received. (Kitagawa)

- Fix possible hangup depending on the timing of receiving the
  ReadyForQuery(Z) message after pgpool-II connects to backends. (Kitagawa)

- Add description about parameters for recovery_1st_stage_command

Drew Robb | 2 Jun 18:17 2011

Parallel query order by / limit

Hello, I'm fairly new to pgpool (enjoying it so far) and I'm trying to understand parallel query behavior. I'm using pgpool 3.0.4 and PostgreSQL 9.0.

I have a parallel table USERS distributed via the primary key user_id.  When I execute the following query

select user_id, date_created from users order by date_created limit 10;

I get the following output from pgpool debug:
statement2: SELECT pool_parallel("SELECT user_id, users.date_created FROM users")

My understanding is that the parallel engine is unable to translate the ORDER BY and LIMIT clauses into the query executed on the worker nodes. This creates a big performance problem, since all records in the table are seemingly being processed. I read the documentation but was unable to figure out why this is the case. It seems to me that the ORDER BY and LIMIT clauses could be applied on the workers (returning up to N*10 records) and then reapplied to filter these results down to just 10 records. Is the behavior I'm seeing expected, and does the optimization I had in mind make sense?
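The optimization described above is a scatter-gather top-k. A minimal Python sketch of the idea (purely an illustration of the query-rewrite I have in mind, not of anything pgpool actually does):

```python
import heapq

def local_top_k(rows, k=10):
    """What 'ORDER BY date_created LIMIT k' would return on one worker.
    Each row is a (user_id, date_created) tuple."""
    return sorted(rows, key=lambda r: r[1])[:k]

def global_top_k(per_node_rows, k=10):
    """Coordinator side: merge the sorted partial results (at most
    N*k rows in total) and re-apply the limit to get the global top k."""
    partials = [local_top_k(rows, k) for rows in per_node_rows]
    return list(heapq.merge(*partials, key=lambda r: r[1]))[:k]
```

Since each worker already returns its rows sorted, the coordinator only has to merge N small sorted lists instead of sorting the whole table.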

Either way, is there a way I could specify exactly the parallel query to be executed on the workers to avoid this problem? I know that creating a particular view and registering it in the dist_def table might work, but this isn't very dynamic.

Thanks,
Drew
_______________________________________________
Pgpool-general mailing list
Pgpool-general@...
http://pgfoundry.org/mailman/listinfo/pgpool-general
Bob Lunney | 2 Jun 18:48 2011

Re: Parallel query order by / limit

Drew,

PgPool has to get all of the rows to order them; only then can it limit how many are returned.  Unfortunately,
there is no magic distributed index that can help with this operation.  (Imagine fetching a row through
an index on one backend database and comparing it against all the other backend databases to determine
whether that row is included in the result set.  Yikes!)  You will find similar issues with GROUP BY and
other clauses that require a sort.  Subqueries can be problematic, too.

Note that the data is returned to the PostgreSQL cluster running pgpool for your sort and limit operations.  This
operation uses dblink, which allocates memory to hold the rows and can exhaust memory if the amount of data
is too large.  If you want to run a partitioned data warehouse, I suggest looking at Greenplum.  If you
have targeted data to retrieve, pgpool can do that.
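The "targeted data" point can be illustrated with a toy routing sketch (a hypothetical modulo partitioning rule standing in for whatever dist_def function you define; not pgpool code): a query that filters on the distribution key can be answered by a single backend, while a global sort/limit has to scatter to every node.

```python
def node_for(user_id, n_nodes):
    # Hypothetical partitioning rule: route by the distribution key.
    return user_id % n_nodes

def backends_touched(where_user_id=None, n_nodes=3):
    """Targeted lookup on the distribution key hits one backend;
    anything else (e.g. a global ORDER BY ... LIMIT) must touch all."""
    if where_user_id is not None:
        return [node_for(where_user_id, n_nodes)]
    return list(range(n_nodes))
```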

PgPool works well in parallel mode if you keep these limitations in mind.  I'm running several ginormous
databases with it and get excellent response times.

Good luck!

Bob Lunney
Vincent Sheffer | 2 Jun 20:27 2011

JDBC Query Not Working in Parallel Mode

Here are my particulars:

pgpool-II 3.0.3
postgresql 9.0.4
JDBC4 driver v9.0-801
JDK 1.6 running on Ubuntu Linux

I've got pgpool-II configured in parallel mode and have partitioning rules set up via dist_def and
systemdb.  I run a simple, non-prepared-statement query via JDBC and get no results back.  The exact same query
(copied and pasted from the pgpool-II debug log) run through psql returns the correct result set.

I am continuing to investigate on my end, but any help here would be greatly appreciated.

Thanks,
Vince
Toshihiro Kitagawa | 3 Jun 04:18 2011

Re: JDBC Query Not Working in Parallel Mode


It's a restriction in parallel mode.

Excerpts from pgpool-II manual:
* Extended Query Protocol (for parallel mode)
  The extended query protocol used by JDBC drivers, etc. is not supported. 

JDBC drivers (except very early ones) use the extended query protocol
regardless of whether a prepared statement is used. On the other hand,
drivers based on libpq use the extended query protocol only if the
prepared-statement API is called.


--
Toshihiro Kitagawa
SRA OSS, Inc. Japan
Vincent Sheffer | 3 Jun 17:10 2011

Re: JDBC Query Not Working in Parallel Mode

This is what my manual says:

"The extended query protocol used by JDBC drivers, etc. is not supported. The simple query protocol must be used. This means you can't use prepared statements."

I am not using prepared statements.


Error during md5 authentication

Hello, everyone on the list. When I try to connect to the PostgreSQL servers through pgpool-II using the md5
authentication method, this error is shown in the pgpool-II log file:

trying md5 authentication
2011-06-05 18:00:55 DEBUG: pid 8147: read_password_packet_password: failed to read password packet "p"
2011-06-05 18:00:55 ERROR: pid 8147: do_md5: read_password_packet failed
2011-06-05 18:00:55 DEBUG: pid 8147: do_md5failed in slot 0

Can anybody tell me the reason for this problem?

thank you very much for your time.
Regards.

About insert_lock = true


Hello everyone on the list. I have a question related to the insert_lock property. The documentation says
that if insert_lock is set to true, pgpool-II automatically rewrites each INSERT into the following queries:

BEGIN;
LOCK TABLE ...
INSERT INTO ...
COMMIT;

But in the PostgreSQL log file, I don't see a LOCK TABLE statement. The lines written to the PostgreSQL log file are:

2011-06-05 16:59:16 VET [18284]: [3-1] LOG:  statement: BEGIN
2011-06-05 16:59:16 VET [18284]: [4-1] LOG:  duration: 0.162 ms
2011-06-05 16:59:16 VET [18284]: [5-1] LOG:  statement: SELECT 1 FROM prueba_id_seq FOR UPDATE
2011-06-05 16:59:16 VET [18284]: [6-1] LOG:  duration: 0.771 ms
2011-06-05 16:59:16 VET [18284]: [7-1] LOG:  statement: INSERT INTO prueba(nombre) VALUES('Ire');
2011-06-05 16:59:16 VET [18284]: [8-1] LOG:  duration: 0.681 ms
2011-06-05 16:59:16 VET [18284]: [9-1] LOG:  statement: COMMIT

Is this the expected behavior? I'm using pgpool-3.0.4 with PostgreSQL 8.3.14.

Thank you very much for your time.

Regards.
Tatsuo Ishii | 6 Jun 02:43 2011

Re: About insert_lock = true

> Hello every one in the list, I have a question related with insert_lock property. In the documentation
> says that if insert_lock is set to true pgpool-II automatically adds the next queries each time an INSERT
> is executed:
> 
> 
> BEGIN;
> LOCK TABLE ...
> INSERT INTO ...
> COMMIT;
> 
> But in postgresql logs file, I don't see a lock table statement. The logs writing in postgresql logs file are:

From the document:

	   pgpool-II 3.0 or later uses a row lock against the sequence
	   relation, rather than table lock. This will minimize lock
	   conflict with VACUUM (including autovacuum).

So you see "SELECT 1 FROM prueba_id_seq FOR UPDATE" instead of LOCK TABLE...

> 2011-06-05 16:59:16 VET [18284]: [3-1] LOG:  statement: BEGIN
> 2011-06-05 16:59:16 VET [18284]: [4-1] LOG:  duration: 0.162 ms
> 2011-06-05 16:59:16 VET [18284]: [5-1] LOG:  statement: SELECT 1 FROM prueba_id_seq FOR UPDATE
> 2011-06-05 16:59:16 VET [18284]: [6-1] LOG:  duration: 0.771 ms
> 2011-06-05 16:59:16 VET [18284]: [7-1] LOG:  statement: INSERT INTO prueba(nombre) VALUES('Ire');
> 2011-06-05 16:59:16 VET [18284]: [8-1] LOG:  duration: 0.681 ms
> 2011-06-05 16:59:16 VET [18284]: [9-1] LOG:  statement: COMMIT
> 
> This is an expected behavior?? I'm using pgpool-3.0.4 with postgresql 8.3.14

Yes.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Re: About insert_lock = true

Tatsuo, thank you very much for your answer.
Regards.
