Jonathan Vanasco | 4 Jul 22:30 2015

trouble converting several serial queries into a parallel query

I have a very simple query that is giving me some issues due to the size of the database and the number of
requests I make to it in order to compile the report I need:

A dumbed down version of the table and query:

	CREATE TABLE a_to_b (
		id_a INT NOT NULL REFERENCES table_a(id), 
		id_b INT NOT NULL REFERENCES table_b(id),
		PRIMARY KEY (id_a, id_b)
	);
	SELECT id_a, id_b FROM a_to_b WHERE id_a = 1 LIMIT 5;

The problem is that the table has a few million records and I need to query it 30+ times in a row.  

I'd like to improve this with a parallel search using `IN()`:

	SELECT id_a, id_b FROM a_to_b WHERE id_a IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30);

That technique has generally fixed a lot of bottlenecks for us.

However, I can't wrap my head around structuring it so that I can apply a limit per value of the column -- so that I
only get 5 records per id_a.

The table has columns that I would use for ordering in the future, but I'm fine with just getting random
rows right now.

Can anyone offer some suggestions?  Thanks in advance.
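
A per-group limit like this can be written with a LATERAL subquery (PostgreSQL 9.3+). A minimal
sketch against the table above, feeding the ids in as a VALUES list:

	-- Minimal sketch, assuming 9.3+ for LATERAL: at most 5 rows of a_to_b
	-- per id_a, fetched in one query instead of 30 serial ones.
	SELECT lim.id_a, lim.id_b
	FROM (VALUES (1), (2), (3)) AS ids (id_a)  -- ... extend through (30)
	CROSS JOIN LATERAL (
		SELECT id_a, id_b
		FROM a_to_b
		WHERE a_to_b.id_a = ids.id_a
		LIMIT 5
	) AS lim;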


Jimit Amin | 4 Jul 21:19 2015

could not fork new process for connection: Resource temporarily unavailable

Hello,

I have a production database (9.3 PPAS) under heavy transaction load. Today the database was unable to give out new processes. I checked pg_stat_activity; there were many transactions in the waiting state because of one procedure and a lock on one table (code inside the procedure):

SELECT cnt
FROM table_name_seq_data
WHERE circle = v_cir_cd
  AND type = P_NAME
FOR UPDATE;

I terminated all transactions related to that procedure, but the locks occurred again and no new processes were allowed. I went through this three times. After that I restarted the server, and then the procedure executed in very few ms. I have to take a lock on this table (table_name_seq_data).

Log file excerpt:

  could not fork new process for connection: Resource temporarily unavailable
  could not fork new process for connection: Resource temporarily unavailable
  could not fork autovacuum worker process: Resource temporarily unavailable
  could not fork new process for connection: Resource temporarily unavailable
  could not fork new process for connection: Resource temporarily unavailable


I think that because of the application workload this table (table_name_seq_data) is not being vacuumed. The table contains only 50 rows and only updates occur on it, and before the restart I was not able to select from this table.
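
Before terminating anything, the blocker can usually be identified by joining pg_locks to
pg_stat_activity (a minimal diagnostic sketch for 9.3):

-- Minimal diagnostic sketch: backends holding or waiting on locks for the
-- hot table, joined to their current query and state.
SELECT l.pid, l.mode, l.granted, a.state, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'table_name_seq_data'::regclass
ORDER BY l.granted DESC, l.pid;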


------------------------------------------------------------------------------------------
# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0 # 0-100 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
vacuum_cost_limit = 1000 # 1-10000 credits

#vacuum_defer_cleanup_age = 0 # number of xacts by which cleanup is delayed

#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------

autovacuum = on # Enable autovacuum subprocess?  'on'
# requires track_counts to also be on.
#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
# their durations, > 0 logs only
# actions running at least this number
# of milliseconds.
autovacuum_max_workers = 6 # max number of autovacuum subprocesses
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
autovacuum_vacuum_threshold = 2000 # min number of row updates before
# vacuum
autovacuum_analyze_threshold = 2000 # min number of row updates before
# analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
# (change requires restart)
#autovacuum_vacuum_cost_delay = 20ms # default vacuum cost delay for
# autovacuum, in milliseconds;
# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
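
For a tiny, constantly updated table like table_name_seq_data, per-table storage parameters can make autovacuum far more aggressive than the global autovacuum_vacuum_threshold = 2000 above. A hedged sketch; the values are illustrative and need tuning against the real update rate:

-- Hedged sketch: vacuum after ~50 row updates instead of the global 2000.
-- Values are illustrative only.
ALTER TABLE table_name_seq_data SET (
    autovacuum_vacuum_threshold = 50,
    autovacuum_vacuum_scale_factor = 0.0
);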

Jimit Amin | 4 Jul 11:50 2015

String match function required like utl_match in oracle

Hello,

I want to compare 2 strings and get a result like a percentage, or how much of one string matches the other. Is there any option like utl_match in Oracle?

Jimit Amin
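
The contrib modules pg_trgm and fuzzystrmatch cover this ground. A minimal sketch; the sample strings are illustrative:

CREATE EXTENSION IF NOT EXISTS pg_trgm;        -- trigram similarity
CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;  -- edit distance et al.

SELECT similarity('PostgreSQL', 'PostgersSQL');   -- 0..1, higher = closer
SELECT levenshtein('PostgreSQL', 'PostgersSQL');  -- single-character edits apart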

john.tiger | 4 Jul 03:28 2015

return jsonb without field label

I have a table created with:  create table mydocs (id serial primary key, data jsonb)

When I do a select data from mydocs, it comes back as {data: {my jsonb
fields}}, which then gets passed into a template - I don't want to have to
keep using  <%= article.data.name %>  instead of just data.name.

Probably easy, but I can't find out how - any suggestions?

ps: using the new, very light pg orm library massive.js (npm massive) - easy
to run direct sql commands - very nice.
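
One way around the wrapper is to project the document's fields as top-level columns, so the
driver never sees a data key. A minimal sketch, where name is a hypothetical key inside the document:

-- Minimal sketch: pull fields out of the jsonb document as plain columns.
-- 'name' is a hypothetical key; ->> returns the value as text.
SELECT id,
       data->>'name' AS name
FROM mydocs;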


Edson F. Lidorio | 4 Jul 01:32 2015

Download PostgreSQL 9.5 Alpha

Hello,

PostgreSQL 9.5 Alpha does not appear in the downloads list in [1].
Where do I download it for Windows?

[1] 
http://www.enterprisedb.com/products-services-training/pgdownload#windows

--
Edson


Jean-Gérard Pailloncy | 3 Jul 10:15 2015

PostgreSQL & VMWare

Hi,

I work on a project that collects geolocalized data.
All data will be in PostgreSQL / PostGIS.

The small PostgreSQL databases will be on Linux guests on VMWare hosts.

The size of the main database will grow by 50 TB / year, 500 M rows / day.
For the largest one, we plan to test different options.
One of them is to stay with Linux on VMware.
Apart from questions about schema and sharding, I would appreciate it if some of you have information,
benchmarks, or stories about big PostgreSQL databases on Linux guests on VMware hosts.

Regards,
Jean-Gérard Pailloncy


David G. Johnston | 3 Jul 17:27 2015

Unusual sorting requirement (mixed enum/non-enum) - need thoughts

CREATE TYPE enum_type AS ENUM ('X-One','A-Two');

SELECT * 
FROM (VALUES 
('Not Enum'::text, 1::int, 'Uno'::text), 
('Not Enum', 2, 'Dos'), 
('Enum', 4, 'X-One'),
('Enum', 3, 'A-Two')) val (flag, id, val)
;

I need to write an ORDER BY clause that will result in the output of: 1, 2, 4, 3

Basically, if the value is capable of being cast to the enum, it should be, and the sort order of the enum used; otherwise order on the value of id.

It doesn't matter how the two groups, enums and non-enums, sort relative to each other - for any execution of the query either all values will be enum-able or none will be.

I'm trying to do this in pure SQL, though I suspect I'll have to do it in pl/pgsql, put the "is enum-able" check external to the query, and either use dynamic SQL or write two separate queries.

I tried casting the enum to an integer but it would not let me :(

Thanks in advance for any thoughts.

David J.
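
One possible pure-SQL angle, as a hedged sketch: enum_range returns the type's values in their declared sort order, which yields an integer sort key that can share a CASE branch with id (array_position requires 9.5+):

-- Hedged sketch: derive an integer key from the enum's declared order.
-- Sorting on flag first keeps the two groups apart, which the question
-- says is acceptable.
SELECT *
FROM (VALUES
  ('Not Enum'::text, 1::int, 'Uno'::text),
  ('Not Enum', 2, 'Dos'),
  ('Enum', 4, 'X-One'),
  ('Enum', 3, 'A-Two')) val (flag, id, val)
ORDER BY flag,
         CASE WHEN flag = 'Enum'
              THEN array_position(enum_range(NULL::enum_type)::text[], val)
              ELSE id
         END;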

Fabio Pardi | 3 Jul 12:14 2015

max number of locks

Hi,


While experimenting with the number of locks, I found something I cannot understand.

From what I can read in the documentation, at any one given time, a query can obtain a max number of locks given by

max_locks_per_transaction * (max_connections + max_prepared_transactions)

I then changed my db to use these settings:

mydb=# show max_locks_per_transaction ;
 max_locks_per_transaction
---------------------------
 20
(1 row)

mydb=# show max_connections ;
 max_connections
-----------------
 2
(1 row)

mydb=# show  max_prepared_transactions ;
 max_prepared_transactions
---------------------------
 0
(1 row)

So I expected to be able to acquire a maximum of 40 locks.


Then:


mydb=# begin transaction ;
BEGIN
mydb=# SELECT 1 FROM root.ac;
 ?column?
----------
(0 rows)

mydb=# select count(*) from pg_locks ;
 count
-------
   132
(1 row)


Why can I acquire 132 locks while the expected number is 40? What am I doing wrong?

I'm running Postgres 9.2.6.



Thanks for your time,



Fabio
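
Part of the discrepancy is that not every pg_locks row draws on the shared lock table that the formula sizes: virtualxid entries and fast-path locks are accounted separately, and the formula describes a shared pool rather than a hard per-backend cap. A small sketch to break the count down (the fastpath column exists since 9.2):

-- Break the 132 down by type: virtualxid and fast-path entries do not
-- consume slots in the shared lock table sized by the formula above.
SELECT locktype, fastpath, count(*)
FROM pg_locks
GROUP BY locktype, fastpath
ORDER BY count(*) DESC;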

 
howardnews@selestial.com | 3 Jul 11:59 2015

Backup Method

Hi everyone,

I am reaching the point with my database backups where the backups are 
taking too long, and starting to interfere with running of the system 
during the day. So I am looking for a bit of sage advice as to how to 
proceed.

For a typical server, I have a single database cluster with multiple
databases that I back up nightly using sequential pg_dumps to NFS.

So what I am looking for is a backup regime that will still run
overnight but is faster or has less impact than the pg_dumps.

Would a file-system-level backup in PITR form be the better option, or is
pg_dumpall more efficient than pg_dump?

Look forward to your advice.

Howard.
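
Two directions worth timing, as a hedged sketch with illustrative paths and job count: a parallel directory-format dump (pg_dump -j, available since 9.3), and a file-system-level base backup that enables PITR:

# Hedged sketch; paths and the -j job count are illustrative.
# Parallel directory-format dump:
pg_dump -Fd -j 4 -f /mnt/nfs/backups/mydb.dir mydb

# File-system-level alternative enabling PITR: a base backup plus WAL
# archiving (requires wal_level and archive_mode to be configured):
pg_basebackup -D /mnt/nfs/backups/base -Ft -z -P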


Christian Schröder | 3 Jul 07:35 2015

Slow index performance

Hi all,
we have a strange performance issue in one of our databases (using PostgreSQL 9.1.18). Maybe you can help me
understand what’s going on.

We have two identical tables (rec_isins_current, rec_isins_archive) with the following structure:

Table "ts_frontend.rec_isins_current"
   Column   |  Type   | Modifiers
------------+---------+-----------
 attachment | integer | not null
 isin       | isin    | not null
Indexes:
    "rec_isins_current_pkey" PRIMARY KEY, btree (attachment, isin), tablespace "extra"
    "rec_isins_current_attachment" btree (attachment), tablespace "extra"
Foreign-key constraints:
    "rec_isins_attachment_fkey" FOREIGN KEY (attachment) REFERENCES ts_frontend.attachments(id) ON
UPDATE RESTRICT ON DELETE CASCADE
Inherits: ts_frontend.rec_isins

The isin type is a domain type which has char(12) as its base type.
Both tables inherit from ts_frontend.rec_isins, which is empty and is only used to search both tables in a
single query.

When we search for an isin in both tables (using the parent table, but the behavior is the same if we directly
search in one of the tables), the primary key index is used. However, while the archive table is pretty
fast, the current table is much slower:

# explain analyze select * from ts_frontend.rec_isins where isin = 'foo';
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..565831.43 rows=501 width=17) (actual time=6080.778..6080.778 rows=0 loops=1)
   ->  Append  (cost=0.00..565831.43 rows=501 width=17) (actual time=6080.777..6080.777 rows=0 loops=1)
         ->  Seq Scan on rec_isins  (cost=0.00..0.00 rows=1 width=36) (actual time=0.001..0.001 rows=0 loops=1)
               Filter: ((isin)::bpchar = 'foo'::bpchar)
         ->  Index Scan using rec_isins_archive_pkey on rec_isins_archive rec_isins  (cost=0.00..621.61 rows=405 width=17) (actual time=10.335..10.335 rows=0 loops=1)
               Index Cond: ((isin)::bpchar = 'foo'::bpchar)
         ->  Index Scan using rec_isins_current_pkey on rec_isins_current rec_isins  (cost=0.00..565209.82 rows=95 width=17) (actual time=6070.440..6070.440 rows=0 loops=1)
               Index Cond: ((isin)::bpchar = 'foo'::bpchar)
 Total runtime: 6080.824 ms

This is strange, because the archive table is four times larger than the current table and the archive index
is also four times larger than the current index:

           relname            | relfilenode | reltablespace | pg_table_size
------------------------------+-------------+---------------+---------------
 rec_isins                    |   514533886 |             0 |          8192
 rec_isins_pkey               |   514533892 |             0 |          8192
 rec_isins_attachment         |   514533899 |             0 |          8192
------------------------------+-------------+---------------+---------------
 rec_isins_archive            |   507194804 |             0 |   10923393024
 rec_isins_archive_pkey       |   507197615 |     139300915 |    9048784896
 rec_isins_archive_attachment |   507197692 |     139300915 |    4706050048
------------------------------+-------------+---------------+---------------
 rec_isins_current            |   631621090 |             0 |    2696216576
 rec_isins_current_pkey       |   631621096 |     139300915 |    2098552832
 rec_isins_current_attachment |   631621107 |     139300915 |    1160683520

Both tables are in the same tablespace (and thus on the same disk) and both indexes are also in the same
tablespace (but in another than the tables).
The current table has been vacuumed full and reindexed.

Can anybody explain the difference? Why is the current table so slow? And what can we do to improve performance?

Thanks for your help,
Christian

------------------------------------------------------------
Deriva GmbH Financial IT and Consulting
Christian Schröder
Geschäftsführer
Hans-Böckler-Straße 2 | D-37079 Göttingen
Tel: +49 (0)551 489 500-42
Fax: +49 (0)551 489 500-91
http://www.deriva.de

Amtsgericht Göttingen | HRB 3240
Geschäftsführer: Christian Schröder
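
A way to narrow this down is to compare the buffer traffic of the two index scans (a minimal diagnostic sketch; the BUFFERS option exists in 9.1):

-- Minimal diagnostic sketch: buffer counts show whether the slow scan
-- reads far more (possibly bloated or uncached) index pages.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM ts_frontend.rec_isins_current WHERE isin = 'foo';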


Alexander Shereshevsky | 2 Jul 16:31 2015

record from plpgsql function performance

Hello,

I have a simple function. The returned data set is generated from a view (dynamic - it can change on a daily basis).
So the function was defined this way:

 1. it returns setof some_view as ...
 2. inside the function I generate dynamic SQL into a v_sql variable.
 3. return query execute v_sql

Everything works fine if I run the function on its own, like:
select * from function (param1, param2)
I get the delimited fields in the desired order.

But if I want to run the function inline, like:
select function(param1, param2) from some_table;
then the returned datatype is record.

To return multiple fields, I'm trying to use:
select (function(param1, param2)).* from some_table;
But this has a huge performance impact - IMHO the runtime is multiplied by the number of columns, because the function is executed once per column.
In my case a normal inline run takes about 2 seconds for 300-400 records, but with ().* it increases to 90-120 seconds.

Thank you in advance if you can suggest a better way.

BR,
Alexander Shereshevsky 
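
Since this is 9.3+, a LATERAL call avoids the ().* re-evaluation: the function runs once per input row and its columns come back already expanded. A minimal sketch, assuming param1 and param2 are columns of some_table and my_function stands in for the real function:

-- Minimal sketch: LATERAL (9.3+) expands the set-returning function's
-- result columns without re-executing it once per output column.
SELECT t.*, f.*
FROM some_table t,
     LATERAL my_function(t.param1, t.param2) f;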
