pinker | 23 Oct 15:24 2014
Picon

Checkpoints tuning

I have saved data from the pg_stat_bgwriter view, following Greg Smith's advice
from his book:
select now(),* from pg_stat_bgwriter; 

and then aggregated the data with a query from his book as well.
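
In case it is relevant, the aggregation boils down to computing deltas between successive snapshots, roughly like this (just a sketch - it assumes the snapshot rows were saved into a table called bgwriter_snapshots, which is not necessarily how Greg's query is written):

select "now" as snapshot_time,
       checkpoints_timed  - lag(checkpoints_timed)  over w as timed_checkpoints,
       checkpoints_req    - lag(checkpoints_req)    over w as requested_checkpoints,
       buffers_checkpoint - lag(buffers_checkpoint) over w as buffers_written_by_checkpoints,
       buffers_clean      - lag(buffers_clean)      over w as buffers_written_by_bgwriter,
       buffers_backend    - lag(buffers_backend)    over w as buffers_written_by_backends,
       buffers_alloc      - lag(buffers_alloc)      over w as buffers_allocated
  from bgwriter_snapshots
window w as (order by "now");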

checkpoint_segments was 30 at first; the next day I increased it to 200, and the
results changed:

<http://postgresql.1045698.n5.nabble.com/file/n5824026/Auswahl_235.png> 

Now the percentage of checkpoints required because of the number of segments is
bigger, and the backend writer share is also too high - I assume that's not what
should happen.
I'm also not sure how to interpret the correlation between allocated and written
data. Is a bigger amount of data written per second a good sign?


Montana Low | 22 Oct 00:25 2014

ERROR: out of memory | with 23GB cached 7GB reserved on 30GB machine

I'm running postgres-9.3 on a 30GB EC2 Xen instance with Linux kernel 3.16.3. I see numerous "ERROR: out of memory" messages in the log, which abort client requests, even though there appears to be 23GB available in the OS cache.

There is no swap on the box. Postgres sits behind pgbouncer, which protects it from the 200 real clients by limiting connections to 32; there are rarely more than 20 active connections, even though Postgres's max_connections is set very high for historical reasons. There is also a 4GB Java process running on the box.




relevant postgresql.conf:

max_connections = 1000                  # (change requires restart)
shared_buffers = 7GB                    # min 128kB
work_mem = 40MB                         # min 64kB
maintenance_work_mem = 1GB              # min 1MB
effective_cache_size = 20GB



sysctl.conf:

vm.swappiness = 0
vm.overcommit_memory = 2
kernel.shmmax=34359738368
kernel.shmall=8388608



log example:

ERROR:  out of memory
DETAIL:  Failed on request of size 67108864.
STATEMENT:  SELECT  "package_texts".* FROM "package_texts"  WHERE "package_texts"."id" = $1 LIMIT 1



example pg_top, showing 23GB available in cache:

last pid:  6607;  load avg:  3.59,  2.32,  2.61;       up 16+09:17:29 20:49:51
18 processes: 1 running, 17 sleeping
CPU states: 22.5% user,  0.0% nice,  4.9% system, 63.2% idle,  9.4% iowait
Memory: 29G used, 186M free, 7648K buffers, 23G cached
DB activity: 2479 tps,  1 rollbs/s, 217 buffer r/s, 99 hit%,  11994 row r/s, 3820 row w/s  
DB I/O:     0 reads/s,     0 KB/s,     0 writes/s,     0 KB/s  
DB disk: 149.8 GB total, 46.7 GB free (68% used)
Swap:



example top showing the only other significant 4GB process on the box:

top - 21:05:09 up 16 days,  9:32,  2 users,  load average: 2.73, 2.91, 2.88
Tasks: 147 total,   3 running, 244 sleeping,   0 stopped,   0 zombie
%Cpu(s): 22.1 us,  4.1 sy,  0.0 ni, 62.9 id,  9.8 wa,  0.0 hi,  0.7 si,  0.3 st
KiB Mem:  30827220 total, 30642584 used,   184636 free,     7292 buffers
KiB Swap:        0 total,        0 used,        0 free. 23449636 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7407 postgres  20   0 7604928  10172   7932 S  29.6  0.0   2:51.27 postgres
10469 postgres  20   0 7617716 176032 160328 R  11.6  0.6   0:01.48 postgres
10211 postgres  20   0 7630352 237736 208704 S  10.6  0.8   0:03.64 postgres
18202 elastic+  20   0 8726984 4.223g   4248 S   9.6 14.4 883:06.79 java
9711 postgres  20   0 7619500 354188 335856 S   7.0  1.1   0:08.03 postgres
3638 postgres  20   0 7634552 1.162g 1.127g S   6.6  4.0   0:50.42 postgres
 
Björn Wittich | 21 Oct 19:26 2014

extremely bad select performance on huge table

Hi newsgroup,

I have a very large table (70 million rows) with a text key of about 30
characters per key. A select of this indexed column ("mycolumn", covered by the
index "myprimkey") took more than 30 minutes.

Here is the explain (analyze, buffers) for: select mycolumn from myhugetable

"Index Only Scan using myprimkey on myhugetable  (cost=0.00..8224444.82 
rows=71768080 width=33) (actual time=16.722..2456300.778 rows=71825999 
loops=1)"

"  Heap Fetches: 356861"

"Total runtime: 2503009.611 ms"

Even repeating the query does not show any performance improvement. I
assume that the index itself is too large for my db cache. What can I do
to gain performance? Which parameters can I adapt? I have a large Linux
machine with 72 GB RAM.
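
To check that assumption, I suppose something like this would show the sizes involved (using the placeholder names from above):

select pg_size_pretty(pg_table_size('myhugetable'))  as table_size,
       pg_size_pretty(pg_relation_size('myprimkey')) as index_size;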

Note: This select is just for testing. My final statement will be a join 
on this table via the "mycolumn" column.

Thanks for your help
Björn


john | 21 Oct 14:57 2014

Query Performance Problem



Hi all,

I'm experimenting with table partitioning through inheritance. I'm testing a query as follows:

explain (analyze, buffers)
select response.id
from claim.response
where response.account_id = 4766
and response.expire_timestamp is null
and response.create_timestamp >= DATE '2014-08-01'
order by create_timestamp;

The response table looks like this:
"account_id";"integer"
"file_type_id";"integer"
"receiver_inbound_detail_id";"integer"
"processing_status_id";"integer"
"processing";"boolean"
"expire_timestamp";"timestamp without time zone"
"last_mod_timestamp";"timestamp without time zone"
"create_timestamp";"timestamp without time zone"
"response_trace_nbr";"character varying"
"posted_timestamp";"timestamp without time zone"
"need_to_post";"boolean"
"response_message";"text"
"worked";"boolean"
"response_status_id";"integer"
"response_type_id";"integer"
"outbound_claim_detail_id";"bigint"
"id";"bigint"

Here are some rowcounts:

SELECT count(*) from claim_response.response_201408;
  count
---------
 4585746
(1 row)

Time: 7271.054 ms
SELECT count(*) from claim_response.response_201409;
  count
---------
 3523370
(1 row)

Time: 4341.116 ms
SELECT count(*) from claim_response.response_201410;
 count
-------
   154
(1 row)

Time: 0.258 ms

The entire table has 225,665,512 rows. I read that a partitioning rule of thumb is that benefits of partitioning occur starting around 100 million rows.

SELECT count(*) from claim.response;
   count
-----------
 225665512
(1 row)

Time: 685064.637 ms


The partitioning is on the create_timestamp field.
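
For reference, the monthly partitions are set up in the usual inheritance style, roughly like this (a sketch - the real DDL has more to it):

CREATE TABLE claim_response.response_201408 (
    CHECK ( create_timestamp >= DATE '2014-08-01'
        AND create_timestamp <  DATE '2014-09-01' )
) INHERITS (claim.response);
-- with constraint_exclusion = partition so the planner can skip
-- partitions whose CHECK constraint rules them out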

The server is Red Hat Enterprise Linux Server release 6.2 (Santiago) on a VM - 8 GB RAM with 2 CPUs:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
CPU socket(s):         1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Stepping:              2
CPU MHz:               2660.000
BogoMIPS:              5320.00
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,1



2 users,  load average: 0.00, 0.12, 0.37


Please see the following for the explain analysis: http://explain.depesz.com/s/I3SL

I'm trying to understand why I'm getting the yellow, orange, and red on the inclusive times, and the yellow on the exclusive times on that page.
I'm relatively new to PostgreSQL, but I've been an Oracle DBA for some time. I suspect the I/O may be dragging, but I don't know how to dig that information out from here. Please point out anything else you can decipher from this.
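
Would turning on I/O timing and re-running the explain be the right way to dig that out? Something like this (assuming the version and privileges allow it):

SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT response.id
  FROM claim.response
 WHERE response.account_id = 4766
   AND response.expire_timestamp IS NULL
   AND response.create_timestamp >= DATE '2014-08-01'
 ORDER BY create_timestamp;
-- the plan nodes should then include "I/O Timings" lines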

Thanks,

John
Marco Di Cesare | 20 Oct 22:32 2014

Query with large number of joins

We are using Postgres for the first time after being SQL Server users for a long time, so forgive us for being noobs.

 

We are using a BI tool that generates a query with an unusually large number of joins. My understanding is that with this many joins the Postgres query planner can't possibly use an exhaustive search, so it drops into a heuristic algorithm. Unfortunately, the query runs quite slowly (~35 seconds) and seems to ignore primary keys and indexes where available.

 

Query plan here (sorry had to anonymize):

http://explain.depesz.com/s/Uml

 

Line 30 is one of the pain points where a full table scan is running on 4.2 million rows even though there are indexes on oscar_bravo.foxtrot_four and oscar_charlie.foxtrot_four

 

We've tried to play around with the join_collapse_limit value by upping it from the default of 8 to 10 or 12, but it doesn't seem to help much. Cranking the value up to an unreasonable value of 20 does shave some seconds off the query time, but not substantially (explain plan with the value set to 20: http://explain.depesz.com/s/sW6).

 

We haven't tried playing around with the geqo_threshold at this point.
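
For the record, the experiments so far have just been session-level settings along these lines (a sketch - the geqo_threshold line is something we would try next, not something we've tested):

SET join_collapse_limit = 20;   -- tried 10, 12 and 20 so far
SET geqo_threshold = 25;        -- keep it above the number of relations so the genetic optimizer stays out of the picture
EXPLAIN (ANALYZE, BUFFERS) <the BI-generated query>;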

 

Any thoughts on ways to speed up the run time of this query or any other Postgres settings we should be aware of when dealing with this unusually large number of joins?

 

Thanks in advance

 

 

 

Marco Di Cesare

 

Laurent Martelli | 19 Oct 06:10 2014

IS NOT NULL and LEFT JOIN

Hello there,

I have a strange query plan involving an IS NOT NULL and a LEFT JOIN.

I grant you that the query can be written without the JOIN on user_user_info,
but it is generated like this by Hibernate. Just moving the IS NOT NULL condition
to the other side of the useless JOIN makes a big difference in the query plan:

-- THE BAD ONE : given the selectivity on c.name and c.email, barely more than one row will ever be returned
explain analyze select c.*
       from contact_contact c
       left outer join user_user_info u on c.user_info=u.id
       left outer join contact_address a on c.address=a.id
      where lower(c.name)='martelli'
        and c.email='dsfze <at> ezrfz.com' or u.id is not null;
                                                           QUERY PLAN                                                          
--------------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1.83..2246.76 rows=59412 width=4012) (actual time=53.645..53.645 rows=0 loops=1)
   Hash Cond: (c.user_info = u.id)
   Filter: (((lower((c.name)::text) = 'martelli'::text) AND ((c.email)::text = 'dsfze <at> ezrfz.com'::text)) OR (u.id IS NOT NULL))
   Rows Removed by Filter: 58247
   ->  Seq Scan on contact_contact c  (cost=0.00..2022.12 rows=59412 width=4012) (actual time=0.007..6.892 rows=58247 loops=1)
   ->  Hash  (cost=1.37..1.37 rows=37 width=8) (actual time=0.029..0.029 rows=37 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 2kB
         ->  Seq Scan on user_user_info u  (cost=0.00..1.37 rows=37 width=8) (actual time=0.004..0.015 rows=37 loops=1)
 Planning time: 0.790 ms
 Execution time: 53.712 ms

-- THE GOOD ONE (test IS NOT NULL on contact0_.user_info instead of userinfo1_.id)
explain analyze select c.*         
       from contact_contact c        
       left outer join user_user_info u on c.user_info=u.id          
       left outer join contact_address a on c.address=a.id         
      where lower(c.name)='martelli'
        and c.email='dsfze <at> ezrfz.com' or c.user_info is not null;
                                                                 QUERY PLAN                                                                
--------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on contact_contact c  (cost=8.60..16.41 rows=1 width=4012) (actual time=0.037..0.037 rows=0 loops=1)
   Recheck Cond: (((email)::text = 'dsfze <at> ezrfz.com'::text) OR (user_info IS NOT NULL))
   Filter: (((lower((name)::text) = 'martelli'::text) AND ((email)::text = 'dsfze <at> ezrfz.com'::text)) OR (user_info IS NOT NULL))
   ->  BitmapOr  (cost=8.60..8.60 rows=2 width=0) (actual time=0.034..0.034 rows=0 loops=1)
         ->  Bitmap Index Scan on idx_contact_email  (cost=0.00..4.30 rows=2 width=0) (actual time=0.027..0.027 rows=0 loops=1)
               Index Cond: ((email)::text = 'dsfze <at> ezrfz.com'::text)
         ->  Bitmap Index Scan on contact_contact_user_info_idx  (cost=0.00..4.30 rows=1 width=0) (actual time=0.007..0.007 rows=0 loops=1)
               Index Cond: (user_info IS NOT NULL)
 Planning time: 0.602 ms
 Execution time: 0.118 ms
My tables are as follows, and I use Postgres 9.4:

Table "public.contact_contact"
        Column          |            Type             | Modifiers | Storage
------------------------+-----------------------------+-----------+----------
 id                     | bigint                      | not null  | plain
 archived               | boolean                     |           | plain
 version                | integer                     |           | plain
 created_on             | timestamp without time zone |           | plain
 updated_on             | timestamp without time zone |           | plain
 actor_ref              | character varying(255)      |           | extended
 addressl1              | character varying(255)      |           | extended
 comment                | text                        |           | extended
 contact_partner_ok     | boolean                     |           | plain
 date_of_birth          | date                        |           | plain
 email                  | character varying(255)      |           | extended
 email_pro              | character varying(255)      |           | extended
 fax                    | character varying(255)      |           | extended
 first_name             | character varying(255)      |           | extended
 fixed_phone1           | character varying(255)      |           | extended
 fixed_phone2           | character varying(255)      |           | extended
 fixed_phone_pro        | character varying(255)      |           | extended
 import_key1            | character varying(255)      |           | extended
 import_key2            | character varying(255)      |           | extended
 koala_id               | character varying(255)      |           | extended
 mobile_phone_perso     | character varying(255)      |           | extended
 mobile_phone_pro       | character varying(255)      |           | extended
 name                   | character varying(255)      | not null  | extended
 ola_email              | character varying(255)      |           | extended
 ola_phone              | character varying(255)      |           | extended
 person_category_select | character varying(255)      |           | extended
 web_site               | character varying(255)      |           | extended
 year_of_birth          | integer                     |           | plain
 created_by             | bigint                      |           | plain
 updated_by             | bigint                      |           | plain
 action_event_source    | bigint                      |           | plain
 address                | bigint                      |           | plain
 address_pro            | bigint                      |           | plain
 jobtitle               | bigint                      |           | plain
 merged_with            | bigint                      |           | plain
 nationality_country    | bigint                      |           | plain
 origin                 | bigint                      |           | plain
 place_of_birth_address | bigint                      |           | plain
 title                  | bigint                      |           | plain
 user_info              | bigint                      |           | plain
 import_origin          | character varying(255)      |           | extended
 duplicates             | bigint                      |           | plain
Indexes:
    "contact_contact_pkey" PRIMARY KEY, btree (id)
    "uk_bx19539x7h0y0w4p4uw9gnqbo" UNIQUE CONSTRAINT, btree (koala_id)
    "uk_vg25de8jcu18m89o9dy2n4fe" UNIQUE CONSTRAINT, btree (import_key1)
    "contact_contact_action_event_source_idx" btree (action_event_source)
    "contact_contact_address_idx" btree (address)
    "contact_contact_address_l1_idx" btree (addressl1)
    "contact_contact_address_pro_idx" btree (address_pro)
    "contact_contact_jobtitle_idx" btree (jobtitle)
    "contact_contact_merged_with_idx" btree (merged_with)
    "contact_contact_name_idx" btree (name)
    "contact_contact_nationality_country_idx" btree (nationality_country)
    "contact_contact_origin_idx" btree (origin)
    "contact_contact_place_of_birth_address_idx" btree (place_of_birth_address)
    "contact_contact_title_idx" btree (title)
    "contact_contact_user_info_idx" btree (user_info)
    "idx_contact_email" btree (email)
    "idx_contact_lower_name" btree (lower(name::text))
    "idx_contact_search_name" btree (lower(name::text), lower(first_name::text))
Foreign-key constraints:
    "fk_8dj7rw3jrdxk4vxbi6vony0ne" FOREIGN KEY (created_by) REFERENCES auth_user(id)
    "fk_9s1dhwrvw6lq74fvty6oj2wc5" FOREIGN KEY (address_pro) REFERENCES contact_address(id)
    "fk_9wjsgh8lt5ixbshx9pjwmjtk1" FOREIGN KEY (origin) REFERENCES crm_origin(id)
    "fk_ad53x8tdando1w1jdlyxcop9v" FOREIGN KEY (duplicates) REFERENCES contact_contact(id)
    "fk_edusucr1gdfj99vtm0a70gggg" FOREIGN KEY (title) REFERENCES contact_title(id)
    "fk_g7u75rjd754m7evn2alckjvka" FOREIGN KEY (merged_with) REFERENCES contact_contact(id)
    "fk_j72hkuq0337v6utjbf85hhvxg" FOREIGN KEY (action_event_source) REFERENCES crm_action_event_source(id)
    "fk_k73mcu7swia6uf6qpp4v6lwxf" FOREIGN KEY (updated_by) REFERENCES auth_user(id)
    "fk_mvpl7wudcdqgitmmsd900od97" FOREIGN KEY (place_of_birth_address) REFERENCES contact_address(id)
    "fk_onriw4jpgeuvhfk827amxry8k" FOREIGN KEY (address) REFERENCES contact_address(id)
    "fk_rpkvno8705gap9ejj4wnnb7hl" FOREIGN KEY (nationality_country) REFERENCES territory_country(id)
    "fk_s9fsy33u5a9ke8wee9mc2vpsx" FOREIGN KEY (user_info) REFERENCES user_user_info(id)
    "fk_t8uexb8lmgaftjsnn63eoty90" FOREIGN KEY (jobtitle) REFERENCES contact_jobtitle(id)

coopener=# \d+ user_user_info
Table "public.user_user_info"
     Column      |            Type             | Modifiers | Storage
-----------------+-----------------------------+-----------+----------
 id              | bigint                      | not null  | plain
 archived        | boolean                     |           | plain
 version         | integer                     |           | plain
 created_on      | timestamp without time zone |           | plain
 updated_on      | timestamp without time zone |           | plain
 full_name       | character varying(255)      |           | extended
 import_key      | character varying(255)      |           | extended
 import_username | character varying(255)      |           | extended
 today           | timestamp without time zone |           | plain
 user_system_ok  | boolean                     |           | plain
 created_by      | bigint                      |           | plain
 updated_by      | bigint                      |           | plain
 active_company  | bigint                      |           | plain
 agency          | bigint                      |           | plain
 internal_user   | bigint                      | not null  | plain
Indexes:
    "user_user_info_pkey" PRIMARY KEY, btree (id)
    "uk_99o17944ddytysui6b05lxyb2" UNIQUE CONSTRAINT, btree (import_key)
    "uk_cqgrw75h35ts19uixn03rkjsu" UNIQUE CONSTRAINT, btree (internal_user)
    "uk_jtsvu4r7s12nnh9o2sloqyqv4" UNIQUE CONSTRAINT, btree (import_username)
    "user_user_info_active_company_idx" btree (active_company)
    "user_user_info_agency_idx" btree (agency)
    "user_user_info_full_name_idx" btree (full_name)
Foreign-key constraints:
    "fk_cojxp4r7d8n2l135gy4xa4vak" FOREIGN KEY (active_company) REFERENCES contact_company(id)
    "fk_cqgrw75h35ts19uixn03rkjsu" FOREIGN KEY (internal_user) REFERENCES auth_user(id)
    "fk_k3riohsx7jrhxkxdmxyeqflq1" FOREIGN KEY (updated_by) REFERENCES auth_user(id)
    "fk_r3e16hs6puibteaby3rk42yg0" FOREIGN KEY (created_by) REFERENCES auth_user(id)
    "fk_t389sdkhi9owy0xbhec2nqp5w" FOREIGN KEY (agency) REFERENCES contact_agency(id)

coopener=# \d+ contact_address
Table "public.contact_address"
        Column        |            Type             | Modifiers | Storage
----------------------+-----------------------------+-----------+----------
 id                   | bigint                      | not null  | plain
 archived             | boolean                     |           | plain
 version              | integer                     |           | plain
 created_on           | timestamp without time zone |           | plain
 updated_on           | timestamp without time zone |           | plain
 addressl2            | character varying(255)      |           | extended
 addressl3            | character varying(255)      |           | extended
 addressl4            | character varying(255)      |           | extended
 addressl5            | character varying(255)      |           | extended
 addressl6            | character varying(255)      |           | extended
 certified_ok         | boolean                     |           | plain
 consumption_place_ok | boolean                     |           | plain
 full_name            | character varying(255)      |           | extended
 insee_code           | character varying(255)      |           | extended
 koala_id             | character varying(255)      |           | extended
 created_by           | bigint                      |           | plain
 updated_by           | bigint                      |           | plain
 addressl7country     | bigint                      |           | plain
 commune              | bigint                      |           | plain
Indexes:
    "contact_address_pkey" PRIMARY KEY, btree (id)
    "contact_address_address_l7_country_idx" btree (addressl7country)
    "contact_address_commune_idx" btree (commune)
    "contact_address_full_name_idx" btree (full_name)
Foreign-key constraints:
    "fk_4yx7nnewflhyjdm5tue5qntbg" FOREIGN KEY (commune) REFERENCES territory_commune(id)
    "fk_5lwaygtve0ol8ma53picsdef" FOREIGN KEY (addressl7country) REFERENCES territory_country(id)
    "fk_p9svu5ssynimpuu0is3j396lt" FOREIGN KEY (updated_by) REFERENCES auth_user(id)
    "fk_rm0lcgnys2n97ad62jkm53qlt" FOREIGN KEY (created_by) REFERENCES auth_user(id)

Regards,
Laurent
Josh Berkus | 16 Oct 22:06 2014

9.4 performance improvements test

All,

Thought I'd share some pgbench runs I did on two servers, one running
9.3 and one running 9.4.

A small (512MB) pgbench test didn't show much difference between the two:

9.3.5:

transaction type: TPC-B (sort of)
scaling factor: 200
query mode: simple
number of clients: 16
number of threads: 4
duration: 600 s
number of transactions actually processed: 7686217
latency average: 1.249 ms
tps = 12810.135226 (including connections establishing)
tps = 12810.277332 (excluding connections establishing)
statement latencies in milliseconds:
        0.001833        \set nbranches 1 * :scale
        0.000513        \set ntellers 10 * :scale
        0.000447        \set naccounts 100000 * :scale
        0.000597        \setrandom aid 1 :naccounts
        0.000585        \setrandom bid 1 :nbranches
        0.000506        \setrandom tid 1 :ntellers
        0.000507        \setrandom delta -5000 5000
        0.053684        BEGIN;
        0.161115        UPDATE pgbench_accounts SET abalance = abalance
+ :delta WHERE aid = :aid;
        0.143763        SELECT abalance FROM pgbench_accounts WHERE aid
= :aid;
        0.168801        UPDATE pgbench_tellers SET tbalance = tbalance +
:delta WHERE tid = :tid;
        0.183900        UPDATE pgbench_branches SET bbalance = bbalance
+ :delta WHERE bid = :bid;
        0.137570        INSERT INTO pgbench_history (tid, bid, aid,
delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
        0.389587        END;

9.4b3:

transaction type: TPC-B (sort of)
scaling factor: 200
query mode: simple
number of clients: 16
number of threads: 4
duration: 600 s
number of transactions actually processed: 7822118
latency average: 1.227 ms
tps = 13036.312006 (including connections establishing)
tps = 13036.498067 (excluding connections establishing)
statement latencies in milliseconds:
        0.001817        \set nbranches 1 * :scale
        0.000506        \set ntellers 10 * :scale
        0.000439        \set naccounts 100000 * :scale
        0.000587        \setrandom aid 1 :naccounts
        0.000497        \setrandom bid 1 :nbranches
        0.000487        \setrandom tid 1 :ntellers
        0.000506        \setrandom delta -5000 5000
        0.053509        BEGIN;
        0.160929        UPDATE pgbench_accounts SET abalance = abalance
+ :delta WHERE aid = :aid;
        0.145014        SELECT abalance FROM pgbench_accounts WHERE aid
= :aid;
        0.169506        UPDATE pgbench_tellers SET tbalance = tbalance +
:delta WHERE tid = :tid;
        0.188648        UPDATE pgbench_branches SET bbalance = bbalance
+ :delta WHERE bid = :bid;
        0.141014        INSERT INTO pgbench_history (tid, bid, aid,
delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
        0.358340        END;

However, on a big disk-bound database, 9.4 showed about 20% better throughput.
The database in this case is around 200GB, on a server with 128GB RAM:

9.3.5:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10000
query mode: simple
number of clients: 16
number of threads: 4
duration: 3600 s
number of transactions actually processed: 1944320
latency average: 29.625 ms
tps = 539.675140 (including connections establishing)
tps = 539.677426 (excluding connections establishing)

9.4b3:

starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10000
query mode: simple
number of clients: 16
number of threads: 4
duration: 3600 s
number of transactions actually processed: 2422502
latency average: 23.777 ms
tps = 672.816239 (including connections establishing)
tps = 672.821433 (excluding connections establishing)

I suspect this is due to the improvements in writing less to WAL.  If
so, good work, guys!

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Emi Lu | 16 Oct 20:30 2014

CopyManager(In/out) vs. delete/insert directly

Hello,

Two options for moving data (>1M rows); may I know which one is better, please?

(1) copyOut (JDBC CopyManager)
      t1 into a.csv
      delete t2 where pk.cols in t1
      copyIn t2 from a.csv

(2) setAutoCommit(false);
      delete t2 where pk.cols in t1;
      insert t2 select * from t1;
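
In case it helps, option (2) in plain SQL is essentially this (a sketch, assuming a single-column primary key named pk):

begin;
delete from t2 where pk in (select pk from t1);
insert into t2 select * from t1;
commit;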

Thank you
Emi



Partitioned tables and SELECT ... ORDER BY ... LIMIT

Hi,

Let's imagine that we have a table partitioned by a timestamp field, and we query it with a SELECT ordered by that field (DESC, for example) with some modest LIMIT.
Let's further say that the required number of rows is found in the first partition the query encounters (say, the latest one).
I am just wondering: why does PostgreSQL nevertheless read a couple of buffers from each of the older partitions?
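
To be concrete, I mean something like this (table and column names are made up):

SELECT *
  FROM events            -- parent table, children partitioned by event_time
 ORDER BY event_time DESC
 LIMIT 50;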

Best regards,
Dmitriy Shalashov
Josh Berkus | 16 Oct 00:25 2014

Re: Partitions and work_mem?

On 10/15/2014 01:19 PM, Dave Johansen wrote:
> Sorry I don't understand what you mean by that. My understanding is that
> RedHat maintains fixes for security and other major issues for packages
> that have been EOLed. Are you implying that that's not the case? Or
> something else?

RH probably backpatches our fixes as they come out.  They did in the
past, anyway.

I just had the impression from your original post that this was a new
system; if so, it would make sense to build it on a version of Postgres
which wasn't already EOL.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Josh Berkus | 15 Oct 19:20 2014

Re: Yet another abort-early plan disaster on 9.3

On 10/10/2014 04:16 AM, Greg Stark wrote:
> On Thu, Oct 2, 2014 at 8:56 PM, Josh Berkus <josh <at> agliodbs.com> wrote:
>> Yes, it's only intractable if you're wedded to the idea of a tiny,
>> fixed-size sample.  If we're allowed to sample, say, 1% of the table, we
>> can get a MUCH more accurate n_distinct estimate using multiple
>> algorithms, of which HLL is one.  While n_distinct will still have some
>> variance, it'll be over a much smaller range.
> 
> I've gone looking for papers on this topic but from what I read this
> isn't so. To get any noticeable improvement you need to read 10-50% of
> the table and that's effectively the same as reading the entire table
> -- and it still had pretty poor results. All the research I could find
> went into how to analyze the whole table while using a reasonable
> amount of scratch space and how to do it incrementally.

So, right now our estimation is off on large tables by -10X to -10000X.
 First, the fact that it's *always* low is an indication we're using the
wrong algorithm.  Second, we can most certainly do better than a median
of -1000X.
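
For anyone who wants to see this on their own tables, a quick way to compare the planner's estimate with reality (sketch with placeholder names):

SELECT n_distinct FROM pg_stats
 WHERE schemaname = 'public' AND tablename = 'mytable' AND attname = 'mycol';
-- negative n_distinct values are a fraction of the row count, not an absolute number

SELECT count(DISTINCT mycol) FROM mytable;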

One interesting set of algorithms is block-based sampling.  That is, you
read 5% of the physical table in random blocks, reading every row in the
block.  The block size is determined by your storage block size, so
you're not actually reading any more physically than you are logically;
it really is just 5% of the table, especially on SSD.

Then you apply algorithms which first estimate the correlation of common
values in the block (i.e. how likely is it that the table is completely
sorted?), and then estimates of how many values there might be total
based on the correlation estimate.

I no longer have my ACM membership, so I can't link this, but
researchers were able to get +/- 3X accuracy for a TPCH workload using
this approach.  A real database would be more variable, of course, but
even so we should be able to achieve +/- 50X, which would be an order of
magnitude better than we're doing now.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


