Josh Berkus | 18 Sep 19:54 2014

Re: postgres 9.3 vs. 9.4

On 09/18/2014 08:09 AM, Mkrtchyan, Tigran wrote:
>>> 9.4beta2:
>>> ...
>>>         0.957854        END;
>> Looks like IO.
> Postgres internal IO? Maybe. We get 600MB/s on these SSDs.

While it's possible that this is a Postgres issue, my first thought is
that the two SSDs are not actually identical.  The 9.4 one may either
have a fault, or may be mostly full and heavily fragmented.  Or the Dell
PCIe card may have an issue.

You are using "scale 1", which is a < 1MB database, with one client and one
thread; that's an interesting test, though not one I would necessarily have
run myself.  I'll throw the same test on one of my machines and see how it does.
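For anyone reproducing this: the test described (scale 1, one client, one thread) corresponds to a pgbench invocation roughly like the following. The duration flag and database name are assumptions; the original post's exact flags aren't shown here.

```
pgbench -i -s 1 bench           # initialize a scale-1 pgbench database
pgbench -c 1 -j 1 -T 60 bench   # one client, one thread, 60-second run
```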

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Mkrtchyan, Tigran | 18 Sep 11:58 2014

postgres 9.3 vs. 9.4


Hi Folks,

I am trying to investigate some performance issues we are having with postgres
(a different topic by itself) and tried postgres 9.4beta2, hoping that it
would perform better.

It turned out that 9.4 is 2x slower than 9.3.5 on the same hardware.

Some technical details:

  Host: rhel 6.5 2.6.32-431.23.3.el6.x86_64
  256 GB RAM, 40 cores, Intel(R) Xeon(R) CPU E5-2660 v2  <at>  2.20GHz
  2x160GB  PCIe SSD DELL_P320h-MTFDGAL175SAH (9.3 on one, 9.4 on the other)

postgres tweaks:

default_statistics_target = 100
wal_writer_delay = 10s
vacuum_cost_delay = 50
synchronous_commit = off
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
effective_cache_size = 94GB
work_mem = 402MB
wal_buffers = 16MB
checkpoint_segments = 64
shared_buffers = 8GB
max_connections = 100
random_page_cost = 1.5

Alexander Hill | 17 Sep 06:56 2014

Aggregating tsqueries

Hello,

I have a table of tree nodes with a tsquery column. To get a subtree's tsquery, I need to OR all of its nodes' tsqueries together.

I defined a custom aggregate using tsquery_or:

    CREATE AGGREGATE tsquery_or_agg (tsquery)
    (
        sfunc = tsquery_or,
        stype = tsquery
    );

but I've found that

    tsquery_or_agg(query)

is about a hundred times slower than this:

    ('(' || string_agg(query::text, ')|(') || ')')::tsquery

That works perfectly, so I'm happy to keep doing it, but I'm curious why the difference is so great and whether anything can be done about it.
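For comparison, the text-based workaround can itself be wrapped in an aggregate so call sites stay clean. A sketch, untested and with hypothetical names: it keeps the running state as text and casts to tsquery only once, in the final function.

```sql
-- Accumulate the OR'd queries as text; no tsquery tree is built per step.
CREATE FUNCTION tsquery_or_accum(text, tsquery) RETURNS text
    LANGUAGE sql IMMUTABLE AS $$
        SELECT CASE WHEN $1 IS NULL THEN '(' || $2::text || ')'
                    ELSE $1 || '|(' || $2::text || ')' END
$$;

-- Cast to tsquery exactly once, at the end of the aggregation.
CREATE FUNCTION tsquery_or_final(text) RETURNS tsquery
    LANGUAGE sql IMMUTABLE AS $$ SELECT $1::tsquery $$;

CREATE AGGREGATE tsquery_or_agg_fast (tsquery)
(
    sfunc     = tsquery_or_accum,
    stype     = text,
    finalfunc = tsquery_or_final
);
```

The slowness of the tsquery_or version is plausibly because each transition step builds and normalizes a whole new tsquery tree, making the aggregate roughly O(n²) in total query size, whereas text concatenation is cheap per step.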

Cheers,
Alex
Josh Berkus | 18 Sep 02:11 2014

Yet another abort-early plan disaster on 9.3

Folks,

Just encountered another case of critical fail for abort-early query
plans.  In this case, it will completely prevent a user from upgrading
to 9.3; this is their most common query, and on 9.3 it takes 1000X longer.

Maybe we should think about removing abort-early plans from 9.5?
Clearly we don't understand them well enough for them to work for users.

Query:

SELECT "categories".* FROM "categories" WHERE "categories"."user_id" IN
( SELECT to_user_id FROM "tags" WHERE "tags"."from_user_id" = 53529975 )
ORDER BY recorded_on DESC LIMIT 20;

Here's the plan from 9.1:

 Limit  (cost=1613.10..1613.15 rows=20 width=194) (actual
time=0.503..0.509 rows=20 loops=1)
   ->  Sort  (cost=1613.10..1736.14 rows=49215 width=194) (actual
time=0.502..0.505 rows=20 loops=1)
         Sort Key: categories.recorded_on
         Sort Method: top-N heapsort  Memory: 30kB
         ->  Nested Loop  (cost=248.80..303.51 rows=49215 width=194)
(actual time=0.069..0.347 rows=81 loops=1)
               ->  HashAggregate  (cost=248.80..248.81 rows=1 width=4)
(actual time=0.050..0.054 rows=8 loops=1)
                     ->  Index Scan using unique_index_tags on tags
(cost=0.00..248.54 rows=103 width=4) (actual time=0.020..0.033 rows=8
loops=1)
                           Index Cond: (from_user_id = 53529975)
               ->  Index Scan using index_categories_on_user_id on
categories  (cost=0.00..54.34 rows=29 width=194) (actual
time=0.010..0.028 rows=10 loops=8)
                     Index Cond: (user_id = tags.to_user_id)
 Total runtime: 0.641 ms

And from 9.3:

 Limit  (cost=1.00..2641.10 rows=20 width=202) (actual
time=9.933..711.372 rows=20 loops=1)
   ->  Nested Loop Semi Join  (cost=1.00..9641758.39 rows=73041
width=202) (actual time=9.931..711.361 rows=20 loops=1)
         ->  Index Scan Backward using index_categories_on_recorded_on
on categories  (cost=0.43..406943.98 rows=4199200 width=202) (actual
time=0.018..275.020 rows=170995 loops=1)
         ->  Index Scan using unique_index_tags on tags
(cost=0.57..2.20 rows=1 width=4) (actual time=0.002..0.002 rows=0
loops=170995)
               Index Cond: ((from_user_id = 53529975) AND (to_user_id =
categories.user_id))
 Total runtime: 711.457 ms

So, here's what's happening here:

As usual, PostgreSQL is dramatically undercounting n_distinct: it shows
chapters.user_id at 146,000 and the ratio of to_user_id:from_user_id as
being 1:105 (as opposed to 1:6, which is about the real ratio).  This
means that PostgreSQL thinks it can find the 20 rows within the first 2%
of the index ... whereas it actually needs to scan 50% of the index to
find them.
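The arithmetic behind that mismatch can be sketched quickly (row counts taken from the 9.3 plan above; the model is the usual LIMIT-scaling assumption that matches are uniformly interleaved through the index):

```python
# Back-of-envelope model of the abort-early mismatch, using row counts
# from the 9.3 plan above. The planner assumes matching rows are spread
# uniformly through the index, so to collect `limit` matches it expects
# to read about limit / selectivity index entries.
def expected_scan(total_rows, selectivity, limit):
    """Index entries the planner expects to read before finding `limit` matches."""
    return min(total_rows, limit / selectivity)

total_rows = 4_199_200   # rows in categories (from the plan)
est_matches = 73_041     # planner's estimated matching rows (from the plan)
limit = 20

planned = expected_scan(total_rows, est_matches / total_rows, limit)
actual = 170_995         # rows the backward index scan actually read

print(f"planner expected ~{planned:.0f} rows scanned; executor read {actual}")
print(f"that is about {actual / planned:.0f}x more work than was costed")
```

When the selectivity estimate is off, the scanned-row count scales up by the same factor, which is why the failure mode is multiplicative rather than gradual.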

Removing LIMIT causes 9.3 to revert to the "good" plan, as expected.

This is the core issue with abort-early plans; they depend on our
statistics being extremely accurate, which we know they are not. And if
they're wrong, the execution time climbs by 1000X or more.  Abort-early
plans are inherently riskier than other types of query plans.

What I'm not clear on is why upgrading from 9.1 to 9.3 would bring about
this change.  The stats are no more than 10% different across the
version change.
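If the undercounted n_distinct really is the trigger, one stopgap available since 9.0 is to pin the estimate manually instead of relying on sampling (the value below is illustrative, not a measured figure):

```sql
-- Override the sampled n_distinct for the column; a negative value means
-- "this fraction of the table" and scales as the table grows.
-- Takes effect at the next ANALYZE.
ALTER TABLE categories ALTER COLUMN user_id SET (n_distinct = -0.15);
ANALYZE categories;
```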

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Huang, Suya | 17 Sep 02:21 2014

How to interpret view pg_stat_bgwriter

Hi,

 

Sorry for sending this email twice, but it seems to fit the performance group better than the admin group…

 

I was reading an article by Greg Smith, http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm, and tried to do some analysis on our database.

 

postgres=#  select * from pg_stat_bgwriter;

-[ RECORD 1 ]------+------------

checkpoints_timed  | 42435

checkpoints_req    | 629448

buffers_checkpoint | 1821978480

buffers_clean      | 117710078

maxwritten_clean   | 23796

buffers_backend    | 1284631340

buffers_alloc      | 32829025268

 

postgres=# show checkpoint_segments ;

-[ RECORD 1 ]-------+----

checkpoint_segments | 128

 

 

postgres=# show checkpoint_timeout ;

-[ RECORD 1 ]------+------

checkpoint_timeout | 10min

 

postgres=# show bgwriter_delay;

-[ RECORD 1 ]--+------

bgwriter_delay | 100ms

 

postgres=# show bgwriter_lru_maxpages;

-[ RECORD 1 ]---------+-----

bgwriter_lru_maxpages | 1000

 

postgres=# show bgwriter_lru_multiplier;

-[ RECORD 1 ]-----------+--

bgwriter_lru_multiplier | 5

 

Based on one snapshot, below are my thoughts after reading the example Greg used. They might be completely wrong, as I'm just starting to learn the checkpoint mechanism in PG. If anything is missing or wrong, I'd appreciate your help in pointing it out.

 

# checkpoints_req is much bigger than checkpoints_timed, which suggests I should increase checkpoint_segments on our system

# maxwritten_clean is high, which suggests increasing bgwriter_lru_maxpages

# buffers_backend is much smaller than buffers_alloc, which suggests increasing bgwriter_lru_maxpages and bgwriter_lru_multiplier, and decreasing bgwriter_delay.
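As a quick sanity check, the ratios behind those observations can be computed directly from the snapshot (a sketch; values copied from the pg_stat_bgwriter output above, interpretation per Greg's article):

```python
# Quick ratio math on the pg_stat_bgwriter snapshot above.
checkpoints_timed  = 42_435
checkpoints_req    = 629_448
buffers_checkpoint = 1_821_978_480
buffers_clean      = 117_710_078
buffers_backend    = 1_284_631_340

# Share of checkpoints forced by xlog volume rather than the timeout.
pct_req = checkpoints_req / (checkpoints_req + checkpoints_timed)

# Who actually wrote the dirty buffers out.
total_written = buffers_checkpoint + buffers_clean + buffers_backend
pct_checkpoint = buffers_checkpoint / total_written
pct_backend = buffers_backend / total_written

print(f"{pct_req:.0%} of checkpoints were requested (xlog-driven), not timed")
print(f"writes: {pct_checkpoint:.0%} by checkpoints, {pct_backend:.0%} by backends")
```

A requested-checkpoint share this high is the classic sign that checkpoint_segments is too low for the write volume, which matches the first observation above.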

 

 

Thanks,

Suya

 

Van Der Berg, Stefan | 15 Sep 11:38 2014

Strange performance problem with query

Hi All,

Please see the output from the following query analysis :
=# explain analyze select count(1) from jbpmprocess.jbpm_taskinstance ti 
join jbpmprocess.jbpm_task task on (ti.task_ = task.id_ ) join 
jbpmprocess.jbpm_processinstance pi on ti.procinst_ = pi.id_ where 
ti.isopen_ = true;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=47372.04..47372.05 rows=1 width=0) (actual 
time=647.070..647.071 rows=1 loops=1)
    ->  Hash Join  (cost=44806.99..47336.72 rows=14127 width=0) (actual 
time=605.077..645.410 rows=20359 loops=1)
          Hash Cond: (ti.task_ = task.id_)
          ->  Hash Join  (cost=44779.80..47115.28 rows=14127 width=8) 
(actual time=604.874..640.541 rows=20359 loops=1)
                Hash Cond: (ti.procinst_ = pi.id_)
                ->  Index Scan using idx_task_instance_isopen on 
jbpm_taskinstance ti  (cost=0.00..1995.84 rows=22672 width=16) (actual 
time=0.011..16.606 rows=20359 loops=1)
                      Index Cond: (isopen_ = true)
                      Filter: isopen_
                ->  Hash  (cost=28274.91..28274.91 rows=1320391 width=8) 
(actual time=604.601..604.601 rows=1320391 loops=1)
                      Buckets: 262144  Batches: 1  Memory Usage: 51578kB
                      ->  Seq Scan on jbpm_processinstance pi 
(cost=0.00..28274.91 rows=1320391 width=8) (actual time=0.004..192.166 
rows=1320391 loops=1)
          ->  Hash  (cost=18.75..18.75 rows=675 width=8) (actual 
time=0.196..0.196 rows=675 loops=1)
                Buckets: 1024  Batches: 1  Memory Usage: 27kB
                ->  Seq Scan on jbpm_task task  (cost=0.00..18.75 
rows=675 width=8) (actual time=0.003..0.106 rows=675 loops=1)
  Total runtime: 652.266 ms
(15 rows)

I'm not sure why the planner insists on doing the sequential scan on
jbpm_processinstance even though the 22672 rows from jbpm_taskinstance
it has to match against are only 1% of the number of rows in
jbpm_processinstance. So far I think it is because the values in
procinst_ of jbpm_taskinstance are not entirely unique.

The very strange thing, though, is the way the query plan changes if I
repeat the where clause:

explain analyze select count(1) from jbpmprocess.jbpm_taskinstance ti 
join jbpmprocess.jbpm_task task on (ti.task_ = task.id_ ) join 
jbpmprocess.jbpm_processinstance pi on ti.procinst_ = pi.id_ where 
ti.isopen_ = true and ti.isopen_ = true;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=2074.61..2074.62 rows=1 width=0) (actual 
time=80.126..80.126 rows=1 loops=1)
    ->  Hash Join  (cost=27.19..2074.24 rows=151 width=0) (actual 
time=0.217..77.959 rows=20359 loops=1)
          Hash Cond: (ti.task_ = task.id_)
          ->  Nested Loop  (cost=0.00..2044.97 rows=151 width=8) (actual 
time=0.016..71.429 rows=20359 loops=1)
                ->  Index Scan using idx_task_instance_isopen on 
jbpm_taskinstance ti  (cost=0.00..29.72 rows=243 width=16) (actual 
time=0.012..16.928 rows=20359 loops=1)
                      Index Cond: ((isopen_ = true) AND (isopen_ = true))
                      Filter: (isopen_ AND isopen_)
                ->  Index Scan using jbpm_processinstance_pkey on 
jbpm_processinstance pi  (cost=0.00..8.28 rows=1 width=8) (actual 
time=0.002..0.002 rows=1 loops=20359)
                      Index Cond: (id_ = ti.procinst_)
          ->  Hash  (cost=18.75..18.75 rows=675 width=8) (actual 
time=0.196..0.196 rows=675 loops=1)
                Buckets: 1024  Batches: 1  Memory Usage: 27kB
                ->  Seq Scan on jbpm_task task  (cost=0.00..18.75 
rows=675 width=8) (actual time=0.002..0.107 rows=675 loops=1)
  Total runtime: 80.170 ms

I get a similar plan for the original query if I set enable_seqscan to
off, and I much prefer the second result.
My questions are:
1. Why is this happening?
2. How can I encourage the behavior of the second query without changing
the original query? Is there some column-level setting I can set?

(BTW the tables are analyzed, and I currently have no special 
settings/attributes set for any of the tables.)

-- 
Kind Regards
Stefan

Cell : 072-380-1479
Desk : 087-577-7241


Huang, Suya | 12 Sep 04:26 2014

weird execution plan

Hi,

 

Can someone figure out why the first query runs so slowly compared to the second one? They generate the same result…

 

dev=# explain analyze select count(distinct wid) from terms_weekly_20140503 a join port_terms b on a.term=b.terms;

                                                                       QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=2226181.12..2226181.13 rows=1 width=516) (actual time=18757.318..18757.319 rows=1 loops=1)

   ->  Hash Join  (cost=37.67..2095240.22 rows=52376358 width=516) (actual time=0.758..2496.190 rows=1067696 loops=1)

         Hash Cond: (a.term = b.terms)

         ->  Seq Scan on terms_weekly_20140503 a  (cost=0.00..240738.81 rows=8516481 width=548) (actual time=0.009..951.875 rows=8516481 loops=1)

         ->  Hash  (cost=22.30..22.30 rows=1230 width=32) (actual time=0.690..0.690 rows=1000 loops=1)

               Buckets: 1024  Batches: 1  Memory Usage: 51kB

               ->  Seq Scan on port_terms b  (cost=0.00..22.30 rows=1230 width=32) (actual time=0.009..0.283 rows=1000 loops=1)

Total runtime: 18757.367 ms

(8 rows)

 

Time: 18758.068 ms

 

dev=# explain analyze with x as (select distinct wid from terms_weekly_20140503 a join port_terms b on a.term=b.terms) select count(*) from x;

                                                                           QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=2226187.62..2226187.63 rows=1 width=0) (actual time=2976.011..2976.011 rows=1 loops=1)

   CTE x

     ->  HashAggregate  (cost=2226181.12..2226183.12 rows=200 width=516) (actual time=2827.958..2896.747 rows=212249 loops=1)

           ->  Hash Join  (cost=37.67..2095240.22 rows=52376358 width=516) (actual time=0.734..2470.533 rows=1067696 loops=1)

                 Hash Cond: (a.term = b.terms)

                 ->  Seq Scan on terms_weekly_20140503 a  (cost=0.00..240738.81 rows=8516481 width=548) (actual time=0.009..916.028 rows=8516481 loops=1)

                 ->  Hash  (cost=22.30..22.30 rows=1230 width=32) (actual time=0.669..0.669 rows=1000 loops=1)

                       Buckets: 1024  Batches: 1  Memory Usage: 51kB

                       ->  Seq Scan on port_terms b  (cost=0.00..22.30 rows=1230 width=32) (actual time=0.009..0.269 rows=1000 loops=1)

   ->  CTE Scan on x  (cost=0.00..4.00 rows=200 width=0) (actual time=2827.961..2963.878 rows=212249 loops=1)

Total runtime: 2980.681 ms

(11 rows)
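For what it's worth, the CTE isn't essential to the fast shape; a plain derived table expresses the same "deduplicate first, count after" structure and is worth comparing (same query as above, just restructured):

```sql
-- Count after a DISTINCT over the join output, instead of asking the
-- Aggregate node to deduplicate via count(distinct ...).
SELECT count(*)
FROM (SELECT DISTINCT wid
      FROM terms_weekly_20140503 a
      JOIN port_terms b ON a.term = b.terms) x;
```

The slow form is likely slow because count(distinct wid) deduplicates inside the Aggregate node via a sort-based path, while the fast plan deduplicates with a HashAggregate before counting.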

 

Thanks,

Suya

Jeff Janes | 11 Sep 20:09 2014

Re: how to change the provoke table in hash join


On Thu, Sep 11, 2014 at 7:09 AM, Matheus de Oliveira <matioli.matheus <at> gmail.com> wrote:

On Wed, Sep 10, 2014 at 10:05 PM, Huang, Suya <Suya.Huang <at> au.experian.com> wrote:

--plan 1, 10 seconds were spent on sequential scan on term_weekly table.

 

dev=# explain analyze select distinct  cs_id from lookup_weekly  n inner join term_weekly s on s.b_id=n.b_id and s.date=n.date where term in ('cat'::text);

 

 

                                                                           QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------

HashAggregate  (cost=2100211.06..2100211.11 rows=5 width=4) (actual time=27095.470..27095.487 rows=138 loops=1)

...

 

--plan 2, only 1 second was spent on the index scan of the term_weekly table; however, since it selects the big table to do the hashing, the hash takes 22 seconds to complete. The advantage gained from the index is totally lost because of this join order.

 

                                                                                        QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

HashAggregate  (cost=1429795.17..1429795.22 rows=5 width=4) (actual time=22991.289..22991.307 rows=138 loops=1)

...

Am I reading something wrong here? I haven't looked at the whole plan, but the second is faster (overall), so why do you think you need a hint, or to change what the planner chooses? To me it looks like using the index is best in this situation. Could you try running this multiple times and taking the min/max/avg time of both?

The difference in time could be a caching effect, not a reproducible difference.

The 2nd plan uses 3GB of memory, and there might be better uses for that memory.

Currently memory is un-costed, other than the "cliff costing" once the planner thinks it will exceed work_mem, which I think is a problem.  Just because I will let you use 4GB of memory if you will really benefit from it doesn't mean you should use 4GB gratuitously.


Suya, what happens if you lower work_mem setting?  Does it revert to the plan you want?
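The experiment Jeff suggests can be confined to a single transaction so it doesn't disturb other sessions (the work_mem value below is just an example):

```sql
BEGIN;
SET LOCAL work_mem = '64MB';   -- illustrative value, well below the 3GB hash
EXPLAIN ANALYZE
SELECT DISTINCT cs_id
FROM lookup_weekly n
JOIN term_weekly s ON s.b_id = n.b_id AND s.date = n.date
WHERE term IN ('cat'::text);
ROLLBACK;
```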

Cheers,

Jeff
Huang, Suya | 11 Sep 03:05 2014

how to change the provoke table in hash join

Hi,

 

Below are two query plans for the same SQL, with and without an index. I noticed the hash join order changed after the index was created, and this is not what I want: it now hashes the big table and probes it with the records of the small table. In Oracle, it's simple to add a hint to point at the table you'd like used as the probe table. In Postgres, however, I don't know how to change this behavior.

 

--plan 1, 10 seconds were spent on sequential scan on term_weekly table.

 

dev=# explain analyze select distinct  cs_id from lookup_weekly  n inner join term_weekly s on s.b_id=n.b_id and s.date=n.date where term in ('cat'::text);

 

 

                                                                           QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------

HashAggregate  (cost=2100211.06..2100211.11 rows=5 width=4) (actual time=27095.470..27095.487 rows=138 loops=1)

   ->  Hash Join  (cost=954343.95..2100211.04 rows=5 width=4) (actual time=24088.912..27095.206 rows=160 loops=1)

         Hash Cond: (((n.b_id)::text = (s.b_id)::text) AND (n.date = s.date))

         ->  Append  (cost=0.00..862153.59 rows=37828460 width=52) (actual time=0.016..10923.091 rows=37828459 loops=1)

               ->  Seq Scan on lookup_weekly n  (cost=0.00..0.00 rows=1 width=524) (actual time=0.001..0.001 rows=0 loops=1)

               ->  Seq Scan on lookup_weekly_20131130 n_1  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.015..1229.217 rows=5158718 loops=1)

               ->  Seq Scan on lookup_weekly_20131207 n_2  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=5.225..1177.539 rows=5158718 loops=1)

               ->  Seq Scan on lookup_weekly_20131214 n_3  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=5.756..1274.135 rows=5158718 loops=1)

               ->  Seq Scan on lookup_weekly_20131221 n_4  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=4.269..1131.570 rows=5158718 loops=1)

               ->  Seq Scan on lookup_weekly_20131228 n_5  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=9.383..1110.435 rows=5158718 loops=1)

               ->  Seq Scan on lookup_weekly_20140426 n_6  (cost=0.00..91715.42 rows=4042442 width=52) (actual time=8.137..947.724 rows=4042442 loops=1)

               ->  Seq Scan on lookup_weekly_20140503 n_7  (cost=0.00..93516.49 rows=4118149 width=52) (actual time=7.717..791.339 rows=4118149 loops=1)

               ->  Seq Scan on lookup_weekly_20140329 n_8  (cost=0.00..88100.78 rows=3874278 width=52) (actual time=0.004..637.297 rows=3874278 loops=1)

         ->  Hash  (cost=954343.47..954343.47 rows=32 width=61) (actual time=10604.327..10604.327 rows=553 loops=1)

               Buckets: 1024  Batches: 1  Memory Usage: 43kB

               ->  Append  (cost=0.00..954343.47 rows=32 width=61) (actual time=10.009..10602.075 rows=553 loops=1)

                     ->  Seq Scan on term_weekly s  (cost=0.00..0.00 rows=1 width=520) (actual time=0.000..0.000 rows=0 loops=1)

                           Filter: (term = 'cat'::text)

                     ->  Seq Scan on term_weekly_20140503 s_1  (cost=0.00..262030.12 rows=8 width=46) (actual time=10.007..3738.945 rows=166 loops=1)

                           Filter: (term = 'cat'::text)

                           Rows Removed by Filter: 8516324

                     ->  Seq Scan on term_weekly_20140510 s_2  (cost=0.00..246131.35 rows=8 width=46) (actual time=52.059..2316.793 rows=152 loops=1)

                           Filter: (term = 'cat'::text)

                           Rows Removed by Filter: 8010196

                     ->  Seq Scan on term_weekly_20140517 s_3  (cost=0.00..233644.94 rows=8 width=46) (actual time=26.661..2504.273 rows=135 loops=1)

                           Filter: (term = 'cat'::text)

                           Rows Removed by Filter: 7632420

                     ->  Seq Scan on term_weekly_20140524 s_4  (cost=0.00..212537.06 rows=7 width=46) (actual time=49.773..2041.578 rows=100 loops=1)

                           Filter: (term = 'cat'::text)

                           Rows Removed by Filter: 6950865

Total runtime: 27095.639 ms

(31 rows)

 

--plan 2, only 1 second was spent on the index scan of the term_weekly table; however, since it selects the big table to do the hashing, the hash takes 22 seconds to complete. The advantage gained from the index is totally lost because of this join order.

 

                                                                                        QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

HashAggregate  (cost=1429795.17..1429795.22 rows=5 width=4) (actual time=22991.289..22991.307 rows=138 loops=1)

   ->  Hash Join  (cost=1429580.49..1429795.15 rows=5 width=4) (actual time=22963.340..22991.214 rows=160 loops=1)

         Hash Cond: (((s.b_id)::text = (n.b_id)::text) AND (s.date = n.date))

         ->  Append  (cost=0.00..142.77 rows=32 width=61) (actual time=0.052..1.125 rows=553 loops=1)

               ->  Seq Scan on term_weekly s  (cost=0.00..0.00 rows=1 width=520) (actual time=0.001..0.001 rows=0 loops=1)

                     Filter: (term = 'cat'::text)

               ->  Index Scan using idx_term_weekly_20140503_3 on term_weekly_20140503 s_1  (cost=0.56..36.70 rows=8 width=46) (actual time=0.051..0.353 rows=166 loops=1)

                     Index Cond: (term = 'cat'::text)

               ->  Index Scan using idx_term_weekly_20140510_3 on term_weekly_20140510 s_2  (cost=0.56..36.70 rows=8 width=46) (actual time=0.043..0.293 rows=152 loops=1)

                     Index Cond: (term = 'cat'::text)

               ->  Index Scan using idx_term_weekly_20140517_3 on term_weekly_20140517 s_3  (cost=0.56..36.70 rows=8 width=46) (actual time=0.029..0.244 rows=135 loops=1)

                     Index Cond: (term = 'cat'::text)

               ->  Index Scan using idx_term_weekly_20140524_3 on term_weekly_20140524 s_4  (cost=0.56..32.68 rows=7 width=46) (actual time=0.024..0.192 rows=100 loops=1)

                     Index Cond: (term = 'cat'::text)

         ->  Hash  (cost=862153.59..862153.59 rows=37828460 width=52) (actual time=22939.457..22939.457 rows=37828459 loops=1)

               Buckets: 4194304  Batches: 1  Memory Usage: 3144960kB

               ->  Append  (cost=0.00..862153.59 rows=37828460 width=52) (actual time=0.010..9100.690 rows=37828459 loops=1)

                     ->  Seq Scan on lookup_weekly n  (cost=0.00..0.00 rows=1 width=524) (actual time=0.001..0.001 rows=0 loops=1)

                     ->  Seq Scan on lookup_weekly_20131130 n_1  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.008..1099.194 rows=5158718 loops=1)

                     ->  Seq Scan on lookup_weekly_20131207 n_2  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.004..861.678 rows=5158718 loops=1)

                     ->  Seq Scan on lookup_weekly_20131214 n_3  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.004..860.374 rows=5158718 loops=1)

                     ->  Seq Scan on lookup_weekly_20131221 n_4  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.003..852.169 rows=5158718 loops=1)

                     ->  Seq Scan on lookup_weekly_20131228 n_5  (cost=0.00..117764.18 rows=5158718 width=52) (actual time=0.005..835.201 rows=5158718 loops=1)

                     ->  Seq Scan on lookup_weekly_20140426 n_6  (cost=0.00..91715.42 rows=4042442 width=52) (actual time=0.005..663.261 rows=4042442 loops=1)

                     ->  Seq Scan on lookup_weekly_20140503 n_7  (cost=0.00..93516.49 rows=4118149 width=52) (actual time=0.006..678.281 rows=4118149 loops=1)

                     ->  Seq Scan on lookup_weekly_20140329 n_8  (cost=0.00..88100.78 rows=3874278 width=52) (actual time=0.003..635.296 rows=3874278 loops=1)

Total runtime: 22995.361 ms

(27 rows)
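Core Postgres has no Oracle-style hints; the usual lever (diagnostic, not a permanent fix) is the enable_* planner settings at session level. A sketch using the query from this thread:

```sql
-- Diagnostic experiment only: discourage sequential scans for this
-- session, compare the resulting join order, then reset.
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT DISTINCT cs_id
FROM lookup_weekly n
JOIN term_weekly s ON s.b_id = n.b_id AND s.date = n.date
WHERE term IN ('cat'::text);
RESET enable_seqscan;
```

There is also the third-party pg_hint_plan extension, which accepts Oracle-like hint comments, if per-query control is really required.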

 

Thanks,

Suya

Shadin A | 2 Sep 16:44 2014

Implementing a functionality for processing heavy insertion

Hello there,

I use PostgreSQL 9.1
The scenario is:
I receive heavy insertion into Table1 (about 100 rows a second). For each new entry, I have to check it against the previous and next ones (checking whether those items are inside an area using ST_DWithin). Depending on the result, I need to use the new entry (joining another table) to insert into or update a new table.
My questions are:
  • I used to use a trigger to do the checking and insertion/updating; the header is as follows:
CREATE TRIGGER ts_trigger AFTER INSERT ON table1 FOR EACH ROW EXECUTE PROCEDURE test_trigger();
But I don't think this is an efficient way to do it.
  • I'm interested in inserting in batches. I'd like to understand the technique but cannot find good resources on this.
Some advise me to use a temp table, but I don't think it would be useful in my case.
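On the batching question, the two standard shapes in Postgres are multi-row INSERT and COPY. A minimal sketch, with a hypothetical table layout (the column names below are illustrative, not from the original post):

```sql
-- Multi-row INSERT: one statement and one round trip per batch.
INSERT INTO table1 (id, geom, recorded_at) VALUES
    (1, ST_SetSRID(ST_MakePoint(18.42, -33.92), 4326), now()),
    (2, ST_SetSRID(ST_MakePoint(18.43, -33.91), 4326), now());

-- COPY: the fastest bulk-load path, usually driven from the client library.
COPY table1 (id, geom, recorded_at) FROM STDIN WITH (FORMAT csv);
```

Either form amortizes per-statement overhead, but note that an AFTER INSERT ... FOR EACH ROW trigger still fires once per row in both cases.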
Xiaoyulei | 2 Sep 08:59 2014

why after increase the hash table partitions, tpmc decrease

 

We used BenchmarkSQL to run a TPC-C test against PostgreSQL 9.3.3.

Before the test we set the BenchmarkSQL client count to about 800, and we increased the hash partitions from 16 to 1024 in order to reduce contention on the hash partition locks.

We expected that increasing the number of partitions would reduce lock contention, so TPM-C should increase. But the test results show the contrary: after changing to 1024, TPM-C did not increase but decreased.

Why such a result?

 

We modified the following macro definitions:

NUM_BUFFER_PARTITIONS 1024

LOG2_NUM_PREDICATELOCK_PARTITIONS 10

LOG2_NUM_LOCK_PARTITIONS 10

 

