Alexandre de Arruda Paes | 5 Aug 03:41 2015

Slow HashAggregate/cache access

Hi,

First, sorry for comparing Postgres with another database system, but I know nothing about Oracle...

This customer has an application built with a framework that generates the SQL statements (so we can't make any query optimizations).

We did the following tests:

1) Postgresql 9.3 and Oracle 10 on a desktop machine (8 GB RAM, 1 SATA disk, Core i5)
2) Postgresql 9.3 on a server + FC storage (128 GB RAM, Xeon 32 cores, SAS disks)

On the first machine, postgresql takes from 20,000 to 40,000 ms to complete the query on the first run and from 1,200 to 2,000 ms on subsequent runs. Oracle on this machine takes 2,000 ms on the first run and *70ms* using its cache.

On the second machine, postgresql takes about 2,000 ms on the first run and about 800 ms on subsequent runs: 11x slower than Oracle, on a much more powerful machine.

Below are the two EXPLAIN outputs from the second server:

database=# explain (analyze,buffers) SELECT T1.fr13baixa,T1.fr13dtlanc,T2.fr02empfo,COALESCE( T4.fr13TotQtd, 0) AS fr13TotQtd,T1.fr13codpr,T1.fr13categ,COALESCE( T5.fr13TotBx, 0) AS fr13TotBx,COALESCE( T4.fr13VrTot, 0) AS fr13VrTot,T2.fr09cod, T3.fr09desc, T1.fr02codigo,T1.fr01codemp FROM((((FR13T T1 LEFT JOIN FR02T T2 ON T2.fr01codemp = T1.fr01codemp AND T2.fr02codigo = T1.fr02codigo)LEFT JOIN FR09T T3 ON T3.fr01codemp = T1.fr01codemp AND T3.fr09cod = T2.fr09cod) LEFT JOIN (SELECT SUM(fr13quant) AS fr13TotQtd, fr01codemp, fr02codigo, fr13dtlanc, SUM(COALESCE( fr13quant, 0) * CAST(COALESCE( fr13preco, 0) AS NUMERIC(18,10))) AS fr13VrTot
FROM FR13T1 GROUP BY fr01codemp, fr02codigo, fr13dtlanc ) T4 ON T4.fr01codemp = T1.fr01codemp AND T4.fr02codigo = T1.fr02codigo AND T4.fr13dtlanc = T1.fr13dtlanc) 
LEFT JOIN (SELECT SUM(fr13VrBx) AS fr13TotBx, fr01codemp, fr02codigo, fr13dtlanc FROM FR13T3 GROUP BY fr01codemp, fr02codigo, fr13dtlanc ) T5 ON T5.fr01codemp = T1.fr01codemp AND T5.fr02codigo = T1.fr02codigo AND T5.fr13dtlanc = T1.fr13dtlanc) 
WHERE (T1.fr01codemp = '1' and T1.fr13codpr = '60732' and T1.fr13dtlanc >= '01/05/2014') AND (T1.fr02codigo >= '0' and T1.fr02codigo <= '9999999999') AND (T1.fr13dtlanc <= '31/05/2014') ORDER BY T1.fr01codemp, T1.fr13codpr, T1.fr13dtlanc;
                                                                    QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=30535.97..33804.07 rows=1 width=130) (actual time=1371.548..1728.058 rows=2 loops=1)
   Join Filter: ((fr13t3.fr01codemp = t1.fr01codemp) AND (fr13t3.fr02codigo = t1.fr02codigo) AND (fr13t3.fr13dtlanc = t1.fr13dtlanc))
   Rows Removed by Join Filter: 368
   Buffers: shared hit=95 read=21267
   ->  Nested Loop Left Join  (cost=30529.83..33796.84 rows=1 width=98) (actual time=1345.565..1701.990 rows=2 loops=1)
         Join Filter: (t3.fr01codemp = t1.fr01codemp)
         Buffers: shared hit=95 read=21265
         ->  Nested Loop Left Join  (cost=30529.70..33796.67 rows=1 width=87) (actual time=1340.393..1696.793 rows=2 loops=1)
               Join Filter: ((fr13t1.fr01codemp = t1.fr01codemp) AND (fr13t1.fr02codigo = t1.fr02codigo) AND (fr13t1.fr13dtlanc = t1.fr13dtlanc))
               Rows Removed by Join Filter: 500202
               Buffers: shared hit=93 read=21263
               ->  Nested Loop Left Join  (cost=0.70..2098.42 rows=1 width=23) (actual time=36.424..66.841 rows=2 loops=1)
                     Buffers: shared hit=93 read=88
                     ->  Index Scan using ufr13t2 on fr13t t1  (cost=0.42..2094.11 rows=1 width=19) (actual time=27.518..57.910 rows=2 loops=1)                          
                           Index Cond: ((fr01codemp = 1::smallint) AND (fr13dtlanc >= '2014-05-01'::date) AND (fr13dtlanc <= '2014-05-31'::date))                        
                           Filter: ((fr02codigo >= 0::numeric) AND (fr02codigo <= 9999999999::numeric) AND (fr13codpr = 60732))                                          
                           Rows Removed by Filter: 5621                                                                                                                  
                           Buffers: shared hit=90 read=85                                                                                                                
                     ->  Index Scan using fr02t_pkey on fr02t t2  (cost=0.28..4.30 rows=1 width=12) (actual time=4.455..4.458 rows=1 loops=2)
                           Index Cond: ((fr01codemp = t1.fr01codemp) AND (fr01codemp = 1::smallint) AND (fr02codigo = t1.fr02codigo))
                           Buffers: shared hit=3 read=3
               ->  HashAggregate  (cost=30529.00..30840.80 rows=31180 width=21) (actual time=630.594..753.406 rows=250102 loops=2)
                     Buffers: shared read=21175
                     ->  Seq Scan on fr13t1  (cost=0.00..25072.50 rows=311800 width=21) (actual time=6.354..720.037 rows=311800 loops=1)
                           Filter: (fr01codemp = 1::smallint)
                           Buffers: shared read=21175
         ->  Index Scan using fr09t_pkey on fr09t t3  (cost=0.14..0.16 rows=1 width=15) (actual time=2.584..2.586 rows=1 loops=2)
               Index Cond: ((fr01codemp = 1::smallint) AND (fr09cod = t2.fr09cod))
               Buffers: shared hit=2 read=2
   ->  HashAggregate  (cost=6.14..6.43 rows=29 width=17) (actual time=12.906..12.972 rows=184 loops=2)
         Buffers: shared read=2
         ->  Seq Scan on fr13t3  (cost=0.00..4.30 rows=184 width=17) (actual time=25.570..25.624 rows=184 loops=1)
               Filter: (fr01codemp = 1::smallint)
               Buffers: shared read=2
 Total runtime: 1733.320 ms
(35 rows)

database=# explain (analyze,buffers) SELECT T1.fr13baixa,T1.fr13dtlanc,T2.fr02empfo,COALESCE( T4.fr13TotQtd, 0) AS fr13TotQtd,T1.fr13codpr,T1.fr13categ,COALESCE( T5.fr13TotBx, 0) AS fr13TotBx,COALESCE( T4.fr13VrTot, 0) AS fr13VrTot,T2.fr09cod, T3.fr09desc, T1.fr02codigo,T1.fr01codemp FROM((((FR13T T1 LEFT JOIN FR02T T2 ON T2.fr01codemp = T1.fr01codemp AND T2.fr02codigo = T1.fr02codigo)LEFT JOIN FR09T T3 ON T3.fr01codemp = T1.fr01codemp AND T3.fr09cod = T2.fr09cod) LEFT JOIN (SELECT SUM(fr13quant) AS fr13TotQtd, fr01codemp, fr02codigo, fr13dtlanc, SUM(COALESCE( fr13quant, 0) * CAST(COALESCE( fr13preco, 0) AS NUMERIC(18,10))) AS fr13VrTot
FROM FR13T1 GROUP BY fr01codemp, fr02codigo, fr13dtlanc ) T4 ON T4.fr01codemp = T1.fr01codemp AND T4.fr02codigo = T1.fr02codigo AND T4.fr13dtlanc = T1.fr13dtlanc) 
LEFT JOIN (SELECT SUM(fr13VrBx) AS fr13TotBx, fr01codemp, fr02codigo, fr13dtlanc FROM FR13T3 GROUP BY fr01codemp, fr02codigo, fr13dtlanc ) T5 ON T5.fr01codemp = T1.fr01codemp AND T5.fr02codigo = T1.fr02codigo AND T5.fr13dtlanc = T1.fr13dtlanc) 
WHERE (T1.fr01codemp = '1' and T1.fr13codpr = '60732' and T1.fr13dtlanc >= '01/05/2014') AND (T1.fr02codigo >= '0' and T1.fr02codigo <= '9999999999') AND (T1.fr13dtlanc <= '31/05/2014') ORDER BY T1.fr01codemp, T1.fr13codpr, T1.fr13dtlanc;
                                                                    QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=30535.97..33804.07 rows=1 width=130) (actual time=492.669..763.313 rows=2 loops=1)
   Join Filter: ((fr13t3.fr01codemp = t1.fr01codemp) AND (fr13t3.fr02codigo = t1.fr02codigo) AND (fr13t3.fr13dtlanc = t1.fr13dtlanc))
   Rows Removed by Join Filter: 368
   Buffers: shared hit=21362
   ->  Nested Loop Left Join  (cost=30529.83..33796.84 rows=1 width=98) (actual time=492.462..763.015 rows=2 loops=1)
         Join Filter: (t3.fr01codemp = t1.fr01codemp)
         Buffers: shared hit=21360
         ->  Nested Loop Left Join  (cost=30529.70..33796.67 rows=1 width=87) (actual time=492.423..762.939 rows=2 loops=1)
               Join Filter: ((fr13t1.fr01codemp = t1.fr01codemp) AND (fr13t1.fr02codigo = t1.fr02codigo) AND (fr13t1.fr13dtlanc = t1.fr13dtlanc))
               Rows Removed by Join Filter: 500202
               Buffers: shared hit=21356
               ->  Nested Loop Left Join  (cost=0.70..2098.42 rows=1 width=23) (actual time=0.855..2.268 rows=2 loops=1)
                     Buffers: shared hit=181
                     ->  Index Scan using ufr13t2 on fr13t t1  (cost=0.42..2094.11 rows=1 width=19) (actual time=0.844..2.229 rows=2 loops=1)
                           Index Cond: ((fr01codemp = 1::smallint) AND (fr13dtlanc >= '2014-05-01'::date) AND (fr13dtlanc <= '2014-05-31'::date))
                           Filter: ((fr02codigo >= 0::numeric) AND (fr02codigo <= 9999999999::numeric) AND (fr13codpr = 60732))
                           Rows Removed by Filter: 5621
                           Buffers: shared hit=175
                     ->  Index Scan using fr02t_pkey on fr02t t2  (cost=0.28..4.30 rows=1 width=12) (actual time=0.009..0.012 rows=1 loops=2)
                           Index Cond: ((fr01codemp = t1.fr01codemp) AND (fr01codemp = 1::smallint) AND (fr02codigo = t1.fr02codigo))
                           Buffers: shared hit=6
               ->  HashAggregate  (cost=30529.00..30840.80 rows=31180 width=21) (actual time=229.435..325.660 rows=250102 loops=2)
                     Buffers: shared hit=21175
                     ->  Seq Scan on fr13t1  (cost=0.00..25072.50 rows=311800 width=21) (actual time=0.003..74.088 rows=311800 loops=1)
                           Filter: (fr01codemp = 1::smallint)
                           Buffers: shared hit=21175
         ->  Index Scan using fr09t_pkey on fr09t t3  (cost=0.14..0.16 rows=1 width=15) (actual time=0.023..0.024 rows=1 loops=2)
               Index Cond: ((fr01codemp = 1::smallint) AND (fr09cod = t2.fr09cod))
               Buffers: shared hit=4
   ->  HashAggregate  (cost=6.14..6.43 rows=29 width=17) (actual time=0.065..0.098 rows=184 loops=2)
         Buffers: shared hit=2
         ->  Seq Scan on fr13t3  (cost=0.00..4.30 rows=184 width=17) (actual time=0.006..0.029 rows=184 loops=1)
               Filter: (fr01codemp = 1::smallint)
               Buffers: shared hit=2
 Total runtime: 763.536 ms
(35 rows)
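Reading the plans, in both runs the HashAggregate over FR13T1 builds all 250,102 groups (scanning 311,800 rows) just to join 2 outer rows. Since we can't change the framework's SQL, the rewrite below is only to confirm the diagnosis (untested; it assumes an index on FR13T1 (fr01codemp, fr02codigo, fr13dtlanc) exists, and shows just the T4 part of the query):

```sql
-- Untested: push the outer join keys into the aggregate with LATERAL
-- (available in 9.3), so only the matching FR13T1 rows are aggregated
-- instead of the whole table.
SELECT T1.fr13dtlanc, T1.fr02codigo,
       COALESCE(T4.fr13TotQtd, 0) AS fr13TotQtd,
       COALESCE(T4.fr13VrTot, 0)  AS fr13VrTot
FROM FR13T T1
LEFT JOIN LATERAL (
    SELECT SUM(fr13quant) AS fr13TotQtd,
           SUM(COALESCE(fr13quant, 0)
               * CAST(COALESCE(fr13preco, 0) AS NUMERIC(18,10))) AS fr13VrTot
    FROM FR13T1 s
    WHERE s.fr01codemp = T1.fr01codemp
      AND s.fr02codigo = T1.fr02codigo
      AND s.fr13dtlanc = T1.fr13dtlanc
) T4 ON true
WHERE T1.fr01codemp = 1 AND T1.fr13codpr = 60732
  AND T1.fr13dtlanc BETWEEN '2014-05-01' AND '2014-05-31';
```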


Thanks for any help. 

Best regards,

Alexandre
Ram N | 30 Jul 09:51 2015

Performance issue with NestedLoop query


Hi,

I am trying to see if I can do anything to optimize the following plan. 

I have two tables and I am doing a join between them. After joining, the query calculates aggregates (sum and count).
Table 1: one timestamp per day for 2 years (730 records)
Table 2: window-based validity records. A window here means a start and end timestamp indicating a period of validity for a record. It has some 10-odd columns including start_time and end_time (1 million records).

The machine has 244 GB RAM. Queries are taking more than a minute, and in some cases 2-3 minutes.

Below is the plan I am getting. The nested loop blows up the number of records, and we expect that. I have tried playing around with work_mem and cache configs, which hasn't helped.

Query
select sum(a), count(id), a.ts, st from table1 a, table2 b where a.ts > b.start_date and a.ts < b.end_date and a.ts > '2015-01-01 20:50:44.000000 +00:00:00' and a.ts < '2015-07-01 19:50:44.000000 +00:00:00' group by a.ts, st order by a.ts

Plan (EXPLAIN ANALYZE)
"Sort  (cost=10005447874.54..10005447879.07 rows=1810 width=44) (actual time=178883.936..178884.159 rows=1355 loops=1)"
"  Output: (sum(b.a)), (count(b.id)), a.ts, b.st"
"  Sort Key: a.ts"
"  Sort Method: quicksort  Memory: 154kB"
"  Buffers: shared hit=47068722 read=102781"
"  I/O Timings: read=579.946"
"  ->  HashAggregate  (cost=10005447758.51..10005447776.61 rows=1810 width=44) (actual time=178882.874..178883.320 rows=1355 loops=1)"
"        Output: sum(b.a), count(b.id), a.ts, b.st"
"        Group Key: a.ts, b.st"
"        Buffers: shared hit=47068719 read=102781"
"        I/O Timings: read=579.946"
"        ->  Nested Loop  (cost=10000000000.43..10004821800.38 rows=62595813 width=44) (actual time=0.167..139484.854 rows=73112419 loops=1)"
"              Output: a.ts, b.st, b.a, b.id"
"              Buffers: shared hit=47068719 read=102781"
"              I/O Timings: read=579.946"
"              ->  Seq Scan on public.table1 a  (cost=0.00..14.81 rows=181 width=8) (actual time=0.058..0.563 rows=181 loops=1)"
"                    Output: a.ts"
"                    Filter: ((a.ts > '2015-01-01 20:50:44+00'::timestamp with time zone) AND (a.ts < '2015-07-01 19:50:44+00'::timestamp with time zone))"
"                    Rows Removed by Filter: 540"
"                    Buffers: shared read=4"
"                    I/O Timings: read=0.061"
"              ->  Index Scan using end_date_idx on public.table2 b  (cost=0.43..23181.37 rows=345833 width=52) (actual time=0.063..622.274 rows=403936 loops=181)"
"                    Output: b.serial_no, b.name, b.st, b.end_date, b.a, b.start_date"
"                    Index Cond: (a.ts < b.end_date)"
"                    Filter: (a.ts > b.start_date)"
"                    Rows Removed by Filter: 392642"
"                    Buffers: shared hit=47068719 read=102777"
"                    I/O Timings: read=579.885"
"Planning time: 0.198 ms"
"Execution time: 178884.467 ms"
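One idea I have considered but not tested: the index on end_date alone matches ~400k rows per outer row and start_date has to be filtered afterwards. Storing the window as a single range column with a GiST index would let the join use both bounds at once (this sketch assumes start_date/end_date are timestamptz):

```sql
-- Untested sketch: turn (start_date, end_date) into one indexed range.
ALTER TABLE table2 ADD COLUMN validity tstzrange;
UPDATE table2 SET validity = tstzrange(start_date, end_date, '()');
CREATE INDEX table2_validity_idx ON table2 USING gist (validity);

SELECT sum(b.a), count(b.id), a.ts, b.st
FROM table1 a
JOIN table2 b ON a.ts <@ b.validity   -- containment test, both bounds indexed
WHERE a.ts > '2015-01-01 20:50:44+00'
  AND a.ts < '2015-07-01 19:50:44+00'
GROUP BY a.ts, b.st
ORDER BY a.ts;
```

The '()' makes both bounds exclusive, matching the original strict inequalities.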

Any pointers on how to go about optimizing this? 

--yr

Graeme B. Bell | 28 Jul 22:29 2015

incredible surprise news from intel/micron right now...

Entering production, availability 2016
1000x faster than NAND flash/SSD, i.e. near-DRAM latency
10x denser than DRAM
1000x the write endurance of NAND
Priced between flash and DRAM
Manufactured by Intel/Micron
Non-volatile

Guess what's going in my 2016 db servers :-)

Please, don't be vapourware... 

http://hothardware.com/news/intel-and-micron-jointly-drop-disruptive-game-changing-3d-xpoint-cross-point-memory-1000x-faster-than-nand

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Graeme B. Bell | 28 Jul 17:39 2015

autofreeze/vacuuming - avoiding the random performance hit

Some of you may have had annoying problems in the past with autofreeze or autovacuum running at unexpected
moments and dropping the performance of your server randomly. 

On our SSD-RAID10 based system we found a 20GB table finished its vacuum freeze in about 100 seconds. There
were no noticeable interruptions to our services; maybe a tiny little bit of extra latency on the web maps,
very hard to tell if it was real or imagination.

If auto-stuff in postgresql has been a pain point for you in the past, I can confirm that SSD drives are a nice
solution (and also for any other autovacuum/analyze type stuff) since they can handle incoming random IO
very nicely while also making very fast progress with the housekeeping work. 
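For anyone stuck on spinning disks, here is a sketch of the configuration-side mitigation instead. The values are illustrative only, not tuned for any particular system, and ALTER SYSTEM needs 9.4 (on older versions, edit postgresql.conf):

```sql
-- Illustrative only: throttle autovacuum I/O, and freeze the big tables
-- yourself during quiet hours so wraparound vacuums never fire at random.
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 10;   -- ms of sleep per cost batch
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
SELECT pg_reload_conf();

-- Off-peak, on the tables that matter (table name hypothetical):
VACUUM FREEZE big_table;
```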

Graeme Bell


Priyank Tiwari | 28 Jul 09:52 2015

Any ideas how can I speed up this query?


Hi,

I have the following table definition, with 6,209,888 rows in it. It stores occurrences of species in various regions.

TABLE DEFINITION

    Column    |          Type          |                        Modifiers                         
--------------+------------------------+----------------------------------------------------------
 id           | integer                | not null default nextval('occurrences_id_seq'::regclass)
 gbifid       | integer                | not null
 sname        | character varying(512) | 
 cname        | character varying(512) | 
 species      | character varying(512) | 
 location     | geometry               | not null
 month        | integer                | 
 year         | integer                | 
 event_date   | date                   | 
 dataset_key  | character varying(512) | 
 taxon_key    | character varying(512) | 
 taxon_rank   | character varying(512) | 
 record_basis | character varying(512) | 
 category_id  | integer                | 
 country      | character varying(512) | 
 lat          | double precision       | 
 lng          | double precision       | 
Indexes:
    "occurrences_pkey" PRIMARY KEY, btree (id)
    "unique_occurrences_gbifid" UNIQUE, btree (gbifid)
    "index_occurences_taxon_key" btree (taxon_key)
    "index_occurrences_category_id" btree (category_id)
    "index_occurrences_cname" btree (cname)
    "index_occurrences_country" btree (country)
    "index_occurrences_lat" btree (lat)
    "index_occurrences_lng" btree (lng)
    "index_occurrences_month" btree (month)
    "index_occurrences_sname" btree (sname)
    "occurrence_location_gix" gist (location)

I am trying to fetch the count of occurrences within a certain region. I store the location of each occurrence in a geometry field as well as a lat/lng pair. Both are indexed. The query that is issued is as follows.

QUERY

SELECT COUNT(*) FROM "occurrences" WHERE ("lat" >= -27.91550355958 AND "lat" <= -27.015680440420002 AND "lng" >= 152.13307044728307 AND "lng" <= 153.03137355271693 AND "category_id" = 1 AND (ST_Intersects( ST_Buffer(ST_PointFromText('POINT(152.582222 -27.465592)')::geography, 50000)::geography, location::geography)));

The problem is that it takes longer than is acceptable to execute. Below is the EXPLAIN ANALYZE output for the same query.

EXPLAIN ANALYZE QUERY OUTPUT  (http://explain.depesz.com/s/p2a)

Aggregate  (cost=127736.06..127736.07 rows=1 width=0) (actual time=13491.678..13491.679 rows=1 loops=1)
   Buffers: shared hit=3 read=56025
   ->  Bitmap Heap Scan on occurrences  (cost=28249.46..127731.08 rows=1995 width=0) (actual time=528.053..13388.458 rows=167511 loops=1)
         Recheck Cond: ((lat >= (-27.91550355958)::double precision) AND (lat <= (-27.01568044042)::double precision) AND (lng >= 152.133070447283::double precision) AND (lng <= 153.031373552717::double precision))
         Rows Removed by Index Recheck: 748669
         Filter: ((category_id = 1) AND ('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography && (location)::geography) AND 
(_st_distance('0103000020E6100000010000002100000090D8AD28D32263403905504558773BC0CADDAF0384226340E7AD43F4E38D3BC0B559D93A98216340B7BE554692A33BC0C904C18C18206340DF8EA9338DB73BC052F75181131E6340A1D9E30E0FC93BC00BDCB5E39C1B6340A1A40E496AD73BC074D30D03CD1863405DD7BF5110E23BC05DD3A2C0BF156340439B784797E83BC078E9287593126340EF9E5C37BEEA3BC072EB40B9670F63409E25A1BA6FE83BC06964481F5C0C6340B331B5D2C2E13BC08F6785ED8E0963409979FBF9F9D63BC0135DB3E71B0763402F78807480C83BC0E321E8351B0563405E96CB00E6B63BC0672CD874A00363403BC84B1BD9A23BC018FAE8F8B90263400314CD15208D3BC039DE8C4A70026340653B324F91763BC0F5BA5CDFC502634086322DDE0A603BC0E90E8A10B7036340C5FA1C046A4A3BC01ECCC14C3A056340CE4011BC82363BC022F38481400763407655DBB517253BC05B3B5AB6B50963404C9AA306D3163BC079EF01D3810C634010A732D03F0C3BC05152F188890F634044CE5D16C5053BC0EDDABA57AF126340926E14EFA1033BC08D7E9CA3D41563401EFB9E2DEB053BC071A929D5DA186340A3F1F29C8A0C3BC049A7E478A41B63404F8BD2CF3F173BC04B409855161E634057CC1080A2253BC0F9C65E70182063404BEB146926373BC0349D83F596216340449EEA7C204B3BC07803D7FD82226340A16F1747CD603BC090D8AD28D32263403905504558773BC0'::geography, (location)::geography, 0::double precision, false) < 1e-05::double precision))
         Rows Removed by Filter: 6357
         Heap Blocks: exact=29947 lossy=22601
         Buffers: shared hit=3 read=56025
         ->  BitmapAnd  (cost=28249.46..28249.46 rows=32476 width=0) (actual time=519.091..519.091 rows=0 loops=1)
               Buffers: shared read=3477
               ->  Bitmap Index Scan on index_occurrences_lat  (cost=0.00..11691.20 rows=365877 width=0) (actual time=218.999..218.999 rows=392415 loops=1)
                     Index Cond: ((lat >= (-27.91550355958)::double precision) AND (lat <= (-27.01568044042)::double precision))
                     Buffers: shared read=1444
               ->  Bitmap Index Scan on index_occurrences_lng  (cost=0.00..16557.01 rows=517658 width=0) (actual time=285.211..285.211 rows=550523 loops=1)
                     Index Cond: ((lng >= 152.133070447283::double precision) AND (lng <= 153.031373552717::double precision))
                     Buffers: shared read=2033
 Planning time: 2.812 ms
 Execution time: 13493.617 ms
(19 rows)


It seems that the planner is underestimating the number of rows returned by the Bitmap Heap Scan on occurrences. I have run VACUUM ANALYZE on this table a couple of times, but it still produces the same result. Any idea how I can speed up this query, or how I can help the planner produce better row estimates for the Bitmap Heap Scan section?
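For reference, here is a rewrite that was suggested to me but that I have not benchmarked yet: ST_DWithin expresses the same 50 km test directly instead of buffering and then intersecting, and makes the lat/lng range conditions unnecessary. It would need a functional index on the geography cast, since occurrence_location_gix is on the geometry column:

```sql
-- Untested rewrite using ST_DWithin instead of ST_Buffer + ST_Intersects.
CREATE INDEX occurrences_location_geog_gix
    ON occurrences USING gist (((location)::geography));

SELECT COUNT(*)
FROM occurrences
WHERE category_id = 1
  AND ST_DWithin(location::geography,
                 ST_PointFromText('POINT(152.582222 -27.465592)')::geography,
                 50000);  -- metres
```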

POSTGRESQL VERSION INFO

                                                version                                                
------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit


HARDWARE

I am running the Postgresql instance on a DigitalOcean VM with 1 core, an SSD disk and 1 GB of RAM.


Appreciate your help.

Thanks,
Priyank
Craig James | 25 Jul 16:50 2015

Are many idle connections bad?

The canonical advice here is to avoid more connections than you have CPUs, and to use a connection pooler such as pgpool or PgBouncer to achieve that under heavy load.

We are considering using the Apache mod_perl "fast-CGI" system and perl's Apache::DBI module, which caches persistent connections in order to improve performance for lightweight web requests. Due to the way our customers are organized (a separate schema per client company), it's possible that there would be (for example) 32 fast-CGI processes, each of which had hundreds of cached connections open at any given time. This would result in a thousand or so Postgres connections on a machine with 32 CPUs.

But Apache's fast-CGI mechanism allows you to specify the maximum number of fast-CGI processes that can run at one time; requests are queued by the Apache server if the load exceeds this maximum. That means there would never be more than a configured maximum number of active connections; the rest would be idle.

So we'd have a situation where there could be thousands of connections, but the actual workload would be throttled to any limit we like. We'd almost certainly limit it to match the number of CPUs.

So the question is: do idle connections impact performance?
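If it helps, the measurement we plan to run while load-testing (assuming 9.2 or later, where pg_stat_activity has a state column) is simply the idle/active split at peak:

```sql
-- How many backends are idle vs. actually working right now?
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;
```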

Thanks,
Craig

Laurent Debacker | 23 Jul 18:58 2015

bitmap heap scan recheck for gin/fts with no lossy blocks

Hi,

I have read that GIN indexes don't require a recheck condition for full-text search as long as work_mem is big enough; otherwise you get lossy blocks, and with them the recheck.

In my case, I have no lossy blocks (from what I could tell), but I do have a recheck...

EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(1) FROM enterprises WHERE fts @@ 'activ'::tsquery

"Aggregate  (cost=264555.07..264555.08 rows=1 width=0) (actual time=25813.920..25813.921 rows=1 loops=1)"
"  Buffers: shared hit=1 read=178192"
"  ->  Bitmap Heap Scan on enterprises  (cost=5004.86..263202.54 rows=541014 width=0) (actual time=170.546..25663.048 rows=528376 loops=1)"
"        Recheck Cond: (fts @@ '''activ'''::tsquery)"
"        Heap Blocks: exact=178096"
"        Buffers: shared hit=1 read=178192"
"        ->  Bitmap Index Scan on enterprises_fts_idx  (cost=0.00..4869.61 rows=541014 width=0) (actual time=120.214..120.214 rows=528376 loops=1)"
"              Index Cond: (fts @@ '''activ'''::tsquery)"
"              Buffers: shared hit=1 read=96"
"Planning time: 2.383 ms"
"Execution time: 25824.476 ms"
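A note from the plan: Heap Blocks shows exact=178096 and no lossy pages, so I suspect the 25 s is the 178,192 page reads from the heap rather than any rechecking. An untested way to confirm, using the pg_prewarm extension that ships with 9.4:

```sql
-- If the table is already cached, the runtime should collapse.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('enterprises');
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(1) FROM enterprises WHERE fts @@ 'activ'::tsquery;
```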

Any advice would be greatly appreciated. I'm running PostgreSQL 9.4.1.

Thank you,

Laurent Debacker
Graeme B. Bell | 23 Jul 15:45 2015

parallelisation provides postgres performance (script example + ppt slides)


Hi all,

1. For those that don't like par_psql (http://github.com/gbb/par_psql), here's an alternative
approach that uses the GNU Parallel command to organise parallelism for queries that usually take
days to run. It's a short, GIS-focused script, but it may give you a few ideas about how to
parallelise your own code with GNU Parallel.

https://github.com/gbb/fast_map_intersection
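For a flavour of the approach (a sketch only, not the actual fast_map_intersection code; the database, table, and function names are invented): generate one psql command per ID range, then hand the list to GNU Parallel.

```shell
# Sketch only: chunk a long-running query by ID range and run the chunks
# concurrently. The psql target (mydb, do_chunk, results) is hypothetical.
cmds=""
step=100000
for lo in 0 100000 200000 300000; do
  hi=$((lo + step))
  cmds="$cmds
psql -d mydb -c \"INSERT INTO results SELECT * FROM do_chunk($lo, $hi);\""
done
# Run up to 4 chunks at once (uncomment to execute for real):
# printf '%s\n' "$cmds" | grep psql | parallel -j 4
printf '%s\n' "$cmds" | grep -c psql   # prints 4: one command per chunk
```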

2. Also, I gave a talk at FOSS4G Como about these tools, and how to get better performance from your DB with
parallelisation. May be helpful to people who are new to parallelisation / multi-core work with
postgres. 

http://graemebell.net/foss4gcomo.pdf          

Graeme Bell.

p.s.  (this version of the slides still has a few typos, which will be fixed soon when I get the source ppts back
from my colleague's laptop).


Moreno Andreo | 22 Jul 14:50 2015

How to find the culprit in server load spikes?

Hi everyone,
     I host a Postgresql server on Ubuntu 12.04 and I am facing server load spikes (if I run top, it goes up to 25-30 on a 4-core system)...
In some cases I have to restart the postgresql service because users call us complaining about the slowness, but in other cases I can leave things alone and after a few minutes (about 5-10) the situation drops back to normal (0.50-2 load).
The problem is, as in most cases, I/O, but I need a small hand with methods or tools that can help me investigate who or what is causing these spikes.
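So far, during a spike, the only things I run are iotop at the OS level and this query (assuming PostgreSQL 9.2 or later for the pid/state/query column names); I mention them in case someone has something better, along with setting log_min_duration_statement to catch the slow statements:

```sql
-- What is running (or waiting) right now, longest first?
SELECT pid, now() - query_start AS runtime, waiting, state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY query_start;
```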

Any help would be appreciated.
Best regards,
Moreno.


Jeison Bedoya Delgado | 21 Jul 04:59 2015

hyperthreadin low performance

hi everyone,

Recently I moved a database to a machine with RHEL7, but I see that performance is better if hyperthreading is deactivated and only 32 cores are used.

Is it normal that the machine performs better with 32 cores than with 64?

DB: postgresql 9.3.5
Machine: Dell PE R820
Processor: 4x Intel(R) Xeon(R) CPU E5-4620 v2 @ 2.60GHz, eight-core
RAM: 128GB
Storage: SSD SAN

Thanks for your help

-- 
Atentamente,

JEISON BEDOYA DELGADO
ADM.Servidores y comunicaciones
AUDIFARMA S.A.


Robert DiFalco | 15 Jul 18:16 2015

Insert vs Update

First off, I apologize if this question has been beaten to death. I've looked around for a simple answer and could not find one.

Given a database that will not have its PKEY or indices modified, is it generally faster to INSERT or UPDATE data? And if there is a performance difference, is it substantial?

I have a situation where I can easily do one or the other to the same effect. For example, I have a journaling schema with a limited number of "states" for an "entry". Currently each state is its own table, so I just insert rows as states occur. But I could easily have a single "entry" table where the row is updated with column information for each state (after the entry's initial insertion).
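To make the two options concrete (a simplified sketch; the real tables have more columns, and these names are invented):

```sql
-- Option A: one insert-only table per state.
CREATE TABLE entry_submitted (entry_id bigint PRIMARY KEY, occurred_at timestamptz);
CREATE TABLE entry_approved  (entry_id bigint PRIMARY KEY, occurred_at timestamptz);

-- Option B: one row per entry; later states arrive as UPDATEs.
CREATE TABLE entry (
    id           bigint PRIMARY KEY,
    submitted_at timestamptz,   -- set on the initial INSERT
    approved_at  timestamptz    -- filled in later by UPDATE
);
```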

Not a big deal but since it's so easy for me to take either approach I was wondering if one was more efficient (for a large DB) than another. 

Thanks!
