Tory M Blue | 26 Aug 21:14 2015

Index creation running now for 14 hours

I'm running 9.3.4 with slon 2.2.3. I did a drop/add last night at 9pm; it started this particular table's index creation at 10:16pm, and it's still running. A single core is at 100% (32-core box) and there is almost zero I/O activity.

CentOS 6.6


 16398 | clsdb | 25765 |       10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 | | 45712 | 2015-08-25 21:12:01.619819-07 | 2015-08-25 21:22:08.68766-07 | 2015-08-25 22:16:03.10099-07 | 2015-08-25 22:16:03.100992-07 | f | active |
 select "_cls".finishTableAfterCopy(143); analyze "torque"."impressions";

I was wondering whether there are any tools to see how it's progressing, or anything I can do to improve performance mid-creation. There's not much I can do without stopping Postgres or slon, but that would put me back at square one.
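A minimal sketch for watching the backend from another session (PID 25765 is taken from the pg_stat_activity row above; 9.3 has no index-build progress view, so elapsed time and the waiting flag are about all the server exposes):

-- run from a second session; "waiting" shows lock waits, runtime the elapsed time
SELECT pid, state, waiting, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE pid = 25765;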


Thanks

Tory



iostat: sdb is the device holding the db directory


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.55    0.00    0.23    0.00    0.00   96.22


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.00         0.00        12.00          0         24

sdb               0.00         0.00         0.00          0          0


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.57    0.00    0.06    0.00    0.00   96.37


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb              21.50         0.00     15484.00          0      30968


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.72    0.00    0.06    0.00    0.00   96.22


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               2.00         0.00        20.00          0         40

sdb               0.00         0.00         0.00          0          0


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           4.06    0.00    0.05    0.02    0.00   95.87


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               4.00         0.00        64.00          0        128

sdb               3.50         0.00       108.00          0        216


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.36    0.00    0.03    0.00    0.00   96.61


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb               0.00         0.00         0.00          0          0


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.41    0.00    0.06    0.00    0.00   96.53


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb               0.00         0.00         0.00          0          0


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.45    0.00    0.27    0.00    0.00   96.28


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb               1.00         0.00        24.00          0         48


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           3.50    0.00    0.30    0.00    0.00   96.20


Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.50         0.00       344.00          0        688

sdb               0.00         0.00         0.00          0          0


Henrik Thostrup Jensen | 25 Aug 15:11 2015

Gist indexing performance with cidr types

Hi

I'm trying to get a query to run fast enough for interactive use. I've gotten
some speed-up, but it's still not there. It is for a tool called IRRExplorer
(http://irrexplorer.nlnog.net/) that correlates IP routes between Internet
Route Registries and real-world routing information. We landed on PostgreSQL
largely because of GiST index support for the cidr type.

* Preliminaries:

The query is about getting data from this table:

irrexplorer=> \d routes
       Table "public.routes"
   Column   |  Type   | Modifiers 
-----------+---------+-----------
  route     | cidr    | not null
  asn       | bigint  | not null
  source_id | integer | not null
Indexes:
     "unique_entry" UNIQUE CONSTRAINT, btree (route, asn, source_id)
     "route_gist" gist (route inet_ops)
Check constraints:
     "positive_asn" CHECK (asn > 0)
Foreign-key constraints:
     "routes_source_id_fkey" FOREIGN KEY (source_id) REFERENCES sources(id)

Complete DDL: https://github.com/job/irrexplorer/blob/master/data/schema.sql

Data set: 125 MB on disk, 2.3 million rows.

Running stock PostgreSQL 9.4.4 on Ubuntu 14.10 (hosted on VMWare on OS X)

I have done VACUUM and ANALYZE, checked memory settings, and tried increasing
work_mem, but with no success. The issue seems CPU-bound (100% CPU load during
the query).

* Query

When a user inputs an AS number, a simple match on the asn column will return
the relevant rows. However, the interesting thing to display is
conflicting/rogue routes. This means matching routes with the && operator to
find all covered/covering routes, which looks something like this:

SELECT rv.route, rv.asn, rv.source
FROM routes_view rv
LEFT OUTER JOIN routes_view r ON (rv.route && r.route)
WHERE rv.route && r.route AND r.asn = %s

This is fairly fast when the initial set of routes is relatively small (<100),
running in a second or so, but when the number of routes matching the asn is
large (>1000) it takes quite a while (30+ seconds). EXPLAIN ANALYZE link:

http://explain.depesz.com/s/dHqo

I am not terribly good at reading the output, but it seems most of the time is
actually spent on the bitmap scan for the GiST index. Is there another type of
indexing that would behave better here?
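One session-local way to test whether the bitmap machinery itself is the cost (a sketch; %s stands for the AS number, as in the query above):

SET enable_bitmapscan = off;
EXPLAIN ANALYZE
SELECT rv.route, rv.asn, rv.source
FROM routes_view rv
LEFT OUTER JOIN routes_view r ON (rv.route && r.route)
WHERE rv.route && r.route AND r.asn = %s;
RESET enable_bitmapscan;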

Since there are often identical routes in the initial set of routes (from the
AS number matching), I tried reducing the initial set of matching routes:

SELECT rv.route, rv.asn, rv.source
FROM
     (SELECT DISTINCT route FROM routes_view WHERE asn = %s) r
LEFT OUTER JOIN routes_view rv ON (r.route && rv.route)
ORDER BY rv.route;

This often cuts the set of initial matching routes by 25-50%, and cuts a
similar amount of time from the query time. Explain analyze link:

http://explain.depesz.com/s/kf13

I tried further reducing the routes to the minimal set:

WITH distinct_routes AS (SELECT DISTINCT route FROM routes_view WHERE asn = %s)
SELECT route FROM distinct_routes
EXCEPT
SELECT r1.route
FROM distinct_routes r1
     INNER JOIN distinct_routes r2 ON (r1.route << r2.route);

But this typically yields only a 10-20% reduction of the initial route set, and
adds query complexity (putting the above in a CTE/WITH seems to make the
query significantly slower for some reason).

The main issue seems to be the bitmap scan on the GiST index. Is there a better
way to approach this style of query?

     Best regards, Henrik


Johann Spies | 25 Aug 09:22 2015

Long running query: How to monitor the progress

Working with 9.4.

We are in the process of unpacking complicated XML data into tables. The XML data is already in a table with two fields (id, xml) - 47+ million records.

Some of our queries to extract the data and insert it into other tables run for days, and in one case we have created a table with 758 million unique records.

Now my question. Is there a way to monitor the progress of a long running query like this?
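One rough way to watch it, a sketch (table_a_link is the insert target shown below; growth in its on-disk size is only an approximate progress proxy):

-- poll this from another session
SELECT pg_size_pretty(pg_total_relation_size('table_a_link'));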

I have recently read that for processes like this it is probably better to copy the result of the query to a CSV file and then import it again with COPY rather than running a direct INSERT. Next time I will try that.
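A minimal sketch of that round-trip (the file path is a placeholder, and staging_results stands in for the real extraction query):

COPY (SELECT uid, gn_id FROM staging_results) TO '/tmp/extract.csv' WITH (FORMAT csv);
COPY table_a_link (uid, gn_id) FROM '/tmp/extract.csv' WITH (FORMAT csv);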

The following query has been running for 6 days now and is still going (I have anonymized it a little bit) on a server with 768 GB RAM. It has created 44 temporary files so far:

INSERT INTO table_a_link(uid,gn_id)

WITH p AS
  (SELECT ARRAY[ARRAY['t','some_xpath']] AS some_xpath),
         q AS
  (SELECT (xpath('//t:UID/text()',xml,some_xpath))[1] uid,
    unnest(xpath('//t:grant',xml,some_xpath)) AS gr
   FROM source.xml_data a,
        p
        WHERE xpath_exists('//t:grant/t:grant_agency', xml ,some_xpath)),
        r AS
  (
  SELECT
    CASE WHEN
        xpath_exists('//t:grant_ids', gr, some_xpath)
    THEN
        unnest(xpath('//t:grant_ids', gr, some_xpath))
    ELSE
        NULL
    END
        AS GR_ids
   FROM q,
        p ) ,
   y as (SELECT A.UUID AS FO_ID,
/*      unnest(xpath('//t:grant_agency/text()',GR,ns))::citext AS agency,
*/      CASE WHEN
        xpath_exists('//t:grant_id', gr_ids, some_xpath)
    THEN
        unnest(xpath('//t:grant_id/text()', gr_ids, some_xpath))::citext
    ELSE
        NULL
    END
    grant_NO,
    uid::varchar(19)
    from WOS.FUNDING_ORG A, p,q
      left join r on (xpath('//t:grant/t:grant_ids/t:grant_id/text()',gr,
      ARRAY[ARRAY['t','some_xpath']])::citext =
        xpath('//t:grant_id/text()',GR_IDS,ARRAY[ARRAY['t','some_xpath']])::citext)
        WHERE A.FUNDING_ORG = (xpath('//t:grant_agency/text()',GR,some_xpath))[1]::CITEXT
        )

 select distinct y.uid, B.uuid gn_id
 
 from y, table_b B
 where
    y.fo_id = B.fo_id
    and
    y.grant_no is not distinct from b.grant_no


Regards.
Johann
Jeff Janes | 24 Aug 19:04 2015

Re: query not using GIN index

On Mon, Aug 24, 2015 at 8:18 AM, Guo, Yun <YGuo <at> cvent.com> wrote:


From: Jeff Janes <jeff.janes <at> gmail.com>
Date: Friday, August 21, 2015 at 10:44 PM
To: Yun <yguo <at> cvent.com>
Subject: Re: [PERFORM] query not using GIN index

On Fri, Aug 21, 2015 at 6:55 PM, Guo, Yun <YGuo <at> cvent.com> wrote:
Hi,

We have a query on a column with a GIN index, but the query plan chooses not to use the index and does a seq scan instead, which is much slower.

CREATE INDEX idx_access_grants_on_access_tokens ON access_grants USING gin (access_tokens); 

explain analyze SELECT "access_grants".* FROM "access_grants"  WHERE (access_tokens @> ARRAY['124e5a1f9de325fc176a7c89152ac734']) ;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.46 rows=1 width=157) (actual time=260.376..260.377 rows=1 loops=1)
   ->  Seq Scan on access_grants  (cost=0.00..29718.03 rows=3985 width=157) (actual time=260.373..260.373 rows=1 loops=1)
         Filter: (access_tokens @> '{124e5a1f9de325fc176a7c89152ac734}'::text[])
         Rows Removed by Filter: 796818
 Total runtime: 260.408 ms


What version are you running?  What are your non-default configuration settings (particularly for the *_cost parameters)?

 select name,setting from pg_settings where name like '%cost';
         name         | setting
----------------------+---------
 cpu_index_tuple_cost | 0.005
 cpu_operator_cost    | 0.0025
 cpu_tuple_cost       | 0.01
 random_page_cost     | 4
 seq_page_cost        | 1


OK, thanks.  I had overlooked the "LIMIT" in the first plan you posted, and so thought you must have some pretty weird settings.  But noticing the LIMIT, it makes more sense with the normal settings, like the ones you show.
 

Can you turn track_io_timing on and then report an explain (analyze, buffers) of the same query?

I didn’t try this as our prod instance is on AWS and setting this would require a reboot.

OK, but you can still do an "explain (analyze,buffers)".  It is less useful than with track_io_timing on, but it is still more useful than just "explain analyze".
 

Then do a "set enable_seqscan=off" and repeat.
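Spelled out, that experiment would look like this (a sketch; the setting is session-local, so other connections are unaffected):

SET enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT "access_grants".* FROM "access_grants"
WHERE access_tokens @> ARRAY['124e5a1f9de325fc176a7c89152ac734'];
RESET enable_seqscan;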

This is the life saver!  After applying this, it’s able to use the index.  But should we consider it as the permanent solution?

No, probably not a permanent solution.  Or at least, I only do things like that in production as a last resort.  I suggested doing that so you can force it to use the index and so see what the explain (analyze,buffers) look like when it does use the index.  Sorry for not being more clear.

The seq scan thinks it is going to find a matching row pretty early in the scan and can stop at the first one, but based on "Rows Removed by Filter: 796818" it isn't actually finding a match until the end.  There probably isn't much you can do about this, other than not using a LIMIT.

The reason it thinks it will find a row soon is that it thinks 0.5% of the rows meet your criteria.  That is the default selectivity estimate it uses when it has nothing better to use.  Raising the statistics target on the column might help.  But I doubt it, because access tokens are probably nearly unique, and so even the highest possible setting for statistics target is not going to get it to record MCE statistics.  See https://commitfest.postgresql.org/6/323/ for a possible solution, but any fix for that won't be released to production for a long time.
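For reference, raising the per-column target looks like this (10000 is the maximum):

ALTER TABLE access_grants ALTER COLUMN access_tokens SET STATISTICS 10000;
ANALYZE access_grants;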


If your GIN index has a large pending list, that will make the index scan look very expensive.  Vacuuming the table will clear that up.  Setting fastupdate off for the index will prevent it growing again.  Based on your description of most lists having 0 or 1 element in them, and my assumption that a table named "access_grants" isn't getting updated hundreds of times a second, I don't think fastupdate being off is going to cause any problems at all.
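Concretely, a sketch of clearing the pending list and keeping it from growing back:

VACUUM access_grants;
ALTER INDEX idx_access_grants_on_access_tokens SET (fastupdate = off);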

Cheers,

Jeff
bhuvan Mitra | 24 Aug 09:04 2015

problem with select *

Hello,

I have a table with 12 columns and 20 million rows. Writing to the table causes no problems, but I face some issues when reading it. When I perform a 'select * from table limit 14000000;' (selecting 14 million rows), it works fine. If the limit value is 15000000, it throws an 'out of memory' error.

If the query is 'select * from table', the process gets killed with the message 'killed'.

Kindly tell me where it is going wrong. I have a 6MB cache, a 1.6GHz CPU, Linux 14.04, and 8GB RAM.
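One likely culprit is the client buffering the entire result set in memory. A minimal sketch of reading in batches through a server-side cursor instead (the table name is a stand-in):

BEGIN;
DECLARE big_read CURSOR FOR SELECT * FROM my_big_table;
FETCH FORWARD 100000 FROM big_read;  -- repeat until no rows come back
CLOSE big_read;
COMMIT;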

Thanks.
Guo, Yun | 22 Aug 03:55 2015

query not using GIN index

Hi,

We have a query on a column with a GIN index, but the query plan chooses not to use the index and does a seq scan instead, which is much slower.

CREATE INDEX idx_access_grants_on_access_tokens ON access_grants USING gin (access_tokens); 

explain analyze SELECT "access_grants".* FROM "access_grants"  WHERE (access_tokens @> ARRAY['124e5a1f9de325fc176a7c89152ac734']) ;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.46 rows=1 width=157) (actual time=260.376..260.377 rows=1 loops=1)
   ->  Seq Scan on access_grants  (cost=0.00..29718.03 rows=3985 width=157) (actual time=260.373..260.373 rows=1 loops=1)
         Filter: (access_tokens @> '{124e5a1f9de325fc176a7c89152ac734}'::text[])
         Rows Removed by Filter: 796818
 Total runtime: 260.408 ms


We tested on a smaller table in the development region and it chose to use the index there. However, on the production-size table it decides to ignore the index for unknown reasons.

Is the large number of tuples skewing the query planner's decision, or is the index itself larger than the table, so it decides to do a table scan?

Any suggestions are greatly appreciated!

Yun
Genc, Ömer | 21 Aug 14:48 2015

Performance bottleneck due to array manipulation

Hey,

 

I have a very long-running stored procedure, due to array manipulation inside it. The following procedure takes 13 seconds to finish.

 

BEGIN
    point_ids_older_than_one_hour := '{}';
    object_ids_to_be_invalidated := '{}';

    SELECT ARRAY(SELECT point_id
                 FROM ONLY public.ims_point AS p
                 WHERE p.timestamp < m_before_one_hour)
    INTO point_ids_older_than_one_hour;   -- this array has a size of 20k

    SELECT ARRAY(SELECT object_id
                 FROM public.ims_object_header h
                 WHERE h.last_point_id = ANY(point_ids_older_than_one_hour))
    INTO object_ids_to_be_invalidated;    -- this array has a size of 100

    -- current_last_point_ids will have a size of 100k
    current_last_point_ids := ARRAY(SELECT last_point_id
                                    FROM public.ims_object_header h);

    -- START OF PERFORMANCE BOTTLENECK
    IF array_length(current_last_point_ids, 1) > 0
    THEN
        FOR i IN 0 .. array_upper(current_last_point_ids, 1)
        LOOP
            point_ids_older_than_one_hour = array_remove(point_ids_older_than_one_hour, current_last_point_ids[i]::bigint);
        END LOOP;
    END IF;
    -- END OF PERFORMANCE BOTTLENECK
END;

 

The array manipulation part is the performance bottleneck. I am pretty sure there is a better way of doing this; however, I couldn't find one.

What I have are two tables; let's call them ims_point and ims_object_header. ims_object_header references some entries of ims_point in the column last_point_id.

Now I want to delete all entries from ims_point where the timestamp is older than one hour. The ids currently referenced by ims_object_header should be excluded from this deletion. Therefore I stored the ids in arrays and iterate over those arrays to exclude the referenced values from being deleted.

 

However, I am not sure that using an array for an operation like this is the best approach.
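For comparison, a set-based sketch that avoids the arrays entirely (run inside the same procedure so m_before_one_hour resolves, and assuming point_id is the column of ims_point that last_point_id references):

DELETE FROM ONLY public.ims_point p
WHERE p.timestamp < m_before_one_hour
  AND NOT EXISTS (SELECT 1
                  FROM public.ims_object_header h
                  WHERE h.last_point_id = p.point_id);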

 

Can anyone give me some advice on how this could be improved?

 

Thanks in advance.

 

Stephane Bailliez | 21 Aug 02:03 2015

Most efficient way of querying M 'related' tables where N out of M may contain the key

Pretty bad subject description... but let me try to explain. 


I'm trying to figure out what would be the most efficient way to query data from multiple tables using a foreign key.

Right now the table design is such that I have multiple tables that share some common information and some specific information (in the OO world we could see them as derived tables). For the sake of simplicity let's assume there are only 5:

table type1(int id, varchar(24) reference_id, ....specific columns)
table type2(int id, varchar(24) reference_id, ....specific columns)
table type3(int id, varchar(24) reference_id, ....specific columns)
table type4(int id, varchar(24) reference_id, ....specific columns)
table type5(int id, varchar(24) reference_id, ....specific columns)

NB: you could imagine those 5 tables inheriting from a base_type table that shares a few attributes.

I have a list of reference ids; those reference ids could be in any of those 5 tables, but I don't know which one.

I want to retrieve the data from the N out of 5 relevant tables as efficiently as possible, but still want to query those 5 tables individually (for ORM simplicity reasons).

So the idea is first to identify which tables I should query for.

The naive implementation would be to always query all 5 tables for the list of reference ids; however, 90% of the time the reference ids are stored in one single table, so 4/5ths of the queries would be irrelevant.

what I initially came up with was doing a union of the tables such as:


SELECT 'type1', id FROM type1 WHERE reference_id IN (....)
UNION
SELECT 'type2', id FROM type2 WHERE reference_id IN (....)
UNION
...
SELECT 'type5', id FROM type5 WHERE reference_id IN (....)

then effectively figuring out which reference ids are in type1, type2, type3, etc.,

and then issuing the right select to the tables for the related reference ids.

which means in the best-case scenario I would only do 2 queries instead of 5:
1 to retrieve the list of reference ids per 'type'
1 to retrieve the list of types with the corresponding reference ids

I'm trying to figure out if there is a more efficient way to retrieve this information than doing a union across all tables (there can be a couple hundred reference ids to query for in the list).

I was thinking that, worst-case scenario, I could maintain this information in another table via triggers (sketched below) to avoid doing this union, but that seems a bit of a hammer solution initially, and I am wondering if there is not something simpler via joins that could be more performant.
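A hypothetical sketch of that trigger-maintained table (all names are made up, and deduplication of repeated reference ids is left out):

CREATE TABLE reference_location (
    reference_id varchar(24) NOT NULL,
    type_name    text        NOT NULL
);
CREATE INDEX reference_location_ref_idx ON reference_location (reference_id);

CREATE FUNCTION track_reference() RETURNS trigger AS $$
BEGIN
    INSERT INTO reference_location (reference_id, type_name)
    VALUES (NEW.reference_id, TG_TABLE_NAME);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER type1_track AFTER INSERT ON type1
    FOR EACH ROW EXECUTE PROCEDURE track_reference();
-- ...and likewise for type2 through type5

With that in place, finding which tables hold a batch of reference ids becomes a single indexed lookup against reference_location instead of a 5-way union.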

Thanks for any suggestions.




Josh Berkus | 14 Aug 18:34 2015

Re: Strange query stalls on replica in 9.3.9

On 08/13/2015 01:24 PM, Kevin Grittner wrote:
>> Thing is, the update all rows only takes 2.5 seconds to execute on the
>> master. So even if the update is blocking the seq scans on the replica
>> (and I can't see why it would), it should only block them for < 3 seconds.
> 
> Visibility hinting and/or hot pruning?

Unlikely; I can VACUUM FULL the entire database in 30 seconds.  This
database is small.  Jeff's answer seems more likely ...

On 08/13/2015 01:59 PM, Jeff Janes wrote:

> Once the commit of the whole-table update has replayed, the problem
> should go away instantly, because at that point each backend doing the
> seqscan will find that the transaction has committed and so will set the
> hint bit, which means all of the other seqscan backends that come after it
> can skip the proc array scan for that tuple.

Yes ... and given that the commit on the master took < 3 seconds, it's
not likely to take 30 seconds on the replica.  That aside, the pattern
of behavior does look similar to the planner issue.

> So perhaps the commit of the whole-table update is delayed because the
> startup process is also getting bogged down on the same contended lock?
> I don't know how hard WAL replay hits the proc array lock.

I don't know; we don't have any visibility into the replay process, and
no way to tell if replay is waiting on some kind of lock.  A regular
UPDATE should not block against any select activity on the replica, though.

Also, why would this affect *only* the query which does seq scans?  Is
there some difference between seqscan and index scan here, or is it
simply because they take longer, and since this issue is timing-based,
they're more likely to be hit?  Significantly, the seqscan query is also
the most complex query run against the replica, so maybe the seqscan is
irrelevant and it's being affected by planner issues?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


林士博 | 14 Aug 08:09 2015

Re: Slow Query

OK.

If you benchmarked that correctly,
then it seems that adding some redundant search conditions can make the query faster in some special cases.

And without further info, I cannot see any reason why.

2015-08-14 14:35 GMT+09:00 Robert Campbell <robcampbell73 <at> gmail.com>:
Hi,

My mistake, I didn't apply the subquery properly the first time.

It does return records, but not quite as fast as the original query - about 200ms slower.

Vacancy ID is a primary key.

On Fri, Aug 14, 2015 at 3:10 PM, 林士博 <lin <at> repica.co.jp> wrote:
Is the "Vacancy"."ID" a primary key?
Or is unique in Vacancy table?







Josh Berkus | 13 Aug 19:09 2015

Strange query stalls on replica in 9.3.9

Setup:

* PostgreSQL 9.3.9
* 1 master, 1 replica
* Tiny database, under 0.5GB, completely cached in shared_buffers
* 90% read query traffic, which is handled by replica
* Traffic in the 1000's QPS.

The weirdness:

Periodically the master runs an "update all rows" query on the main
table in the database.  When this update hits the replica via
replication stream, *some* (about 5%) of the queries which do seq scans
will stall for 22 to 32 seconds (these queries normally take about
75ms).  Queries which do index scans seem not to be affected.

Thing is, the update all rows only takes 2.5 seconds to execute on the
master. So even if the update is blocking the seq scans on the replica
(and I can't see why it would), it should only block them for < 3 seconds.

Anyone seen anything like this?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


