Christian Weyer | 31 Jan 17:00 2015

Unexpected (bad) performance when querying indexed JSONB column

Hi all,

The pg version in question is the latest 9.4, running on Windows.

For testing out the NoSQL features of pg I have a simple table called 'articles' with a column called 'data'.

There is an index on 'data' like this:
  CREATE INDEX idx_data ON articles USING gin (data jsonb_path_ops);

The current test data set has 32570 entries of JSON docs like this:
{
"title": "Foo Bar",
"locked": true, 
"valid_until": "2049-12-31T00:00:00", 
"art_number": 12345678, 
"valid_since": "2013-10-05T00:00:00", 
"number_valid": false, 
"combinations": {
"var1": "4711", 
"var2": "4711", 
"var3": "0815", 
"int_art_number": "000001"
}
}

Nothing too complex, I think.

When I run a simple query:
  SELECT data #>> '{"title"}'
  FROM articles
  WHERE data @> '{ "locked": true }';

Reproducibly, it takes approx. 900 ms to get the results back.
Honestly, I was expecting a much faster query.
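
For reference, the kind of check that usually shows where the time goes (a minimal sketch, assuming only the table and index above) is to run the query under EXPLAIN:

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT data #>> '{"title"}'
  FROM articles
  WHERE data @> '{ "locked": true }';

If the plan shows a Seq Scan instead of a Bitmap Index Scan on idx_data, the index is not being used for this predicate at all; if the bitmap scan does appear and a large fraction of the documents have "locked": true, most of the time is going into fetching and detoasting the matching rows rather than into the index itself.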

Any opinions on this?

Thanks,
-C.

Slava Mudry | 31 Jan 00:44 2015

why pg_class.relfrozenxid needs to be updated for frozen tables (where all rows have xmin=2)?

Hi,
I have to deal with a badly written system which regularly suffers from transaction wraparound issues. This issue is happening every 10-14 days and forces me to take the system offline and vacuum in single-user mode.
Main causes for this are (afaik):
1) heavy transaction traffic + 100+GB of stale tables
2) slow i/o (rotational drives)
3) autovacuum can't keep up.

Basically the database stores event data in a daily partitioned table, "daily_events".
What I did was run VACUUM FREEZE on all partitions (the tables are never touched after they're done for a day). I have also scheduled a VACUUM FREEZE for each partition after it's done writing.

This essentially set xmin in each partition to the "frozen" value of 2.
However, to my surprise, this was not enough!
Postgres stores relfrozenxid in pg_class, and this value apparently gets old pretty fast (due to the high volume of transactions).
And it seems that it doesn't really matter that xmin is frozen for a table; relfrozenxid is what causes the transaction wraparound.

Why is that? And most importantly, why does updating pg_class.relfrozenxid require a huge amount of I/O by the vacuum process for tables that are never updated?

Is it safe to just update pg_class.relfrozenxid for tables where xmin=2 for all rows? And the same for the linked TOAST table?
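
As a quick illustration (a sketch using only the system catalogs), the XID age that actually drives the wraparound limit can be listed per relation, TOAST tables included:

  SELECT c.oid::regclass AS relation,
         c.relkind,
         age(c.relfrozenxid) AS xid_age
  FROM pg_class c
  WHERE c.relkind IN ('r', 't')
  ORDER BY age(c.relfrozenxid) DESC
  LIMIT 20;

Whichever relations show the highest age here are the ones autovacuum will eventually force a wraparound scan on, regardless of how the tuples' xmin values look.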
Thank you.

Josh Berkus | 30 Jan 21:26 2015

Re: working around JSONB's lack of stats?

On 01/28/2015 03:50 PM, Peter Geoghegan wrote:
> On Wed, Jan 28, 2015 at 3:42 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> jsonb_col @> '[ "key1" ]'
>> or jsonb_col ? 'key1'
>>         if in MCE, assign % from MCE
>>         otherwise assign 1% of non-MCE %
>>
>> jsonb_col @> '{ "key1": "value1" }'
>>         if in MCE, assign MCE% * 0.1
>>         otherwise assign 0.01 of non-MCE %
>>
>> Does that make sense?
> 
> I suspect it makes a lot less sense. The way people seem to want to
> use jsonb is as a document store with a bit of flexibility. Individual
> JSON documents tend to be fairly homogeneous in structure within a
> table, just like with systems like MongoDB. Strings within arrays are
> keys for our purposes, and these are often used for tags and so on.
> But Strings that are the key of an object/pair are much less useful to
> index, in my estimation.

Yeah, I see your point; except for arrays, people are usually searching
for a key:value pair, and the existence of the key is not in doubt.

That would make the "element" the key:value pair, no?  But
realistically, we would only want to do that for simple keys and values.

Although: if you "flatten" a nested JSON structure into just keys with
scalar values (and array items as their own thing), then you could have
a series of expanded key:value pairs to put into MCE.

For example:

{ "house": { "city": "San Francisco",
             "sqft": 1200,
             "color": "blue",
             "occupants": [ "mom", "dad", "child1" ]
           },
  "occupation": "programmer"
}

... would get flattened out into the following pairs:

city: san francisco
sqft: 1200
color: blue
occupants: [ mom ]
occupants: [ dad ]
occupants: [ child1 ]
occupation: programmer

This would probably work because there aren't a lot of data structures
where people would have the same key:value pair in different locations
in the JSON, and care about it stats-wise.  Alternately, if the same
key-value pair appears multiple times in the same sample row, we could
cut the MC% by that multiple.

No?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Josh Berkus | 29 Jan 00:42 2015

Re: working around JSONB's lack of stats?

On 01/28/2015 03:34 PM, Peter Geoghegan wrote:
> On Wed, Jan 28, 2015 at 3:03 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> We already have most_common_elem (MCE) for arrays and tsearch.  What if
>> we put JSONB's most common top-level keys (or array elements, depending)
>> in the MCE array?  Then we could still apply a simple rule for any path
>> criteria below the top-level keys, say assuming that any sub-key
>> criteria would match 10% of the time.  While it wouldn't be perfect, it
>> would be better than what we have now.
> 
> Well, the "top-level keys" would still be gathered for expression
> indexes. So yeah, maybe it would work alright for arrays of "tags",
> and things like that. I tend to think that that's a common enough
> use-case.

Yah, and even for cases where people have nested structures, currently
we require @> to start at the top.  So we can at least compare top-level
keys to see if the key returned is in the MCEs or not, and take action
accordingly.

We could start with a constant for anything below the key, where we
assume that all values show up 10% of the time.

thus:

jsonb_col @> '[ "key1" ]'
or jsonb_col ? 'key1'
	if in MCE, assign % from MCE
	otherwise assign 1% of non-MCE %

jsonb_col @> '{ "key1": "value1" }'
	if in MCE, assign MCE% * 0.1
	otherwise assign 0.01 of non-MCE %

Does that make sense?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Josh Berkus | 29 Jan 00:03 2015

Re: working around JSONB's lack of stats?

On 01/28/2015 11:48 AM, Tomas Vondra wrote:
> On 27.1.2015 08:06, Josh Berkus wrote:
>> Folks,
>>
> ...
>>
>> On a normal column, I'd raise n_distinct to reflect the higher
>> selectivity of the search terms.  However, since @> uses contsel,
>> n_distinct is ignored.  Anyone know a clever workaround I don't
>> currently see?
> 
> I don't see any reasonable workaround :-(
> 
> ISTM we'll have to invent a way to collect useful stats about contents
> of JSON/JSONB documents. JSONB is cool, but at the moment we're mostly
> relying on defaults that may be reasonable, but still misfire in many
> cases. Do we have any ideas of how that might work?
> 
> We're already collecting stats about contents of arrays, and maybe we
> could do something similar for JSONB? The nested nature of JSON makes
> that rather incompatible with the flat MCV/histogram stats, though.

Well, I was thinking about this.

We already have most_common_elem (MCE) for arrays and tsearch.  What if
we put JSONB's most common top-level keys (or array elements, depending)
in the MCE array?  Then we could still apply a simple rule for any path
criteria below the top-level keys, say assuming that any sub-key
criteria would match 10% of the time.  While it wouldn't be perfect, it
would be better than what we have now.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Josh Berkus | 27 Jan 08:06 2015

working around JSONB's lack of stats?

Folks,

Currently, JSONB fields don't have statistics, and estimate a flat 1%
selectivity.  This can result in poor query plans, and I'm wondering if
anyone has a suggested workaround for this short of hacking a new
selectivity function.  For example, take the common case of using JSONB
to hold a list of "tags" for tagging documents:

 Table "public.doc_tags_json"
 Column |  Type   | Modifiers
--------+---------+-----------
 doc_id | integer |
 tags   | jsonb   |
Indexes:
    "doc_tags_json_doc_id_idx" UNIQUE, btree (doc_id)
    "doc_tags_json_tags_idx" gin (tags)

This query:

select doc_id
from doc_tags_json
where tags @> '[ "math", "physics" ]'
order by doc_id desc limit 25;

Uses this plan:

                                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..709.79 rows=25 width=4) (actual time=24.529..340.499 rows=25 loops=1)
   ->  Index Scan Backward using doc_tags_json_doc_id_idx on doc_tags_json  (cost=0.43..283740.95 rows=10000 width=4) (actual time=24.528..340.483 rows=25 loops=1)
         Filter: (tags @> '["math", "physics"]'::jsonb)
         Rows Removed by Filter: 1011878
 Planning time: 0.090 ms
 Execution time: 340.528 ms

It does this because it expects @> '["math", "physics"]' to match 10,000
rows, which means that it expects to only scan 25,000 entries in the
doc_id index to return the top 25.  However, the matching condition is
much rarer than it thinks, so it's actually far faster to use the index
on the JSONB column:

drop index doc_tags_json_doc_id_idx;

                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10517.08..10517.14 rows=25 width=4) (actual time=7.594..7.602 rows=25 loops=1)
   ->  Sort  (cost=10517.08..10542.08 rows=10000 width=4) (actual time=7.593..7.596 rows=25 loops=1)
         Sort Key: doc_id
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Bitmap Heap Scan on doc_tags_json  (cost=92.90..10234.89 rows=10000 width=4) (actual time=6.733..7.475 rows=257 loops=1)
               Recheck Cond: (tags @> '["math", "physics"]'::jsonb)
               Heap Blocks: exact=256
               ->  Bitmap Index Scan on doc_tags_json_tags_idx  (cost=0.00..90.40 rows=10000 width=0) (actual time=6.695..6.695 rows=257 loops=1)
                     Index Cond: (tags @> '["math", "physics"]'::jsonb)
 Planning time: 0.093 ms
 Execution time: 7.632 ms

On a normal column, I'd raise n_distinct to reflect the higher
selectivity of the search terms.  However, since @> uses contsel,
n_distinct is ignored.  Anyone know a clever workaround I don't
currently see?
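
One workaround that may be worth trying (a sketch, leaning on the fact that a CTE is an optimization fence as of 9.4) is to force the containment search to be evaluated first, so the 10,000-row misestimate can no longer push the planner onto the backward doc_id index scan:

-- Sketch: the CTE cannot be flattened into the outer query, so the tag
-- search runs first (bitmap scan on the GIN index) and only its small
-- result gets sorted and limited.
WITH matches AS (
    SELECT doc_id
    FROM doc_tags_json
    WHERE tags @> '["math", "physics"]'
)
SELECT doc_id
FROM matches
ORDER BY doc_id DESC
LIMIT 25;

An OFFSET 0 subquery is the other commonly used fence; both give up the chance of a genuinely fast top-N index scan in exchange for a predictable plan.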

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Christian Roche | 26 Jan 17:32 2015

Why is PostgreSQL not using my index?

Hi guys,

Can I take a jab at the celebrated "why is Postgres not using my index" riddle?

I'm using PostgreSQL 9.3.3 on an Amazon RDS "db.r3.xlarge" 64-bit instance. I have two tables, one with about 30M rows and two indexes (in fact a monthly partition):

CREATE TABLE staging.mixpanel_events_201409 (
  date_day          date NOT NULL,
  event_id          int NOT NULL REFERENCES mixpanel_event_list,
  prop              hstore
);

CREATE INDEX mixpanel_idx_date_201409
  ON mixpanel_events_201409
  USING btree
  (date_day);

CREATE INDEX mixpanel_idx_event_201409
  ON mixpanel_events_201409
  USING btree
  (event_id);

And a lookup table with about 600 rows:

CREATE TABLE staging.mixpanel_event_list (
  id                serial PRIMARY KEY,
  name              text UNIQUE,
  source            event_source NULL
);

Now when I select a subset of the possible event IDs in the big table, PG uses the appropriate index:

select *
  from mixpanel_events_201409
 where event_id in (3, 4, 5, 6, 7, 8, 9, 10, 11, 373, 375, 376, 318);

Bitmap Heap Scan on mixpanel_events_201409  (cost=7663.36..1102862.70 rows=410022 width=949)
  Recheck Cond: (event_id = ANY ('{3,4,5,6,7,8,9,10,11,373,375,376,318}'::integer[]))
  ->  Bitmap Index Scan on mixpanel_idx_event_201409  (cost=0.00..7560.85 rows=410022 width=0)
        Index Cond: (event_id = ANY ('{3,4,5,6,7,8,9,10,11,373,375,376,318}'::integer[]))

But when I try to join the lookup table and select from it, the index is dismissed for a full table scan with a catastrophic effect on performance:

select *
from mixpanel_events_201409 mp
  inner join mixpanel_event_list ev on ( ev.id = mp.event_id )
where ev.id in (3, 4, 5, 6, 7, 8, 9, 10, 11, 373, 375, 376, 318);

Hash Join  (cost=20.73..2892183.32 rows=487288 width=1000)
  Hash Cond: (mp.event_id = ev.id)
  ->  Seq Scan on mixpanel_events_201409 mp  (cost=0.00..2809276.70 rows=20803470 width=949)
  ->  Hash  (cost=20.57..20.57 rows=13 width=51)
        ->  Seq Scan on mixpanel_event_list ev  (cost=0.00..20.57 rows=13 width=51)
              Filter: (id = ANY ('{3,4,5,6,7,8,9,10,11,373,375,376,318}'::integer[]))

Both tables have been vacuum analyzed.

What gives?
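
A couple of quick experiments (a sketch, assuming nothing beyond the schema above) that may narrow this down: first, see what cost the planner puts on the index-driven join once the seq scan is taken away; second, filter on the partition's own event_id column, which is what the index actually covers:

-- Session-local cost comparison only; not a fix.
SET enable_seqscan = off;
EXPLAIN
SELECT *
FROM mixpanel_events_201409 mp
  INNER JOIN mixpanel_event_list ev ON (ev.id = mp.event_id)
WHERE ev.id IN (3, 4, 5, 6, 7, 8, 9, 10, 11, 373, 375, 376, 318);
RESET enable_seqscan;

-- Same join, but the IN list is applied to mp.event_id, so the bitmap
-- index scan on mixpanel_idx_event_201409 is available again.
EXPLAIN
SELECT *
FROM mixpanel_events_201409 mp
  INNER JOIN mixpanel_event_list ev ON (ev.id = mp.event_id)
WHERE mp.event_id IN (3, 4, 5, 6, 7, 8, 9, 10, 11, 373, 375, 376, 318);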

 

Thanks a lot for your help,

Chris

 


Joe Van Dyk | 25 Jan 06:41 2015

Query performance

I have an events table that records page views and purchases (type = 'viewed' or type='purchased'). I have a query that figures out "people who bought/viewed this also bought/viewed that".

It worked fine, taking about 0.1 seconds to complete, until a few hours ago when it started taking hours to complete. Vacuum/analyze didn't help.  Turned out there was one session_id that had 400k rows in the system. Deleting that made the query performant again. 

Is there anything I can do to make the query work better in cases like that? Missing index, or better query?

This is on 9.3.5.

The below is reproduced at the following URL if it's not formatted correctly in the email. https://gist.githubusercontent.com/joevandyk/cb8f4afdb6c1b178c606/raw/9940bbe033ebd56d38caa46e33c1ddfd9df36eda/gistfile1.txt

explain select e1.product_id, e2.site_id, e2.product_id,
       count(nullif(e2.type='viewed', false)) view_count,
       count(nullif(e2.type='purchased', false)) purchase_count
from events e1
join events e2 on e1.session_id = e2.session_id and e1.type = e2.type
where e1.product_id = '82503' and e1.product_id != e2.product_id
group by e1.product_id, e2.product_id, e2.site_id;

                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=828395.67..945838.90 rows=22110 width=19)
   ->  Sort  (cost=828395.67..840117.89 rows=4688885 width=19)
         Sort Key: e1.product_id, e2.product_id, e2.site_id
         ->  Nested Loop  (cost=11.85..20371.14 rows=4688885 width=19)
               ->  Bitmap Heap Scan on events e1  (cost=11.29..1404.31 rows=369 width=49)
                     Recheck Cond: (product_id = '82503'::citext)
                     ->  Bitmap Index Scan on events_product_id_site_id_idx  (cost=0.00..11.20 rows=369 width=0)
                           Index Cond: (product_id = '82503'::citext)
               ->  Index Scan using events_session_id_type_product_id_idx on events e2  (cost=0.56..51.28 rows=12 width=51)
                     Index Cond: ((session_id = e1.session_id) AND (type = e1.type))
                     Filter: (e1.product_id <> product_id)
(11 rows)

recommender_production=> \d events
                                      Table "public.events"
   Column    |           Type           |                      Modifiers
-------------+--------------------------+-----------------------------------------------------
 id          | bigint                   | not null default nextval('events_id_seq'::regclass)
 user_id     | citext                   |
 session_id  | citext                   | not null
 product_id  | citext                   | not null
 site_id     | citext                   | not null
 type        | text                     | not null
 happened_at | timestamp with time zone | not null
 created_at  | timestamp with time zone | not null
Indexes:
    "events_pkey" PRIMARY KEY, btree (id)
    "events_product_id_site_id_idx" btree (product_id, site_id)
    "events_session_id_type_product_id_idx" btree (session_id, type, product_id)
Check constraints:
    "events_session_id_check" CHECK (length(session_id::text) < 255)
    "events_type_check" CHECK (type = ANY (ARRAY['purchased'::text, 'viewed'::text]))
    "events_user_id_check" CHECK (length(user_id::text) < 255)
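
One follow-up check (a sketch using only the schema shown above) is to look for other runaway sessions like the 400k-row one, since that kind of skew is what blew up the nested loop:

SELECT session_id, count(*) AS rows_per_session
FROM events
GROUP BY session_id
ORDER BY count(*) DESC
LIMIT 10;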

AlexK987 | 25 Jan 00:33 2015

How to tell ANALYZE to collect statistics from the whole table?

The documentation states that "The extent of analysis can be controlled by
adjusting the default_statistics_target configuration variable". It looks
like I can tell Postgres to create more histograms with more bins, and more
distinct values. This implicitly means that Postgres will use a larger
random subset to calculate statistics. 

However, this is not what I want. My data may be quite skewed, and I want
full control over the size of the sample. I want to explicitly tell Postgres
to analyze the whole table. How can I accomplish that?
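
For what it's worth, the statistics target can also be raised per column rather than globally; a sketch with hypothetical names my_table / my_column:

-- my_table / my_column are placeholders for the real table and column.
ALTER TABLE my_table ALTER COLUMN my_column SET STATISTICS 10000;
ANALYZE VERBOSE my_table;

ANALYZE samples roughly 300 rows per unit of statistics target, so the maximum target of 10000 means a sample of about 3 million rows; there is no supported switch to make it read the whole table.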



Laurent Cathala | 22 Jan 17:46 2015

Initial insert

Hi, 
I'm trying to create data during an initial import and I'm encountering a performance issue.
I have 2 tables; my process creates a record in each table and executes a sum with a join on these 2 tables (there are other queries, but they are very fast).

My 2 tables are empty before the import.

My count query is:

select sum(quantitest0_.quantite_valeur) as col_0_0_
from dm5_quantitestock quantitest0_, dm5_caracteristiquearticlestock caracteris1_
where quantitest0_.id_caracteristiquearticlestock=caracteris1_.id
  and caracteris1_.id_article='4028804c4a311178014a346546967c59'

I use a parameterized request.

My process creates only 6000 records in each table.

Over the course of the process this sum query takes longer and longer.

The auto_explain plan shows a seq scan:

----------
  Query Text: select sum(quantitest0_.quantite_valeur) as col_0_0_ from dm5_quantitestock quantitest0_, dm5_caracteristiquearticlestock caracteris1_ where quantitest0_.id_caracteristiquearticlestock=caracteris1_.id and caracteris1_.id_article=$1
        Aggregate  (cost=2.04..2.05 rows=1 width=26) (actual time=862.621..862.621 rows=1 loops=1)
          Output: sum(quantitest0_.quantite_valeur)
          ->  Nested Loop  (cost=0.00..2.04 rows=1 width=26) (actual time=862.618..862.618 rows=0 loops=1)
                Output: quantitest0_.quantite_valeur
                Join Filter: ((quantitest0_.id_caracteristiquearticlestock)::text = (caracteris1_.id)::text)
                Rows Removed by Join Filter: 1869
                ->  Seq Scan on public.dm5_quantitestock quantitest0_  (cost=0.00..1.01 rows=1 width=164) (actual time=0.004..0.408 rows=1869 loops=1)
                      Output: quantitest0_.id, quantitest0_.datefinvalidite, quantitest0_.quantite_valeur, quantitest0_.id_caracteristiquearticlestock, quantitest0_.id_caracteristiquelieustock, quantitest0_.datecreationsysteme, quantitest0_.datemodificationsysteme, quantitest0_.id_creeparsysteme, quantitest0_.id_modifieparsysteme
                ->  Seq Scan on public.dm5_caracteristiquearticlestock caracteris1_  (cost=0.00..1.01 rows=1 width=42) (actual time=0.456..0.456 rows=1 loops=1869)
                      Output: caracteris1_.id, caracteris1_.datefinvalidite, caracteris1_.id_lot, caracteris1_.id_article, caracteris1_.id_numeroserie, caracteris1_.datecreationsysteme, caracteris1_.datemodificationsysteme, caracteris1_.id_modifieparsysteme, caracteris1_.id_creeparsysteme
                      Filter: ((caracteris1_.id_article)::text = ($1)::text)
                      Rows Removed by Filter: 1869
-----------

If I launch an ANALYZE during the process, the explain uses the indexes, but the time remains the same.

---------
Query Text: select sum(quantitest0_.quantite_valeur) as col_0_0_ from dm5_quantitestock quantitest0_, dm5_caracteristiquearticlestock caracteris1_ where quantitest0_.id_caracteristiquearticlestock=caracteris1_.id and caracteris1_.id_article=$1
Aggregate  (cost=16.55..16.56 rows=1 width=26) (actual time=654.998..654.998 rows=1 loops=1)
 Output: sum(quantitest0_.quantite_valeur)
 ->  Nested Loop  (cost=0.00..16.55 rows=1 width=26) (actual time=654.994..654.994 rows=0 loops=1)
       Output: quantitest0_.quantite_valeur
       Join Filter: ((quantitest0_.id_caracteristiquearticlestock)::text = (caracteris1_.id)::text)
       Rows Removed by Join Filter: 1651
       ->  Index Scan using x_dm5_quantitestock_00 on public.dm5_quantitestock quantitest0_  (cost=0.00..8.27 rows=1 width=164) (actual time=0.011..0.579 rows=1651 loops=1)
             Output: quantitest0_.id, quantitest0_.datefinvalidite, quantitest0_.quantite_valeur, quantitest0_.id_caracteristiquearticlestock, quantitest0_.id_caracteristiquelieustock, quantitest0_.datecreationsysteme, quantitest0_.datemodificationsysteme, quantitest0_.id_creeparsysteme, quantitest0_.id_modifieparsysteme
       ->  Index Scan using dm5_caracteristiquearticlestock_pkey on public.dm5_caracteristiquearticlestock caracteris1_  (cost=0.00..8.27 rows=1 width=42) (actual time=0.395..0.395 rows=1 loops=1651)
             Output: caracteris1_.id, caracteris1_.datefinvalidite, caracteris1_.id_lot, caracteris1_.id_article, caracteris1_.id_numeroserie, caracteris1_.datecreationsysteme, caracteris1_.datemodificationsysteme, caracteris1_.id_modifieparsysteme, caracteris1_.id_creeparsysteme
             Filter: ((caracteris1_.id_article)::text = '4028804c4a311178014a346547307cce'::text)
             Rows Removed by Filter: 1651

----------

If I create the first 1000 records, commit, and end the transaction, the whole import is very fast.


I can't change my process to split the work into smaller parts...

Any ideas?
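
One thing that stands out in both plans above (a sketch, not a definitive diagnosis): caracteris1_.id_article is only ever matched through a Filter, never an Index Cond, so every outer row rescans the whole inner table. If these columns are not already covered by existing indexes such as x_dm5_quantitestock_00, indexes along the join path might let the planner drive the query from the article instead:

-- Hypothetical index names; assumes these columns are not already indexed.
CREATE INDEX idx_caracteristiquearticlestock_id_article
    ON dm5_caracteristiquearticlestock (id_article);

CREATE INDEX idx_quantitestock_id_caracteristiquearticlestock
    ON dm5_quantitestock (id_caracteristiquearticlestock);

ANALYZE dm5_caracteristiquearticlestock;
ANALYZE dm5_quantitestock;

With those in place (and reasonably fresh statistics during the import), the sum could become an index-driven nested loop keyed on id_article rather than a scan of all rows accumulated so far.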

Thanks.

Laurent CATHALA
Architecte
lca@sylob.com


7 rue Marcel Dassault - Z.A. La Mouline - 81990 Cambon d'Albi - FRANCE
Tel : 05 63 53 08 18 - Fax : 05 63 53 07 42 - www.sylob.com
Support : 05 63 53 78 35 - support@sylob.com
Company certified to ISO 9001 version 2008 by Bureau Veritas.



   

Josh Berkus | 16 Jan 05:15 2015

Re: Strange choice of general index over partial index


> Right, I suspect that bloating is possibly the significant factor then -
> can you REINDEX?

Believe me, it's on the agenda.  Of course, this is on a server with 90%
saturated IO, so doing a repack is going to take some finessing.

BTW, effective_cache_size is set to 100GB.  So I suspect that it's the
other issue with Tom mentioned, which is that 9.2 really doesn't take
physical index size into account.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


