Bruno Wolff III | 3 Oct 2006 02:01
Picon

Re: BUG #2658: Query not using index

This shouldn't have been submitted to the bugs list, as it isn't a bug.
The best spot for this kind of question is the performance list so I am
copying it there and redirecting followups there.

On Wed, Sep 27, 2006 at 20:56:32 +0000,
  Graham Davis <gdavis <at> refractions.net> wrote:
> 
> SELECT assetid, max(ts) AS ts 
> FROM asset_positions 
> GROUP BY assetid;
> 
> I have an index on (ts), another index on (assetid) and a multikey index on
> (assetid, ts).  I know the assetid index is pointless since the multikey one
> takes its place, but I put it there while testing just to make sure.  The
> ANALYZE EXPLAIN for this query is:
> 
>                                                              QUERY PLAN
> ----------------------------------------------------------------------------
> -------------------------------------------------------------
>  HashAggregate  (cost=125423.96..125424.21 rows=20 width=12) (actual
> time=39693.995..39694.036 rows=20 loops=1)
>    ->  Seq Scan on asset_positions  (cost=0.00..116654.64 rows=1753864
> width=12) (actual time=20002.362..34724.896 rows=1738693 loops=1)
>  Total runtime: 39694.245 ms
> (3 rows)
> 
> You can see it is doing a sequential scan on the table when it should be
> using the (assetid, ts) index, or at the very least the (ts) index.  This
> query takes about 40 seconds to complete with a table of 1.7 million rows. 
> I tested running the query without the group by as follows:
(Continue reading)

Tom Lane | 12 Oct 2006 17:42
Picon

Re: [PERFORM] Hints proposal

[ This is off-topic for -performance, please continue the thread in
-hackers ]

"Jim C. Nasby" <jim <at> nasby.net> writes:
> These hints would outright force the planner to do things a certain way.
> ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */

This proposal seems to deliberately ignore every point that has been
made *against* doing things that way.  It doesn't separate the hints
from the queries, it doesn't focus on fixing the statistical or cost
misestimates that are at the heart of the issue, and it takes no account
of the problem of hints being obsoleted by system improvements.

> It would also be useful to allow tweaking of planner cost estimates.
> This would take the general form of
> node operator value

This is at least focusing on the right sort of thing, although I still
find it completely misguided to be attaching hints like this to
individual queries.

What I would like to see is information *stored in a system catalog*
that affects the planner's cost estimates.  As an example, the DBA might
know that a particular table is touched sufficiently often that it's
likely to remain RAM-resident, in which case reducing the page fetch
cost estimates for just that table would make sense.  (BTW, this is
something the planner could in principle know, but we're unlikely to
do it anytime soon, for a number of reasons including a desire for plan
stability.)  The other general category of thing I think we need is a
way to override selectivity estimates for particular forms of WHERE
(Continue reading)

Merlin Moncure | 12 Oct 2006 18:22
Picon

Re: [PERFORM] Hints proposal

On 10/12/06, Tom Lane <tgl <at> sss.pgh.pa.us> wrote:
> [ This is off-topic for -performance, please continue the thread in
> -hackers ]

> This proposal seems to deliberately ignore every point that has been
> made *against* doing things that way.  It doesn't separate the hints
> from the queries, it doesn't focus on fixing the statistical or cost
> misestimates that are at the heart of the issue, and it takes no account
> of the problem of hints being obsoleted by system improvements.

what about extending the domain system so that we can put in ranges
that override the statistics or (imo much more importantly) provide
information when the planner would have to restort to a guess. my case
for this is prepared statements with a parameterized limit clause.

prepare foo(l int) as select * from bar limit $1;

maybe:
create domain foo_lmt as int hint 1; -- probably needs to be fleshed out
prepare foo(l foolmt) as select * from bar limit $1;

this says: "if you have to guess me, please use this"

what I like about this over previous attempts to persuade you is the
grammar changes are localized and also imo future proofed. planner can
ignore the hints if they are not appropriate for the oparation.

merlin

---------------------------(end of broadcast)---------------------------
(Continue reading)

Jim C. Nasby | 12 Oct 2006 18:25

Re: [PERFORM] Hints proposal

On Thu, Oct 12, 2006 at 11:42:32AM -0400, Tom Lane wrote:
> [ This is off-topic for -performance, please continue the thread in
> -hackers ]
> 
> "Jim C. Nasby" <jim <at> nasby.net> writes:
> > These hints would outright force the planner to do things a certain way.
> > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */
> 
> This proposal seems to deliberately ignore every point that has been
> made *against* doing things that way.  It doesn't separate the hints
> from the queries, it doesn't focus on fixing the statistical or cost
> misestimates that are at the heart of the issue, and it takes no account
> of the problem of hints being obsoleted by system improvements.

Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See
also my comment below.

> > It would also be useful to allow tweaking of planner cost estimates.
> > This would take the general form of
> > node operator value
> 
> This is at least focusing on the right sort of thing, although I still
> find it completely misguided to be attaching hints like this to
> individual queries.

Yes, but as I mentioned the idea here was to come up with something that
is (hopefully) easy to define and implement. In other words, something
that should be doable for 8.3. Because this proposal essentially amounts
to limiting plans the planner will consider and tweaking it's cost
estimates, I'm hoping that it should be (relatively) easy to implement.
(Continue reading)

Csaba Nagy | 12 Oct 2006 18:34

Re: [PERFORM] Hints proposal

OK, I just have to comment...

"Jim C. Nasby" <jim <at> nasby.net> writes:
> > These hints would outright force the planner to do things a certain way.
> > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */
> 
> This proposal seems to deliberately ignore every point that has been
> made *against* doing things that way.  It doesn't separate the hints
> from the queries, it doesn't focus on fixing the statistical or cost
> misestimates that are at the heart of the issue, and it takes no account
> of the problem of hints being obsoleted by system improvements.

But whatever arguments you made about planner improvements and the like,
it will NEVER be possible to correctly estimate in all cases the
statistics for a query, even if you perfectly know WHAT statistics you
need, which is also not the case all the time. 

Tom, you're the one who knows best how the planner works... can you bet
anything you care about on the fact that one day the planner will never
ever generate a catastrophic plan without DBA tweaking ? And how far in
time we'll get to that point ?

Until that point is achieved, the above proposal is one of the simplest
to understand for the tweaking DBA, and the fastest to deploy when faced
with catastrophic plans. And I would guess it is one of the simplest to
be implemented and probably not very high maintenance either, although
this is just a guess.

If I could hint some of my queries, I would enable anonymous prepared
statements to take into account the parameter values, but I can't
(Continue reading)

Josh Berkus | 12 Oct 2006 18:40

Re: [PERFORM] Hints proposal

Jim,

>>> These hints would outright force the planner to do things a certain way.
>>> ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */
>> This proposal seems to deliberately ignore every point that has been
>> made *against* doing things that way.  It doesn't separate the hints
>> from the queries, it doesn't focus on fixing the statistical or cost
>> misestimates that are at the heart of the issue, and it takes no account
>> of the problem of hints being obsoleted by system improvements.
>  
> Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See
> also my comment below.

I don't see how adding extra tags to queries is easier to implement than 
an ability to modify the system catalogs.  Quite the opposite, really.

And, as I said, if you're going to push for a feature that will be 
obsolesced in one version, then you're going to have a really rocky row 
to hoe.

> Yes, but as I mentioned the idea here was to come up with something that
> is (hopefully) easy to define and implement. In other words, something
> that should be doable for 8.3. Because this proposal essentially amounts
> to limiting plans the planner will consider and tweaking it's cost
> estimates, I'm hoping that it should be (relatively) easy to implement.

Even I, the chief marketing geek, am more concerned with getting a 
feature that we will still be proud of in 5 years than getting one in 
the next nine months.  Keep your pants on!

(Continue reading)

Josh Berkus | 12 Oct 2006 18:45

Re: [PERFORM] Hints proposal

Csaba,

> I guess the angle is: I, as a practicing DBA would like to be able to
> experiment and get most out of the imperfect tool I have, and you, the
> developers, want to make the tool perfect... I don't care about perfect
> tools, it just have to do the job... hints or anything else, if I can
> make it work GOOD ENOUGH, it's all fine. And hints is something I would
> understand and be able to use.

Hmmm, if you already understand Visual Basic syntax, should we support 
that too?  Or maybe we should support MySQL's use of '0000-00-00' as the 
"zero" date because people "understand" that?

We're just not going to adopt a bad design because Oracle DBAs are used 
to it.   If we wanted to do that, we could shut down the project and 
join a proprietary DB staff.

The current discussion is:

a) Planner tweaking is sometimes necessary;
b) Oracle HINTS are a bad design for planner tweaking;
c) Can we come up with a good design for planner tweaking?

So, how about suggestions for a good design?

--Josh Berkus

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

(Continue reading)

Csaba Nagy | 12 Oct 2006 19:04

Re: [PERFORM] Hints proposal

> Hmmm, if you already understand Visual Basic syntax, should we support 
> that too?  Or maybe we should support MySQL's use of '0000-00-00' as the 
> "zero" date because people "understand" that?

You completely misunderstood me... I have no idea about oracle hints,
never used Oracle in fact. My company uses oracle, but I have only very
very limited contact with oracle issues, and never touched a hint.

I'm only talking about ease of use, learning curves, and complexity in
general. While I do like the idea of an all automatic system optimizer
which takes your query portofolio and analyzes the data based on those
queries and creates you all the indexes you need and all that, that's
not gonna happen soon, because it's a very complex thing to implement.

The alternative is that you take your query portofolio, analyze it
yourself, figure out what statistics you need, create indexes, tweak
queries, hint the planner for correlations and stuff... which is a
complex task, and if you have to tell the server about some correlations
with the phase of the moon, you're screwed cause there will never be any
DB engine which will understand that. 

But you always can put the corresponding hint in the query when you know
the correlation is there...

The problem is that the application sometimes really knows better than
the server, when the correlations are not standard.

> We're just not going to adopt a bad design because Oracle DBAs are used 
> to it.   If we wanted to do that, we could shut down the project and 
> join a proprietary DB staff.
(Continue reading)

Tom Lane | 12 Oct 2006 19:18
Picon

Re: [PERFORM] Hints proposal

Csaba Nagy <nagy <at> ecircle-ag.com> writes:
> Until that point is achieved, the above proposal is one of the simplest
> to understand for the tweaking DBA, and the fastest to deploy when faced
> with catastrophic plans. And I would guess it is one of the simplest to
> be implemented and probably not very high maintenance either, although
> this is just a guess.

That guess is wrong ... but more to the point, if you think that "simple
and easy to implement" should be the overriding concern for designing a
new feature, see mysql.  They've used that design approach for years and
look what a mess they've got.  This project has traditionally done
things differently and I feel no need to change that mindset now.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Andrew Sullivan | 12 Oct 2006 21:03
Picon

Re: [PERFORM] Hints proposal

On Thu, Oct 12, 2006 at 02:21:55PM -0400, Merlin Moncure wrote:
> third way: to solve the problem of data (especially constants) not
> being available to the planner at the time the plan was generated.
> this happens most often with prepared statements and sql udfs.  note
> that changes to the plan generation mechanism (i think proposed by
> peter e a few weeks back) might also solve this.

You're right about this, but you also deliver the reason why we don't
need hints for that: the plan generation mechanism is a better
solution to that problem.  It's this latter thing that I keep coming
back to.  As a user of PostgreSQL, the thing that I really like about
it is its pragmatic emphasis on correctness.  In my experience, it's
a system that feels very UNIX-y: there's a willingness to accept
"80/20" answers to a problem in the event you at least have a way to
get the last 20, but the developers are opposed to anything that
seems really kludgey.

In the case you're talking about, it seems to me that addressing the
problems where they come from is a better solution that trying to
find some way to work around them.  And most of the use-cases I hear
for a statement-level hints system fall into this latter category.

A
--

-- 
Andrew Sullivan  | ajs <at> crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful 
than reformatting your hard drive when it gets out of whack.
		--Scott Morris

---------------------------(end of broadcast)---------------------------
(Continue reading)


Gmane