Chris Hoover | 9 Jan 19:54 2006
Picon

Memory Usage Question

Question,

How exactly is Postgres and Linux use the memory?

I have serveral databases that have multi GB indexes on very large tables.  On our current servers, the indexes can fit into memory but not the data (servers have 8 - 12 GB).  However, my boss is wanting to get new servers for me but does not want to keep the memory requirements as high as they are now (this will allow us to get more servers to spread our 200+ databases over).

Question, if I have a 4GB+ index for a table on a server with 4GB ram, and I submit a query that does an index scan, does Postgres read the entire index, or just read the index until it finds the matching value (our extra large indexes are primary keys).

I am looking for real number to give to my boss the say either having a primary key larger than our memory is bad (and how to clearly justfify it), or it is ok.

If it is ok, what are the trade offs in performance?\

Obviously, I want more memory, but I have to prove the need to my boss since it raises the cost of the servers a fair amount.

Thanks for any help,

Chris

Jim C. Nasby | 14 Jan 01:43 2006

Re: [PERFORM] Stable function being evaluated more than once in a single query

Adding -docs...

On Fri, Jan 13, 2006 at 07:27:28PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby <at> pervasive.com> writes:
> > Is the issue that the optimizer won't combine two function calls (ie:
> > SELECT foo(..) ... WHERE foo(..)), or is it that sometimes it won't make
> > the optimization (maybe depending on the query plan, for example)?
> 
> What the STABLE category actually does is give the planner permission to
> use the function within an indexscan qualification, eg,
> 	WHERE indexed_column = f(42)
> Since an indexscan involves evaluating the comparison expression just
> once and using its value to search the index, this would be incorrect
> if the expression's value might change from row to row.  (For VOLATILE
> functions, we assume that the correct behavior is the naive SQL
> semantics of actually computing the WHERE clause at each candidate row.)
> 
> There is no function cache and no checking for duplicate expressions.
> I think we do check for duplicate aggregate expressions, but not
> anything else.

In that case I'd say that the sSTABLE section of 32.6 should be changed
to read:

A STABLE function cannot modify the database and is guaranteed to
return the same results given the same arguments for all calls within a
single surrounding query. This category gives the planner permission to
use the function within an indexscan qualification. (Since an indexscan
involves evaluating the comparison expression just once and using its
value to search the index, this would be incorrect if the expression's
value might change from row to row.) There is no function cache and no
checking for duplicate expressions.

I can provide a patch to that effect if it's easier...

On a related note, would it be difficult to recognize multiple calls of
the same function in one query? ISTM that would be a win for all but the
most trivial of functions...
--

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby <at> pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo <at> postgresql.org so that your
       message can get through to the mailing list cleanly

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo <at> postgresql.org so that your
       message can get through to the mailing list cleanly

Jim C. Nasby | 9 Jan 21:35 2006

Re: Memory Usage Question

On Mon, Jan 09, 2006 at 01:54:48PM -0500, Chris Hoover wrote:
> Question, if I have a 4GB+ index for a table on a server with 4GB ram, and I
> submit a query that does an index scan, does Postgres read the entire index,
> or just read the index until it finds the matching value (our extra large
> indexes are primary keys).

Well, the idea behind an index is that if you need a specific value from
it, you can get there very quickly, reading a minimum of data along the
way. So basically, PostgreSQL won't normally read an entire index.

> I am looking for real number to give to my boss the say either having a
> primary key larger than our memory is bad (and how to clearly justfify it),
> or it is ok.
> 
> If it is ok, what are the trade offs in performance?\
> 
> Obviously, I want more memory, but I have to prove the need to my boss since
> it raises the cost of the servers a fair amount.

Well, if you add a sleep to the following code, you can tie up some
amount of memory, which would allow you to simulate having less memory
available. Though over time I think the kernel might decide to page that
memory out, so it's not perfect.

int main(int argc, char *argv[]) {
    if (!calloc(atoi(argv[1]), 1024*1024)) { printf("Error allocating memory.\n"); }
}

In a nutshell, PostgreSQL and the OS will generally work together to
only cache data that is being used fairly often. In the case of a large
PK index, if you're not actually reading a large distribution of the
values in the index you probably aren't even caching the entire index
even now. There may be some kind of linux tool that would show you what
portion of a file is currently cached, which would help answer that
question (but remember that hopefully whatever parts of the index are
cached by PostgreSQL itself won't also be cached by the OS as well).
--

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby <at> pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq


Gmane