Walker, James Les | 1 May 2012 14:51

Re: Tuning Postgres 9.1 on Windows

Exactly, if turning off fsync gives me 100 commits/sec then I know where my bottleneck is and I can attack it.
Keep in mind though that I already turned off synchronous commit -- *really* dangerous -- and it didn't
have any effect.

-- Les

-----Original Message-----
From: pgsql-performance-owner <at> postgresql.org [mailto:pgsql-performance-owner <at> postgresql.org]
On Behalf Of Merlin Moncure
Sent: Monday, April 30, 2012 6:04 PM
To: Thomas Kellerer
Cc: pgsql-performance <at> postgresql.org
Subject: Re: [PERFORM] Tuning Postgres 9.1 on Windows

On Mon, Apr 30, 2012 at 5:00 PM, Thomas Kellerer <spam_eater <at> gmx.net> wrote:
> Merlin Moncure wrote on 30.04.2012 23:43:
>
>> Trying turning off fsync in postgrsql.conf to be sure.
>
>
> This is a dangerous advise.
> Turning off fsync can potentially corrupt the database in case of a 
> system failure (e.g. power outage).
>

sure. that said, we're just trying to figure out why he's getting
around 40tps.   since he's only benchmarking test data it's perfectly
ok to do that.

merlin
(Continue reading)

Merlin Moncure | 1 May 2012 15:06
Picon

Re: Tuning Postgres 9.1 on Windows

On Tue, May 1, 2012 at 7:51 AM, Walker, James Les <JAWalker <at> cantor.com> wrote:
> Exactly, if turning off fsync gives me 100 commits/sec then I know where my bottleneck is and I can attack
it. Keep in mind though that I already turned off synchronous commit -- *really* dangerous -- and it didn't
have any effect.

well synchronous commit is not as dangerous:
fsync off + power failure = corrupt database
synchronous commit off + power failure = some lost transactions

still waiting on the ssd model #.  worst case scenario is that you tps
rate is in fact sync bound and you have a ssd without capacitor backed
buffers (for example, the intel 320 has them); the probable workaround
would be to set the drive cache from write through to write back but
it would unsafe in that case.  in other words, tps rates in the triple
digits would be physically impossible.

another less likely scenario is you are having network issues
(assuming you are connecting to the database through tcp/ip).  20
years in, microsoft is still figuring out how to properly configure a
network socket.

merlin

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Walker, James Les | 1 May 2012 15:14

Re: Tuning Postgres 9.1 on Windows

SSD is OCZ-VERTEX3 MI. Controller is LSI SAS2 2008 Falcon. I'm working on installing EDB. Then I can give you
some I/O numbers.

-- Les

-----Original Message-----
From: Merlin Moncure [mailto:mmoncure <at> gmail.com] 
Sent: Tuesday, May 01, 2012 9:07 AM
To: Walker, James Les
Cc: Thomas Kellerer; pgsql-performance <at> postgresql.org
Subject: Re: [PERFORM] Tuning Postgres 9.1 on Windows

On Tue, May 1, 2012 at 7:51 AM, Walker, James Les <JAWalker <at> cantor.com> wrote:
> Exactly, if turning off fsync gives me 100 commits/sec then I know where my bottleneck is and I can attack
it. Keep in mind though that I already turned off synchronous commit -- *really* dangerous -- and it didn't
have any effect.

well synchronous commit is not as dangerous:
fsync off + power failure = corrupt database synchronous commit off + power failure = some lost transactions

still waiting on the ssd model #.  worst case scenario is that you tps rate is in fact sync bound and you have a
ssd without capacitor backed buffers (for example, the intel 320 has them); the probable workaround
would be to set the drive cache from write through to write back but it would unsafe in that case.  in other
words, tps rates in the triple digits would be physically impossible.

another less likely scenario is you are having network issues (assuming you are connecting to the database
through tcp/ip).  20 years in, microsoft is still figuring out how to properly configure a network socket.

merlin
CONFIDENTIAL: This e-mail, including its contents and attachments, if any, are confidential. If you are
(Continue reading)

Andy Colson | 1 May 2012 15:42

Re: Tuning Postgres 9.1 on Windows

On 5/1/2012 8:06 AM, Merlin Moncure wrote:
> On Tue, May 1, 2012 at 7:51 AM, Walker, James Les<JAWalker <at> cantor.com>  wrote:
>> Exactly, if turning off fsync gives me 100 commits/sec then I know where my bottleneck is and I can attack
it. Keep in mind though that I already turned off synchronous commit -- *really* dangerous -- and it didn't
have any effect.
>
> well synchronous commit is not as dangerous:
> fsync off + power failure = corrupt database
> synchronous commit off + power failure = some lost transactions
>
> still waiting on the ssd model #.  worst case scenario is that you tps
> rate is in fact sync bound and you have a ssd without capacitor backed
> buffers (for example, the intel 320 has them); the probable workaround
> would be to set the drive cache from write through to write back but
> it would unsafe in that case.  in other words, tps rates in the triple
> digits would be physically impossible.
>
> another less likely scenario is you are having network issues
> (assuming you are connecting to the database through tcp/ip).  20
> years in, microsoft is still figuring out how to properly configure a
> network socket.
>
> merlin
>

Even if its all local, windows doesnt have domain sockets (correct?), so 
all that traffic still has to go thru some bit of network stack, yes?

-Andy

(Continue reading)

Merlin Moncure | 1 May 2012 15:43
Picon

Re: Tuning Postgres 9.1 on Windows

On Tue, May 1, 2012 at 8:14 AM, Walker, James Les <JAWalker <at> cantor.com> wrote:
> SSD is OCZ-VERTEX3 MI. Controller is LSI SAS2 2008 Falcon. I'm working on installing EDB. Then I can give
you some I/O numbers.

It looks like the ssd doesn't have a nv cache and the raid card is a
simple sas hba (which likely isn't doing much for the ssd besides
masking TRIM).  The OCZ 'pro' versions are the ones with power loss
protection (see:
http://hothardware.com/Reviews/OCZ-Vertex-3-Pro-SandForce-SF2000-Based-SSD-Preview/).
 Note the bullet: "Implements SandForce 2582 Controller with power
loss data protection".  It doesn't look like the Vertex 3 Pro is out
yet.

If my hunch is correct, the issue here is that the drive is being
asked to sync data physically and SSD really don't perform well when
the controller isn't in control of when and how to sync data.  However
full physical sync is the only way to guarantee data is truly safe in
the context of a unexpected power loss (an nv cache is basically a
compromise on this point).

merlin

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Clemens Eisserer | 1 May 2012 16:34
Picon

Any disadvantages of using =ANY(ARRAY()) instead of IN?

Hi,

I am using postgresql as database for a hibernate based java oltp
project and as in previous projects am totally impressed by
postgresql's robustness, performance and feature-richness. Thanks for
this excellent piece of software.

Quite often Hibernate ends up generating queries with a lot of joins
which usually works well, except for queries which load some
additional data based on a previous query (SUBSELECT collections),
which look like:

select ..... from table1 ... left outer join table 15 .... WHERE
table1.id IN (select id .... join table16 ... join table20 WHERE
table20.somevalue=?)

Starting with some amount of joins, the optimizer starts to do quite
suboptimal things like hash-joining huge tables where selctivity would
very low.
I already raised join_collapse_limit and from_collapse_limit, but
after a certain point query planning starts to become very expensive.

However, when using " =ANY(ARRAY(select ...))" instead of "IN" the
planner seems to do a lot better, most likely because it treats the
subquery as a black-box that needs to be executed independently. I've
hacked hibernate a bit to use ANY+ARRAY, and it seems to work a lot
better than using "IN".

However, I am a bit uncertain:
- Is it safe to use ANY(ARRAY(select ...)) when I know the sub-query
(Continue reading)

Tom Lane | 1 May 2012 16:43
Picon

Re: Any disadvantages of using =ANY(ARRAY()) instead of IN?

Clemens Eisserer <linuxhippy <at> gmail.com> writes:
> Quite often Hibernate ends up generating queries with a lot of joins
> which usually works well, except for queries which load some
> additional data based on a previous query (SUBSELECT collections),
> which look like:

> select ..... from table1 ... left outer join table 15 .... WHERE
> table1.id IN (select id .... join table16 ... join table20 WHERE
> table20.somevalue=?)

> Starting with some amount of joins, the optimizer starts to do quite
> suboptimal things like hash-joining huge tables where selctivity would
> very low.
> I already raised join_collapse_limit and from_collapse_limit, but
> after a certain point query planning starts to become very expensive.

What PG version are we talking about here?

> However, when using " =ANY(ARRAY(select ...))" instead of "IN" the
> planner seems to do a lot better, most likely because it treats the
> subquery as a black-box that needs to be executed independently. I've
> hacked hibernate a bit to use ANY+ARRAY, and it seems to work a lot
> better than using "IN".

That doesn't sound like a tremendously good idea to me.  But with
so few details, it's hard to comment intelligently.  Can you provide
a concrete test case?

			regards, tom lane

(Continue reading)

Walker, James Les | 1 May 2012 16:44

Re: Tuning Postgres 9.1 on Windows

I installed the enterprisedb distribution and immediately saw a 400% performance increase. Turning off
fsck made it an order of magnitude better. I'm now peaking at over 400 commits per second. Does that sound right?

If I understand what you're saying, then to sustain this high rate I'm going to need a controller that can
defer fsync requests from the host because it has some sort of battery backup that guarantees the full write.

-- Les

-----Original Message-----
From: Merlin Moncure [mailto:mmoncure <at> gmail.com] 
Sent: Tuesday, May 01, 2012 9:43 AM
To: Walker, James Les
Cc: Thomas Kellerer; pgsql-performance <at> postgresql.org
Subject: Re: [PERFORM] Tuning Postgres 9.1 on Windows

On Tue, May 1, 2012 at 8:14 AM, Walker, James Les <JAWalker <at> cantor.com> wrote:
> SSD is OCZ-VERTEX3 MI. Controller is LSI SAS2 2008 Falcon. I'm working on installing EDB. Then I can give
you some I/O numbers.

It looks like the ssd doesn't have a nv cache and the raid card is a simple sas hba (which likely isn't doing
much for the ssd besides masking TRIM).  The OCZ 'pro' versions are the ones with power loss protection (see:
http://hothardware.com/Reviews/OCZ-Vertex-3-Pro-SandForce-SF2000-Based-SSD-Preview/).
 Note the bullet: "Implements SandForce 2582 Controller with power loss data protection".  It doesn't look
like the Vertex 3 Pro is out yet.

If my hunch is correct, the issue here is that the drive is being asked to sync data physically and SSD really
don't perform well when the controller isn't in control of when and how to sync data.  However full physical
sync is the only way to guarantee data is truly safe in the context of a unexpected power loss (an nv cache is
basically a compromise on this point).

(Continue reading)

Clemens Eisserer | 1 May 2012 17:04
Picon

Re: Any disadvantages of using =ANY(ARRAY()) instead of IN?

Hi Tom,

Thanks for your reply.

> What PG version are we talking about here?
For development I use 9.1.3, on the production server is 8.4.7 -
happens with both cases.

> That doesn't sound like a tremendously good idea to me.
Could you elaborate on the downsides of this approach a bit?

> But with
> so few details, it's hard to comment intelligently.
> Can you provide a concrete test case?

A self contained testcase would take some time to create (and list
members willing to configure and run), so I hope a query as well as an
explain-analyze run will provide more information (done with 9.1.3):
http://pastebin.com/BGRdAPg2

Its kind of the worst-worst case which I will improve later (way too
much relations loaded through join-fetching), but its quite a good way
to show the issue. Replacing the IN with a ANY(ARRAY()) already yields
a way better plan.

Thank you in advance, Clemens

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance <at> postgresql.org)
To make changes to your subscription:
(Continue reading)

Merlin Moncure | 1 May 2012 18:45
Picon

Re: Tuning Postgres 9.1 on Windows

On Tue, May 1, 2012 at 9:44 AM, Walker, James Les <JAWalker <at> cantor.com> wrote:
> I installed the enterprisedb distribution and immediately saw a 400% performance increase. Turning off
fsck made it an order of magnitude better. I'm now peaking at over 400 commits per second. Does that sound right?

yeah -- well it's hard to say but that sounds plausible based on what
i know.  it would be helpful to see the queries you're running to get
apples to apples idea of what's going on.

> If I understand what you're saying, then to sustain this high rate I'm going to need a controller that can
defer fsync requests from the host because it has some sort of battery backup that guarantees the full write.

yes --  historically, they way to get your tps rate up was to get a
battery backed cache.  this can give you burst (although not
necessarily sustained) tps rates well above what the drive can handle.
 lately, a few of the better built ssd also have on board capacitors
which provide a similar function and allow the drives to safely hit
high tps rates as well.  take a good look at the intel 320 and 710
drives.

merlin

--

-- 
Sent via pgsql-performance mailing list (pgsql-performance <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Gmane