Alvaro Herrera | 1 Nov 2005 01:02

Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)

Jim C. Nasby wrote:
> Now that I've got a little better idea of what this code does, I've
> noticed something interesting... this issue is happening on an 8-way
> machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
> greatly increase the odds of buffer conflicts? Bug aside, would it be
> better to set NUM_SLRU_BUFFERS higher for a larger number of CPUs?

We had talked about increasing NUM_SLRU_BUFFERS depending on
shared_buffers, but it didn't get done.  Something to consider for 8.2
though.  I think you could have better performance by increasing that
setting, while at the same time diminishing the possibility that the race
condition appears.

I think you should also consider increasing PGPROC_MAX_CACHED_SUBXIDS
(src/include/storage/proc.h), because that should decrease the chance
that the subtrans area needs to be scanned.  By how much, however, I
wouldn't know -- it depends on the number of subtransactions you
typically have; I guess you could activate the measuring code in
procarray.c to get a figure.

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
www.google.com: command-line interface for the web.

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match


Mark Wong | 1 Nov 2005 01:10

Re: Spinlocks, yet again: analysis and proposed patches

On Thu, 20 Oct 2005 23:03:47 +0100
Simon Riggs <simon <at> 2ndquadrant.com> wrote:

> On Wed, 2005-10-19 at 14:07 -0700, Mark Wong wrote:
> > > 
> > > This isn't exactly elegant coding, but it provides a useful improvement
> > > on an 8-way SMP box when run on 8.0 base. OK, let's be brutal: this looks
> > > pretty darn stupid. But it does follow the CPU optimization handbook
> > > advice and I did see a noticeable improvement in performance and a
> > > reduction in context switching.
> 
> > > I'm not in a position to try this again now on 8.1beta, but I'd welcome
> > > a performance test result from anybody that is. I'll supply a patch
> > > against 8.1beta for anyone wanting to test this.
> > 
> > Ok, I've produced a few results on a 4 way (8 core) POWER 5 system, which
> > I've just set up and probably needs a bit of tuning.  I don't see much
> > difference but I'm wondering if the cacheline sizes are dramatically
> > different from Intel/AMD processors.  I still need to take a closer look
> > to make sure I haven't grossly mistuned anything, but I'll let everyone
> > take a look:
> 
> Well, the Power 5 architecture probably has the lowest overall memory
> delay you can get currently so in some ways that would negate the
> effects of the patch. (Cacheline is still 128 bytes, AFAICS). But it's
> clear the patch isn't significantly better (like it was with 8.0 when we
> tried this on the 8-way Itanium in Feb).
> 
> > cvs 20051013
> > http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/19/

Jim C. Nasby | 1 Nov 2005 01:56

Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)

On Mon, Oct 31, 2005 at 09:02:59PM -0300, Alvaro Herrera wrote:
> Jim C. Nasby wrote:
> > Now that I've got a little better idea of what this code does, I've
> > noticed something interesting... this issue is happening on an 8-way
> > machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
> > greatly increase the odds of buffer conflicts? Bug aside, would it be
> > better to set NUM_SLRU_BUFFERS higher for a larger number of CPUs?
> 
> We had talked about increasing NUM_SLRU_BUFFERS depending on
> shared_buffers, but it didn't get done.  Something to consider for 8.2
> though.  I think you could have better performance by increasing that
> setting, while at the same time diminishing the possibility that the race
> condition appears.

Ok, I'll look into that. This database is definitely having issues due
to the sheer transaction volume, so maybe that will help.

If NUM_SLRU_BUFFERS were to be tied to something, wouldn't it make more
sense to tie it to wal_buffers though? For example, a data warehouse
might have a very high shared_buffers, but most likely won't have a high
transaction rate. ISTM that most databases with a high transaction rate
are likely to have increased wal_buffers.
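
As a rough sketch of what tying the buffer count to a setting could look like (purely illustrative, not an existing PostgreSQL function: the name `sketch_slru_buffers`, the one-to-one scaling, and the upper bound of 64 are all assumptions), the value could be derived from a wal_buffers-style setting and clamped, with the current compiled-in value of 8 as the floor:

```c
#include <assert.h>

/* Hypothetical bounds chosen for illustration only. */
#define SKETCH_MIN_SLRU_BUFFERS 8   /* the compiled-in value discussed here */
#define SKETCH_MAX_SLRU_BUFFERS 64  /* assumed cap, not from PostgreSQL */

/* Derive an SLRU buffer count from a wal_buffers-style setting, clamped. */
int
sketch_slru_buffers(int wal_buffers)
{
    assert(wal_buffers >= 0);
    if (wal_buffers < SKETCH_MIN_SLRU_BUFFERS)
        return SKETCH_MIN_SLRU_BUFFERS;
    if (wal_buffers > SKETCH_MAX_SLRU_BUFFERS)
        return SKETCH_MAX_SLRU_BUFFERS;
    return wal_buffers;
}
```

Under these assumptions, a setting of 4 would still yield 8 buffers, and a setting of 1000 would be capped at 64.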

> I think you should also consider increasing PGPROC_MAX_CACHED_SUBXIDS
> (src/include/storage/proc.h), because that should decrease the chance
> that the subtrans area needs to be scanned.  By how much, however, I
> wouldn't know -- it depends on the number of subtransactions you
> typically have; I guess you could activate the measuring code in
> procarray.c to get a figure.


Andrew Dunstan | 1 Nov 2005 02:19

regression failures on Windows in machines with some non-English locales


I have become aware that regression is failing due to ordering 
differences on Windows machines in some non-English locales 
(specifically, Czech, but the potential is there for more failures).

The problem seems to be that the regression suite and initdb don't do 
enough between them to ensure that the tests are run in C locale.

The simple solution seems to be to add --no-locale to the initdb args in 
pg_regress.sh. I have asked Petr Jelinek (one of our Czech users) to 
test this. If it works as I expect it to (buildfarm has done this for 
installcheck tests for some time)  I'd like to add this to both the HEAD 
and 8.0 branches. I know it's very late in the cycle, but it seems very 
low risk to me, and I'd like to have regression working on as broad a 
set of platforms as possible.

If people prefer, I could add it just for the Windows case - Unix 
platforms won't see the effect I propose to remedy, as their setlocale 
works from the environment, unlike Windows.
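
To show why the locale matters for ordering, here is a minimal sketch (the helper name `sketch_compare_in_c_locale` is made up for this example; `setlocale` and `strcoll` are standard C). Selecting the "C" locale explicitly in code does not depend on environment variables, and under it `strcoll` orders strings the same way `strcmp` does, which is the byte-order behaviour the expected regression output assumes.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Compare two strings under the "C" collation; returns <0, 0 or >0.
 * Note: this changes the process-wide LC_COLLATE setting.
 */
int
sketch_compare_in_c_locale(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    /* Explicit selection; no environment variable is consulted. */
    if (setlocale(LC_COLLATE, "C") == NULL)
        return strcmp(a, b);    /* fall back to plain byte order */
    return strcoll(a, b);
}
```

In the "C" locale an uppercase "B" sorts before a lowercase "a", whereas many language-specific locales order them the other way round; that kind of difference is enough to make ordered query output diverge from the expected files.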

Thoughts?

cheers

andrew


Tom Lane | 1 Nov 2005 04:12

Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)

"Jim C. Nasby" <jnasby <at> pervasive.com> writes:
> AFAIK they're not using subtransactions at all, but I'll check.

Well, yeah, they are ... else you'd never have seen this failure.

			regards, tom lane


Tom Lane | 1 Nov 2005 04:38

Re: regression failures on Windows in machines with some non-English locales

Andrew Dunstan <andrew <at> dunslane.net> writes:
> The simple solution seems to be to add --no-locale to the initdb args in 
> pg_regress.sh.

Er ... what exactly does that do that setting LC_ALL=C doesn't?

			regards, tom lane


Petr Jelinek | 1 Nov 2005 05:28

Re: regression failures on Windows in machines with some non-English

Tom Lane wrote:
> 
>>The simple solution seems to be to add --no-locale to the initdb args in 
>>pg_regress.sh.
> 
> 
> Er ... what exactly does that do that setting LC_ALL=C doesn't?
> 

Windows ignores locale environment variables, so you can't do that.

-- 
Regards
Petr Jelinek (PJMODOS)


Simon Riggs | 1 Nov 2005 08:32

Re: Spinlocks, yet again: analysis and proposed patches

On Mon, 2005-10-31 at 16:10 -0800, Mark Wong wrote:
> On Thu, 20 Oct 2005 23:03:47 +0100
> Simon Riggs <simon <at> 2ndquadrant.com> wrote:
> 
> > On Wed, 2005-10-19 at 14:07 -0700, Mark Wong wrote:
> > > > 
> > > > This isn't exactly elegant coding, but it provides a useful improvement
> > > > on an 8-way SMP box when run on 8.0 base. OK, let's be brutal: this looks
> > > > pretty darn stupid. But it does follow the CPU optimization handbook
> > > > advice and I did see a noticeable improvement in performance and a
> > > > reduction in context switching.
> > 
> > > > I'm not in a position to try this again now on 8.1beta, but I'd welcome
> > > > a performance test result from anybody that is. I'll supply a patch
> > > > against 8.1beta for anyone wanting to test this.
> > > 
> > > Ok, I've produced a few results on a 4 way (8 core) POWER 5 system, which
> > > I've just set up and probably needs a bit of tuning.  I don't see much
> > > difference but I'm wondering if the cacheline sizes are dramatically
> > > different from Intel/AMD processors.  I still need to take a closer look
> > > to make sure I haven't grossly mistuned anything, but I'll let everyone
> > > take a look:
> > 
> > Well, the Power 5 architecture probably has the lowest overall memory
> > delay you can get currently so in some ways that would negate the
> > effects of the patch. (Cacheline is still 128 bytes, AFAICS). But it's
> > clear the patch isn't significantly better (like it was with 8.0 when we
> > tried this on the 8-way Itanium in Feb).
> > 
> > > cvs 20051013

Stefan Kaltenbrunner | 1 Nov 2005 08:34

Re: 8.1 Release Candidate 1 Coming ...

Mag Gam wrote:
> Is this issue only on AIX 5.3 ML1 thru ML 3?
> Does the build work fine with 5.2 (ALL MLs)?

5.3 ML1 works, but it is affected by the system include bug mentioned in
our AIX-FAQ. ML3 is supposed to fix that specific problem, but it seems
to break in another, more difficult way ...

Stefan


strk | 1 Nov 2005 12:32

FreeBSD broke with autoconf-based build

I'm having trouble building PostGIS HEAD on FreeBSD
using the new autoconf-based scripts.

The Makefile.shlib file copied from the pgsql sources
adds a -Bforcearchive flag to LINK.shared when the
arch is freebsd, but the flag seems to be
unsupported (this is from 7.2.1 to 8.0.0).

Weirdly enough, the PostgreSQL build itself works fine
(no -Bforcearchive flag used).

If I remove the -Bforcearchive flag from Makefile.shlib,
everything seems to work fine.

I suppose (but didn't test) that this would also fail with
pgxs.  Are there any FreeBSD users around to test this?

--strk;


