Ivan Maidanski | 1 Dec 2010 21:47

Re[13]: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hi all,

It seems the observed degradation can be discovered by 2 tests:
1) by benchmarking v71 vs v72a2+test2_patch;
2) by benchmarking v71 vs v72a2+test3_patch.

test2 patch reverts the relevant changes of:
2008-08-21  Hans Boehm <Hans.Boehm@...>

test3 patch reverts the relevant changes of:
2009-05-22  Hans Boehm <Hans.Boehm@...> (Largely from Ludovic Cortes)

Regards.

Wed, 01 Dec 2010 23:10:56 +0300  Ivan Maidanski <ivmai@...>:

> Hi,
> 
> So, there is no real difference in speed between 7.1 and 7.1+patch, right?
> 
> PS. I'm preparing some more test patches...
> 
> Wed, 1 Dec 2010 09:27:24 +0100 Manuel.Serrano@...:
> 
> > Here we are:
> > 
> >           7.2a4 7.2a2 7.1   7.0   7.0a7 6.8     7.1+ivan-30nov
> > bague     0.76  0.77  0.77  0.76  0.77  0.77    0.77
> > beval     1.33  1.41  1.29  1.41  1.29  1.31    1.44
> > boyer     2.23  2.23  2.13  2.14  2.13  2.15    2.13

Ludovic Courtès | 2 Dec 2010 21:53

Re: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hi Ivan,

Ivan Maidanski <ivmai@...> writes:

> It seems the observed degradation can be discovered by 2 tests:
> 1) by benchmarking v71 vs v72a2+test2_patch;
> 2) by benchmarking v71 vs v72a2+test3_patch.
>
> test2 patch reverts the relevant changes of:
> 2008-08-21  Hans Boehm <Hans.Boehm@...>
>
> test3 patch reverts the relevant changes of:
> 2009-05-22  Hans Boehm <Hans.Boehm@...> (Largely from Ludovic Cortes)

Damn, I feel guilty now.  ;-)

Did you measure the effect of each patch individually?  It would be
interesting to know.

For the record, the discussion that led to the second patch started here:

  http://thread.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/2570

The next-to-final patch was posted here:

  http://thread.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/2634

The intent was to /exclude/ ELF sections containing relocated read-only
data from the GC roots on GNU systems, thereby reducing the amount of
memory that needs to be scanned.

Manuel.Serrano | 3 Dec 2010 09:37

Re: Fwd: Re[13]: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hi Ivan,

> 2 patches are attached (I'm sorry if you didn't receive them earlier - I sent them to the mailing list).
Yes? I have missed them there too. Strange...

> please compare 2 patches independently (I still don't know exactly where the problem is).
It looks like the performance slowdown comes from something else.
test2 and test3 behave as 7.2a2 and 7.2a4.

          7.2a4 7.2a2 7.1   7.0   7.0a7 6.8     7.1+ivan-30nov  7.2a2-test2  7.2a2-test3
bague     0.76  0.77  0.77  0.76  0.77  0.77    0.77            0.77         0.77
beval     1.33  1.41  1.29  1.41  1.29  1.31    1.44            1.42         1.42
boyer     2.23  2.23  2.13  2.14  2.13  2.15    2.13            2.24         2.23
cgc       0.47  0.48  0.48  0.47  0.48  0.46    0.47            0.48         0.49
conform   1.91  1.91  1.74  1.72  1.73  1.79    1.71            1.92         1.92
earley    2.49  2.50  2.08  2.13  2.09  2.23    2.09            2.52         2.52
fib       0.01  0.01  0.01  0.01  0.01  0.01    0.01            0.01         0.01
fft       2.51  2.52  2.52  2.50  2.52  2.49    2.5             2.52         2.52
leval     1.12  1.13  1.05  1.01  1.02  1.09    1.02            1.14         1.13
maze      1.67  1.40  1.36  1.35  1.26  1.39    1.35            1.38         1.39
mbrot     7.03  7.05  7.04  7.03  7.05  7.05    7.02            7.06         7.07
nucleic   1.18  1.20  1.20  1.16  1.16  1.34    1.17            1.2          1.21
peval     1.46  1.47  1.20  1.19  1.20  1.18    1.2             1.47         1.49
puzzle    1.96  1.92  1.97  1.96  1.92  1.93    1.92            1.93         1.92
queens    2.29  2.29  1.55  1.56  1.55  1.44    1.56            2.36         2.36
qsort     1.65  1.64  1.63  1.62  1.63  1.63    1.63            1.65         1.65
rgc       1.28  1.28  1.23  1.23  1.24  1.28    1.23            1.29         1.28
sieve     1.58  1.60  1.44  1.42  1.41  1.51    1.43            1.59         1.59
traverse  5.14  5.15  3.55  3.60  3.56  3.58    3.59            5.13         5.15
almabench 1.45  1.45  1.45  1.45  1.45  1.46    1.45            1.46         1.46
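
To make the pattern in the table easier to see, here is a minimal Python sketch that computes the 7.2a2 / 7.1 time ratio for each benchmark and lists those that are more than 10% slower. The timings are copied from the 7.2a2 and 7.1 columns above; the 10% threshold and the names `TIMINGS` and `regressions` are assumptions made for illustration, not part of the original measurements.

```python
# Timings copied from the table above: benchmark -> (7.2a2, 7.1).
TIMINGS = {
    "bague": (0.77, 0.77),
    "beval": (1.41, 1.29),
    "boyer": (2.23, 2.13),
    "cgc": (0.48, 0.48),
    "conform": (1.91, 1.74),
    "earley": (2.50, 2.08),
    "fib": (0.01, 0.01),
    "fft": (2.52, 2.52),
    "leval": (1.13, 1.05),
    "maze": (1.40, 1.36),
    "mbrot": (7.05, 7.04),
    "nucleic": (1.20, 1.20),
    "peval": (1.47, 1.20),
    "puzzle": (1.92, 1.97),
    "queens": (2.29, 1.55),
    "qsort": (1.64, 1.63),
    "rgc": (1.28, 1.23),
    "sieve": (1.60, 1.44),
    "traverse": (5.15, 3.55),
    "almabench": (1.45, 1.45),
}


def regressions(timings, threshold=1.10):
    """Return (name, ratio) pairs whose new/old ratio exceeds threshold, worst first."""
    ratios = {name: new / old for name, (new, old) in timings.items()}
    slow = [(name, ratio) for name, ratio in ratios.items() if ratio > threshold]
    return sorted(slow, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in regressions(TIMINGS):
        print(f"{name:10s} {ratio:.2f}x")
```

With these numbers the sketch reports queens, traverse, peval, earley and sieve, in that order. In the table itself, the 7.2a2-test2 and 7.2a2-test3 columns differ from plain 7.2a2 by at most 0.07 s (queens), which is consistent with the remark that neither patch removes the slowdown.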

Ivan Maidanski | 3 Dec 2010 22:07

Re: Re: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hi Ludovic,

Please don't hurry to blame yourself ;)

Strangely, the Bigloo tests show that the degradation has nothing to do with the patches below.

Thu, 02 Dec 2010 21:53:09 +0100 ludo@... (Ludovic Courtès):

> Hi Ivan,
> 
> Ivan Maidanski <ivmai@...> writes:
> 
> > It seems the observed degradation can be discovered by 2 tests:
> > 1) by benchmarking v71 vs v72a2+test2_patch;
> > 2) by benchmarking v71 vs v72a2+test3_patch.
> >
> > test2 patch reverts the relevant changes of:
> > 2008-08-21  Hans Boehm <Hans.Boehm@...>
> >
> > test3 patch reverts the relevant changes of:
> > 2009-05-22  Hans Boehm <Hans.Boehm@...> (Largely from Ludovic Cortes)
> 
> Damn, I feel guilty now.  ;-)
> 
> Did you measure the effect of each patch individually?  It would be
> interesting to know.
> 
> For the record, the discussion that led to the second patch started here:
> 

Ivan Maidanski | 3 Dec 2010 22:27

Re[15]: Fwd: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hi Manuel,

It's strange...

Please confirm that you don't compile the GC (for this benchmark) with
multi-threading support and don't use GC_DEBUG (and the GC_debug_ routines).

If so, then the only difference between gc71+test1_patch and
gc72a2+test2_patch+test3_patch is in GC_clear_a_few_frames() (in alloc.c).
Please benchmark gc72a2+test4_patch (which is attached).

Regards.

Fri, 3 Dec 2010 09:37:56 +0100, message from Manuel.Serrano <at> inria.fr:

> Hi Ivan,
> 
> > 2 patches are attached (I'm sorry if you didn't receive them earlier - I sent them to the mailing list).
> Yes? I have missed them there too. Strange...
> 
> > please compare 2 patches independently (I still don't know exactly where the problem is).
> It looks like the performance slowdown comes from something else.
> test2 and test3 behave as 7.2a2 and 7.2a4.
> 
>           7.2a4 7.2a2 7.1   7.0   7.0a7 6.8     7.1+ivan-30nov  7.2a2-test2  7.2a2-test3
> bague     0.76  0.77  0.77  0.76  0.77  0.77    0.77            0.77         0.77
> beval     1.33  1.41  1.29  1.41  1.29  1.31    1.44            1.42         1.42

Manuel.Serrano | 4 Dec 2010 05:38

Re: Re[15]: Fwd: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hi Ivan,

> Please confirm that you don't compile GC (for this benchmark) with
> multi-threading support and don't use GC_DEBUG (and GC_debug_
> routines).
I confirm that, if I am using these, I am not aware of it. On our side, there
is one change between 7.1 and 7.2: up to 7.1 we used Makefile.direct,
and from 7.2 on we switched to the generated Makefile. The
generated Makefile adds many compilation options. As an
example, here is the command issued by make to compile alloc.c for
version 7.2:

gcc -DPACKAGE_NAME=\"gc\" -DPACKAGE_TARNAME=\"gc\" -DPACKAGE_VERSION=\"7.2alpha2\"
"-DPACKAGE_STRING=\"gc 7.2alpha2\""
-DPACKAGE_BUGREPORT=\"Hans.Boehm@...\" -DGC_VERSION_MAJOR=7
-DGC_VERSION_MINOR=2 -DGC_ALPHA_VERSION=2 -DPACKAGE=\"gc\" -DVERSION=\"7.2alpha2\"
-DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
-DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1
-DHAVE_DLFCN_H=1 -DNO_EXECUTE_PERMISSION=1 -DALL_INTERIOR_POINTERS=1
-DATOMIC_UNCOLLECTABLE=1 -I./include -fexceptions -I libatomic_ops/src -O3 -DNO_DEBUGGING
-Iinclude -Ilibatomic_ops-install/include -DFINALIZE_ON_DEMAND
-I/misc/lab/bigloo/bench/3.5b-7.2alpha2-test2/bigloo3.5b/lib/3.5b -fPIC -fPIC
-I/misc/lab/bigloo/bench/3.5b-7.2alpha2-test2/bigloo3.5b/lib/3.5b -MT alloc.lo -MD -MP -MF
.deps/alloc.Tpo -c alloc.c -o alloc.o >/dev/null 2>&1

For the version 7.1, we have:

gcc  -O3  -DNO_DEBUGGING -Iinclude -Ilibatomic_ops-install/include -DFINALIZE_ON_DEMAND 
-I/misc/lab/bigloo/bench/3.5b-7.1/bigloo3.5b/lib/3.5b -fPIC   -c -o alloc.o alloc.c
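
A quick way to see which options the generated Makefile adds is to diff the two command lines mechanically. The following is a minimal Python sketch under stated assumptions: the two strings are abbreviated versions of the commands above (the PACKAGE_*, GC_VERSION_*, STDC_HEADERS and HAVE_* defines, the include paths and the dependency-tracking flags are left out for brevity), and the helper names `option_set` and `extra_options` are hypothetical.

```python
import shlex


def option_set(command):
    """Return the -D, -f and -O options found in a compiler command line."""
    tokens = shlex.split(command)
    return {tok for tok in tokens if tok.startswith(("-D", "-f", "-O"))}


def extra_options(new_command, old_command):
    """Return the options present in new_command but absent from old_command, sorted."""
    return sorted(option_set(new_command) - option_set(old_command))


# Abbreviated from the 7.2alpha2 command above (assumption: many defines omitted).
CMD_72 = (
    "gcc -DNO_EXECUTE_PERMISSION=1 -DALL_INTERIOR_POINTERS=1 "
    "-DATOMIC_UNCOLLECTABLE=1 -I./include -fexceptions -O3 -DNO_DEBUGGING "
    "-DFINALIZE_ON_DEMAND -fPIC -c alloc.c -o alloc.o"
)

# Abbreviated from the 7.1 command above.
CMD_71 = "gcc -O3 -DNO_DEBUGGING -Iinclude -DFINALIZE_ON_DEMAND -fPIC -c -o alloc.o alloc.c"

only_in_72 = extra_options(CMD_72, CMD_71)

if __name__ == "__main__":
    for option in only_in_72:
        print(option)
```

On these abbreviated inputs the difference is -DALL_INTERIOR_POINTERS=1, -DATOMIC_UNCOLLECTABLE=1, -DNO_EXECUTE_PERMISSION=1 and -fexceptions, which are the candidates worth testing one at a time. A set difference ignores option order and repeated flags, which is good enough for spotting additions but would miss a flag whose value changed position-dependently.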


Manuel.Serrano | 4 Dec 2010 06:07

Re: Re[15]: Fwd: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

ALL_INTERIOR_POINTERS is not the reason for the performance difference, because
with 7.2alpha4, which shows the same slowdown, the Makefile issues commands
that do not pass that option, such as:

/bin/sh ./libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H   -I./include -I./include   -fexceptions
-O3  -DNO_DEBUGGING -Iinclude -Ilibatomic_ops-install/include -DFINALIZE_ON_DEMAND 
-I/misc/lab/bigloo/bench/3.5b-7.2alpha4/bigloo3.5b/lib/3.5b -fPIC -MT alloc.lo -MD -MP -MF
.deps/alloc.Tpo -c -o alloc.lo alloc.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I./include -I./include -fexceptions -O3 -DNO_DEBUGGING
-Iinclude -Ilibatomic_ops-install/include -DFINALIZE_ON_DEMAND
-I/misc/lab/bigloo/bench/3.5b-7.2alpha4/bigloo3.5b/lib/3.5b -fPIC -MT alloc.lo -MD -MP -MF
.deps/alloc.Tpo -c alloc.c  -fPIC -DPIC -o .libs/alloc.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I./include -I./include -fexceptions -O3 -DNO_DEBUGGING
-Iinclude -Ilibatomic_ops-install/include -DFINALIZE_ON_DEMAND
-I/misc/lab/bigloo/bench/3.5b-7.2alpha4/bigloo3.5b/lib/3.5b -fPIC -MT alloc.lo -MD -MP -MF
.deps/alloc.Tpo -c alloc.c -o alloc.o >/dev/null 2>&1
mv -f .deps/alloc.Tpo .deps/alloc.Plo

-- 
Manuel
_______________________________________________
Gc mailing list
Gc@...
http://www.hpl.hp.com/hosted/linux/mail-archives/gc/

Ivan Maidanski | 4 Dec 2010 07:56

Re[17]: Fwd: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hello Manuel,

1. Also, the command line for v72a2 includes -DNO_EXECUTE_PERMISSION -fexceptions.

2. Please also benchmark with the small patch I attached to my previous post.

Sat, 4 Dec 2010 06:07:27 +0100 Manuel.Serrano@...:

> ALL_INTERIOR is not the reason for the different performance because
> with 7.2alpha4 the Makefile issues commands such as:
> 
> /bin/sh ./libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H   -I./include
> -I./include   -fexceptions -O3  -DNO_DEBUGGING -Iinclude
> -Ilibatomic_ops-install/include -DFINALIZE_ON_DEMAND 
> -I/misc/lab/bigloo/bench/3.5b-7.2alpha4/bigloo3.5b/lib/3.5b -fPIC -MT alloc.lo
> -MD -MP -MF .deps/alloc.Tpo -c -o alloc.lo alloc.c
> libtool: compile:  gcc -DHAVE_CONFIG_H -I./include -I./include -fexceptions
> -O3 -DNO_DEBUGGING -Iinclude -Ilibatomic_ops-install/include
> -DFINALIZE_ON_DEMAND
> -I/misc/lab/bigloo/bench/3.5b-7.2alpha4/bigloo3.5b/lib/3.5b -fPIC -MT alloc.lo
> -MD -MP -MF .deps/alloc.Tpo -c alloc.c  -fPIC -DPIC -o .libs/alloc.o
> libtool: compile:  gcc -DHAVE_CONFIG_H -I./include -I./include -fexceptions
> -O3 -DNO_DEBUGGING -Iinclude -Ilibatomic_ops-install/include
> -DFINALIZE_ON_DEMAND
> -I/misc/lab/bigloo/bench/3.5b-7.2alpha4/bigloo3.5b/lib/3.5b -fPIC -MT alloc.lo
> -MD -MP -MF .deps/alloc.Tpo -c alloc.c -o alloc.o >/dev/null 2>&1
> mv -f .deps/alloc.Tpo .deps/alloc.Plo
> 

Sat, 4 Dec 2010 05:38:32 +0100 Manuel.Serrano@...:

Manuel.Serrano | 4 Dec 2010 08:02

Re: Re[17]: Fwd: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

> 1. Also, the command line for v72a2 includes -DNO_EXECUTE_PERMISSION -fexceptions.
In the meantime I have checked -fexceptions, which has no influence on
performance. I will check NO_EXECUTE_PERMISSION too.

-- 
Manuel

Ludovic Courtès | 7 Dec 2010 16:24

Re: Fwd: Performance of bdwgc7.2 had degraded compared to 6.8 - the patch to test

Hello,

Manuel.Serrano@... writes:

> Could the differences in the compilation options be the reason for the
> performance slowdown? In particular, what's the impact of
> -DALL_INTERIOR_POINTERS=1?

All-interior-pointers can increase execution time and/or heap size in my
experience.  Note that it can also be turned on/off at run time, by
setting ‘GC_all_interior_pointers’ before any call to GC_INIT.

Did you measure the impact of all-interior-pointers?

Thanks,
Ludo’.
