Joe Buck | 1 May 02:06 2010

Re: memcpy(p,p,len)

On Fri, Apr 30, 2010 at 08:29:19AM -0700, Richard Guenther wrote:
> On Fri, Apr 30, 2010 at 5:05 PM, Joe Buck <Joe.Buck <at> synopsys.com> wrote:
> > On Fri, Apr 30, 2010 at 07:30:33AM -0700, Mark Mielke wrote:
> >> Just a quick comment that Jan-Benedict's opinion is widely shared by the
> >> specification and by the Linux glibc manpage:
> >>
> >> DESCRIPTION
> >>         The  memcpy()  function  copies  n bytes from memory area src to memory
> >>         area dest.  The memory areas should not overlap.  Use memmove(3) if the
> >>         memory areas do overlap.
> >>
> >> It doesn't matter if it sometimes works. Sometimes works programs are
> >> sometimes doesn't work programs. :-)
> >
> > The typical memcpy function will fail for overlapping but unequal memory
> > ranges, but will work for src == dst.  Switching to memmove would degrade
> > performance, and that should only be done if there is an actual, rather
> > than a theoretical bug.  Note that for this use, it's not possible (if
> > the program is valid) for the ranges to overlap but be unequal.
> >
> > Another alternative is that instead of using memcpy, a specialized
> > function could be used that has the required property (the glibc
> > memcpy does).
> 
> Note that language semantics come in here as well.  The middle-end
> assumes that when an assignment is not BLKmode that the RHS
> will be read before the lhs will be written.  It does not assume so
> otherwise and the behavior is undefined for overlapping *p and *q
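
The distinction drawn in this thread (identical ranges versus partially overlapping ranges) can be sketched in C. This is only an illustration under the precondition Joe states, namely that the two ranges are either exactly equal or fully disjoint. The helper names copy_maybe_same and copy_any_overlap are hypothetical and are not part of GCC or glibc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC or glibc.
   Precondition (taken from the thread): the two ranges are either
   exactly equal or completely disjoint, never partially overlapping. */
void *copy_maybe_same(void *dst, const void *src, size_t len)
{
    if (dst == src)
        return dst;               /* self-copy: nothing to do */
    return memcpy(dst, src, len); /* disjoint by precondition */
}

/* Fallback for callers that cannot rule out partial overlap. */
void *copy_any_overlap(void *dst, const void *src, size_t len)
{
    return memmove(dst, src, len);
}
```

The explicit pointer comparison avoids relying on any particular memcpy implementation tolerating src == dst, at the cost of one branch; memmove remains the portable choice when overlap is genuinely possible.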

gcc.3.4.6 vs. gcc-4.3.2 re: pseudo instructions & bus error

Hi, I compile .c files using both gcc.3.4.6 and gcc-4.3.2 chaining to Sun's 
assembler "Sun Compiler Common 10 Patch 09/04/2007" in both cases:

  gcc -O3 -D_SOLARIS -D_SPARC -Wall -Wa,-xarch=v8plus -fexceptions -c ...

I run on a "SunOS 5.10 Generic_137111-08 sun4v sparc SUNW,T5240" box with 
16x1582MHz (UltraSPARC-T2+) processors. The gcc.3.4.6 build works. The gcc-4.3.2 
build cores with a BUS error.

How can I turn pseudo assembler instructions off and/or what switch can I add 
to the gcc line to make this BUS error go away? Either the assembler is wrong 
and/or gcc aligned a variable on the wrong boundary. Details follow.

==========
The key difference appears to be that 3.4.6 does not emit pseudo instructions
that operate on extended 8-byte words, while 4.3.2 does. In this case 4.3.2
emits stx, which shows up as clrx in objdump of the .o file. That particular
code fragment is the assembler equivalent of "tm->counter = 0", except that
counter is on a 4-byte boundary, hence the bus error:

    .align 4
    .global init
    .type   init, #function
    .proc   012
init:
.LLFB4:
    save    %sp, -112, %sp
.LLCFI9:
    st  %g0, [%i0+28]
    st  %g0, [%i0+32]

Ian Lance Taylor | 1 May 09:01 2010

Re: gcc.3.4.6 vs. gcc-4.3.2 re: pseudo instructions & bus error

"SHANE MILLER, BLOOMBERG/ 731 LEXIN" <smiller53 <at> bloomberg.net> writes:

> Hi, I compile .c files using both gcc.3.4.6 and gcc-4.3.2 chaining to Sun's 
> assembler "Sun Compiler Common 10 Patch 09/04/2007" in both cases:
>
>   gcc -O3 -D_SOLARIS -D_SPARC -Wall -Wa,-xarch=v8plus -fexceptions -c ...

> How can I turn pseudo assembler instructions off and/or what switch can I add 
> to the gcc line to make this BUS error go away? Either the assembler is wrong 
> and/or gcc aligned a variable on the wrong boundary. Details follow.

>     stx %g0, [%i0+56]                  <-  BUS ERROR HERE. 8 byte operation
>     mov %i1, %o0                           on memory on a non-8 byte boundary

> Contrast with the 3.4.6 output:

>     mov 0, %o4                         <- no stx: this works
>     mov 0, %o5

This message is inappropriate for the gcc <at> gcc.gnu.org mailing list,
which is for gcc development.  It would be appropriate for the
gcc-help <at> gcc.gnu.org mailing list.  Please take any followups to
gcc-help.  Thanks.

gcc will only use the stx instruction when it believes that the input
is 8 byte aligned.  So either gcc is confused or your code is
incorrect.  It's hard to say anything else without seeing some of your
code.
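
One way for code to be "incorrect" in the sense above is to store a 64-bit value through a pointer that is only 4-byte aligned, for example after casting a byte buffer to a struct pointer. The sketch below is not taken from the poster's code; is_aligned and clear_u64_unaligned are hypothetical names. It shows how to check alignment and how to perform the store without promising the compiler any alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if p is a multiple of align (align must be a power of two). */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Zero a 64-bit value at an address of unknown alignment.
   memcpy makes no alignment promise to the compiler, so it may only
   use a single 8-byte store when it can prove that this is safe. */
void clear_u64_unaligned(void *p)
{
    const uint64_t zero = 0;
    memcpy(p, &zero, sizeof zero);
}
```

A check such as is_aligned(&tm->counter, 8) just before the faulting store would be a quick way to tell whether the address really is misaligned at run time.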

A possibly relevant command line option is -mno-unaligned-doubles,

Jan Hubicka | 1 May 11:36 2010

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

> 
> Vortex needs -fno-strict-aliasing.  It casts between two record types
> with one record being a 'prefix' of another.
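
For readers unfamiliar with the pattern, the "prefix" cast described above can be sketched as follows. The type and function names are hypothetical and are not taken from the vortex sources; the second variant shows one layout that ISO C does guarantee, as a contrast.

```c
/* Hypothetical record types for illustration; not from the vortex sources. */
struct header { int kind; int id; };
struct record { int kind; int id; double payload; };  /* repeats header's members */

/* The risky pattern: struct record and struct header are distinct types,
   so with -fstrict-aliasing the compiler may assume that accesses made
   through one do not alias accesses made through the other. */
int id_via_cast(struct record *r)
{
    return ((struct header *)r)->id;
}

/* A layout that ISO C does guarantee: embed the prefix as the first
   member. A pointer to a struct may be converted to a pointer to its
   first member and back. */
struct record2 { struct header head; double payload; };

int id_via_first_member(struct record2 *r)
{
    struct header *h = (struct header *)r;
    return h->id;
}
```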

So today's runs are complete.  Thanks to Richi, who fixed the ICE in symtab
merging that affected perl and GCC.  With vortex, the problem was that, in
addition to needing -fno-strict-aliasing, it writes to closed files, which
causes an ICE depending on the particular glibc version.

Comparing http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-fdo-64-FDO/recent.html
vortex is 2036 with -O2 -flto, 2438 with -O2 -flto and FDO (so about 20% improvement)
http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-head-64/list.html
has -O2 runs without LTO that is 1859, so 31% for LTO+FDO, 10% LTO.
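
The percentages above follow directly from the raw scores. As a quick check, a small helper (percent_gain is a hypothetical name) gives about 9.5% for LTO over plain -O2 (1859 to 2036), about 31.1% for LTO+FDO over plain -O2 (1859 to 2438), and about 19.7% for adding FDO on top of LTO (2036 to 2438), matching the rounded figures quoted.

```c
/* Relative improvement of `score` over `base`, in percent.
   SPEC ratios are "higher is better". */
double percent_gain(double base, double score)
{
    return (score / base - 1.0) * 100.0;
}
```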

Any idea if it is one of value transforms or just edge profile making the
difference?  There are some cases of write only globals we can constant
propagate with -fwhole-program in SPEC, but I think it is parser.

Honza
> 
> David
> 
> 
> 
> > Honza
> >

redriver jiang | 1 May 12:14 2010

Re: The usage of the "clobber "match_scratch""

Thanks.

And for
>>But I can't see any reason to allocate a fixed scratch register.

The ACC register I use here is not a fixed register for GCC. I have made the
ACC register suitable for QImode operands only.

2010/4/27 Ian Lance Taylor <iant <at> google.com>:
> redriver jiang <jiang.redriver <at> gmail.com> writes:
>
>>  test3.c:27: error: insn does not satisfy its constraints:
>> (insn 52 51 32 0 (parallel [
>>             (set (reg:HI 16 BASE0)
>>                 (plus:HI (reg:HI 16 BASE0)
>>                     (const_int -2 [0xfffffffe])))
>>             (clobber (scratch:QI))
>>         ]) 9 {*addhi3} (nil)
>>     (expr_list:REG_EQUIV (plus:HI (reg/f:HI 20 BASE2)
>>             (const_int -2 [0xfffffffe]))
>>         (nil)))
>> test3.c:27: internal compiler error: in reload_cse_simplify_operands,
>> at postreload.c:391
>
> Looks like this insn didn't get a register at all.  Reload can
> sometimes generate add insns directly, which could perhaps cause this
> to happen.
>
>
>> I think I may not understand the usage of "match_scratch" properly,

Richard Sandiford | 1 May 19:22 2010

Re: split lui_movf pattern on mips?

"Amker.Cheng" <amker.cheng <at> gmail.com> writes:
> Hi,
>    There is a comment on lui_movf in mips.md like the following:
>
> ;; because we don't split it.  FIXME: we should split instead.
>
> I can split it into a move and a condmove (movesi_on_cc) insn, like
>
> (define_split
>  [(set (match_operand:CC 0 "d_operand" "")
>        (match_operand:CC 1 "fcc_reload_operand" ""))]
>  "reload_completed && ISA_HAS_8CC && TARGET_HARD_FLOAT && ISA_HAS_CONDMOVE
>  && !CANNOT_CHANGE_MODE_CLASS(CCmode, SImode, REGNO_REG_CLASS(REGNO(operands[0])))"
>  [(set (match_dup 2) (match_dup 3))
>   (set (match_dup 2)
>        (if_then_else:SI
>           (eq:SI (match_dup 1)
>                  (match_dup 4))
>           (match_dup 2)
>           (match_dup 4)))]
>  "
>  {
>    operands[2] = gen_rtx_REG(SImode, REGNO(operands[0]));
>    operands[3] = GEN_INT(0x3f800000);
>    operands[4] = const0_rtx;
>  }
>  ")
>

gccadmin | 2 May 00:48 2010

gcc-4.6-20100501 is now available

Snapshot gcc-4.6-20100501 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20100501/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 158965

You'll find:

gcc-4.6-20100501.tar.bz2              Complete GCC (includes all of below)
gcc-core-4.6-20100501.tar.bz2         C front end and core compiler
gcc-ada-4.6-20100501.tar.bz2          Ada front end and runtime
gcc-fortran-4.6-20100501.tar.bz2      Fortran front end and runtime
gcc-g++-4.6-20100501.tar.bz2          C++ front end and runtime
gcc-java-4.6-20100501.tar.bz2         Java front end and runtime
gcc-objc-4.6-20100501.tar.bz2         Objective-C front end and runtime
gcc-testsuite-4.6-20100501.tar.bz2    The GCC testsuite

Diffs from 4.6-20100424 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

Xinliang David Li | 2 May 09:04 2010

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

On Sat, May 1, 2010 at 2:36 AM, Jan Hubicka <hubicka <at> ucw.cz> wrote:
>>
>> Vortex needs -fno-strict-aliasing.  It casts between two record types
>> with one record being a 'prefix' of another.
>
> So today's runs are complete.  Thanks to Richi, who fixed the ICE in symtab
> merging that affected perl and GCC.  With vortex, the problem was that, in
> addition to needing -fno-strict-aliasing, it writes to closed files, which
> causes an ICE depending on the particular glibc version.
>
> Comparing http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-fdo-64-FDO/recent.html
> vortex is 2036 with -O2 -flto, 2438 with -O2 -flto and FDO (so about 20% improvement)
> http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-head-64/list.html
> has -O2 runs without LTO that is 1859, so 31% for LTO+FDO, 10% LTO.
>
> Any idea if it is one of value transforms or just edge profile making the
> difference?  There are some cases of write only globals we can constant
> propagate with -fwhole-program in SPEC, but I think it is parser.
>

I got the following numbers for O2, FDO, and LIPO : 2351, 2761 (17%), 3448 (24%).

The FDO improvement over O2 comes from both edge profile and vpt
(div,rem). With FDO, one of the important loops in Part_Delete may get
tail duplicated, which helps performance.
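
As a rough illustration of what a value-profile (vpt) transform for division amounts to, here is a hand-written C equivalent. It is a sketch only: the function names are hypothetical, and the hot divisor value 16 is made up for the example rather than taken from vortex.

```c
/* Generic division: the divisor is unknown at compile time. */
unsigned div_generic(unsigned x, unsigned d)
{
    return x / d;
}

/* Hand-written equivalent of what a value-profile transform might
   produce, assuming the profile showed d == 16 on most calls
   (16 is a made-up value for illustration). */
unsigned div_specialized(unsigned x, unsigned d)
{
    if (d == 16)
        return x / 16;  /* constant divisor: can be lowered to a shift */
    return x / d;       /* fallback keeps the original semantics */
}
```

Both functions return the same result for every nonzero divisor; the specialized one simply gives the compiler a constant to work with on the hot path.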

The LIPO improvement mainly comes from cross-module inlining of the hot
functions Mem_GetWord, Mem_GetAddr, Chunk_ChkGetChunk.

David

Jan Hubicka | 2 May 15:45 2010

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

> On Sat, May 1, 2010 at 2:36 AM, Jan Hubicka <hubicka <at> ucw.cz> wrote:
> >>
> >> Vortex needs -fno-strict-aliasing.  It casts between two record types
> >> with one record being a 'prefix' of another.
> >
> > So today's runs are complete.  Thanks to Richi, who fixed the ICE in symtab
> > merging that affected perl and GCC.  With vortex, the problem was that, in
> > addition to needing -fno-strict-aliasing, it writes to closed files, which
> > causes an ICE depending on the particular glibc version.
> >
> > Comparing http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-fdo-64-FDO/recent.html
> > vortex is 2036 with -O2 -flto, 2438 with -O2 -flto and FDO (so about 20% improvement)
> > http://gcc.opensuse.org/SPEC/CINT/sb-frescobaldi.suse.de-head-64/list.html
> > has -O2 runs without LTO that is 1859, so 31% for LTO+FDO, 10% LTO.
> >
> > Any idea if it is one of value transforms or just edge profile making the
> > difference?  There are some cases of write only globals we can constant
> > propagate with -fwhole-program in SPEC, but I think it is parser.
> >
> 
> I got the following numbers for O2, FDO, and LIPO : 2351, 2761 (17%), 3448 (24%).
> 
> The FDO improvement over O2 comes from both edge profile and vpt
> (div,rem). With FDO, one of the important loops in Part_Delete may get

I see.  I am particularly interested in the div/rem transform.  With LTO such
things are sometimes doable at compile time (by propagating that the divisor is
a known constant value).  We currently do no constant propagation across global
variables except for a simple check of whether a variable is read-only and
initialized.  It would be possible to be a bit smarter here and look for vars
that are only used

renato.astorino | 2 May 16:34 2010

Function definition within function

Dear all,

Without wishing to be pedantic or to cause controversy, I wonder whether the
following statements are incorrect, whether I am wrong in my interpretation,
or whether this is a hidden feature of the gcc compiler.

In the book 'The C Programming Language', second edition, by Brian W.
Kernighan and Dennis M. Ritchie, the following statement is made:

'4.8 Block Structure

C is not a block-structured language in the sense of Pascal or similar
languages, because functions may not be defined within other functions.'

In the book 'C: The Complete Reference', fourth edition, by Herbert Schildt,
the following statement is made in a note:

'The reason that C is not, technically, a block-structured language is that
block-structured languages permit procedures or functions to be declared
inside other functions or procedures. However, since C does not allow the
creation of functions within functions, it cannot be formally called
block-structured.'
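
The rest of this message is cut off in the archive, so what follows is an assumption about the behaviour being asked about. GCC does accept function definitions inside other functions as a GNU C extension ("nested functions"); ISO C, as the two books say, does not. A minimal sketch, with the hypothetical names sum_of_squares and square:

```c
/* GNU C extension: accepted by gcc in its default GNU mode,
   but not valid ISO C. */
int sum_of_squares(const int *v, int n)
{
    /* Nested function: visible only inside sum_of_squares. */
    int square(int x) { return x * x; }

    int total = 0;
    for (int i = 0; i < n; i++)
        total += square(v[i]);
    return total;
}
```

Compiling with -pedantic makes gcc diagnose the nested definition as non-ISO, which is one way to confirm that the books describe the language while the compiler is offering an extension.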
