Ahmed Taj elsir | 27 Feb 08:21 2015

clang/clang++ 3.6.0 can't find C/C++ headers on Windows?

An hour ago I downloaded llvm-3.6.0-rc4-win32.exe from
http://llvm.org/pre-releases/3.6.0/ .

I tried to compile a simple C program that just prints "hello", but it
didn't compile because clang.exe can't find the header. When I use
clang-cl.exe with the same code, it works.

I have the same problem with clang++, even when I add an -I flag pointing
to the GCC (4.9.1) C++ headers. The result:

	C:\Users\One\Desktop>clang++ -I c:\MinGW\x86_64-w64-mingw32\include\c++ main.cpp -lib=libstdc++

	In file included from main.cpp:1:
	c:\MinGW\x86_64-w64-mingw32\include\c++\iostream:38:10: fatal error: 'bits/c++config.h' file not found
	#include <bits/c++config.h>
	1 error generated.

I found bits/c++config.h in the MinGW folders and added its directory, along with others, to the -I flags.

The result is a link error:

	C:/MinGW/bin/../lib/gcc/x86_64-w64-mingw32/4.9.1/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lib=libstdc++
	C:/MinGW/bin/../lib/gcc/x86_64-w64-mingw32/4.9.1/../../../../x86_64-w64-mingw32/bin/ld.exe: skipping incompatible C:/MinGW/bin/../lib/gcc/x86_64-w64-mingw32/4.9.1//libstdc++.dll.a when searching for -lstdc+
	C:/MinGW/bin/../lib/gcc/x86_64-w64-mingw32/4.9.1/../../../../x86_64-w64-mingw32/

Shankar Easwaran | 27 Feb 06:08 2015

Could we enable FileArchive preload later?

It looks like the buildbot runs hit std::future_error errors when
executing a lot of tests. Could we disable it for now to keep the
buildbot clean?

http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-ubuntu-13.04/builds/26755/steps/test/logs/stdio
has more information.

Shankar Easwaran

--

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
Jyoti Rajendra Allur | 26 Feb 15:54 2015

SAFECode testsuite query

Hello All,
I am exploring what benefits SAFECode has to offer over the Clang Static Analyzer and LLVM's instrumentation
tools such as MemorySanitizer and AddressSanitizer.
I could come up with the following, which are not provided by ASan/MSan/the Clang Static Analyzer:
-> dangling pointer error detection
-> crashes in system libraries due to security vulnerabilities.

In the process, I wanted to run the test suites of SAFECode and poolalloc, but could not find any documentation
on how to run them. It would be great if someone could point me to it.

Also, were there any design scalability issues due to which SAFECode has not been supported beyond LLVM 3.2?
If so, could someone let me know about them?

Thanks.

Regards,
Jyoti Allur

Nema, Ashutosh | 26 Feb 11:31 2015

RFC: Loop versioning for LICM

I would like to propose a new loop multi-versioning optimization for LICM.
For now I have kept this for LICM only, but it can be used in multiple places.

The main motivation is to enable optimizations that are stuck because of memory
alias dependencies. Most of the time, when alias analysis is unsure about a
memory access, it says may-alias. This uncertainty from alias analysis prevents
some memory-based optimizations from proceeding.
We observed some cases with LICM where things are beyond aliasing.
In cases where alias analysis is unsure, we would like to use loop versioning as an alternative.

Loop versioning creates one version of the loop with aggressive alias assumptions and
another with conservative (default) alias assumptions. The aggressive version has all its
memory accesses marked as no-alias. The two versions of the loop are preceded by a
runtime memory check. This check consists of bound checks for all unique memory
accessed in the loop, and it determines whether the memory aliases. Based on the result of
this check at runtime, one of the two loops gets executed: if the memory does not alias, the
aggressive version runs; otherwise the conservative version runs.

Marking the memory accessed in the aggressive version of the loop as no-alias enables other
optimizations to proceed.

 

Following are the top-level steps:

1) Perform the loop versioning feasibility check.
2) If the loop is a candidate for versioning, create a memory bound check covering
     all the memory accesses in the loop body.
3) Clone the original loop and mark all memory accesses in the new loop as no-alias.
4) Make the original loop and the versioned loop the branch targets of the runtime check result.
5) Run LICM on the aggressive-alias version of the loop (for now LICM is scheduled later and is
     not called directly from the LoopVersioning pass).

 

Consider the following test:

     1  int foo(int * var1, int * var2, int * var3, unsigned itr) {
     2    unsigned i = 0, j = 0;
     3    for(; i < itr; i++) {
     4      for(; j < itr; j++) {
     5        var1[j] = itr + i;
     6        var3[i] = var1[j] + var3[i];
     7      }
     8    }
     9  }

At line #6, the store to var3 can be moved out of the loop by LICM (promoteLoopAccessesToScalars),
but because alias analysis is unsure about the memory accesses, LICM is unable to move it out.
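
As a rough source-level illustration (not the code from the patch), the inner loop of the test could be versioned by hand as below. The helper `rangesDisjoint` and the function names `innerOriginal` and `innerVersioned` are hypothetical, and the check shown is only one plausible form of a runtime bound check; it assumes `var1` points to at least `itr` elements and `var3` to at least `i + 1` elements.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper, not taken from the patch: true when the ranges
// [a, a + n) and [b, b + m) share no element.
static bool rangesDisjoint(const int *a, unsigned n, const int *b, unsigned m) {
  std::less_equal<const int *> le; // well-defined order even for unrelated pointers
  return le(a + n, b) || le(b + m, a);
}

// Inner loop of the test for one fixed i, as written in the source.
void innerOriginal(int *var1, int *var3, unsigned itr, unsigned i) {
  for (unsigned j = 0; j < itr; j++) {
    var1[j] = itr + i;
    var3[i] = var1[j] + var3[i];
  }
}

// The same loop, versioned by hand behind a runtime check.
void innerVersioned(int *var1, int *var3, unsigned itr, unsigned i) {
  assert(var1 && var3);
  if (itr == 0)
    return; // nothing to do, and no speculative access to var3[i]
  if (rangesDisjoint(var1, itr, var3 + i, 1)) {
    // Aggressive version: var3[i] cannot overlap var1[0..itr), so it is
    // kept in a scalar and stored once after the loop.
    int acc = var3[i];
    for (unsigned j = 0; j < itr; j++) {
      var1[j] = itr + i;
      acc = var1[j] + acc;
    }
    var3[i] = acc;
  } else {
    // Conservative version: identical to the original loop.
    innerOriginal(var1, var3, itr, i);
  }
}
```

Calling `innerVersioned` with two separate arrays takes the aggressive path, while passing the same array for both pointers falls back to the conservative path; in both cases the result should match `innerOriginal`.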

 

IR after loop versioning:

<Versioned Loop>
for.body3.loopVersion:                            ; preds = %for.body3.loopVersion.preheader, %for.body3.loopVersion
  %indvars.iv.loopVersion = phi i64 [ %indvars.iv.next.loopVersion, %for.body3.loopVersion ], [ %2, %for.body3.loopVersion.preheader ]
  %arrayidx.loopVersion = getelementptr inbounds i32* %var1, i64 %indvars.iv.loopVersion
  store i32 %add, i32* %arrayidx.loopVersion, align 4, !tbaa !1, !alias.scope !11, !noalias !11
  %indvars.iv.next.loopVersion = add nuw nsw i64 %indvars.iv.loopVersion, 1
  %lftr.wideiv.loopVersion = trunc i64 %indvars.iv.loopVersion to i32
  %exitcond.loopVersion = icmp eq i32 %lftr.wideiv.loopVersion, %0
  br i1 %exitcond.loopVersion, label %for.inc11.loopexit38, label %for.body3.loopVersion

<Original Loop>
for.body3:                                        ; preds = %for.body3.lr.ph, %for.body3
  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body3 ], [ %2, %for.body3.lr.ph ]
  %arrayidx = getelementptr inbounds i32* %var1, i64 %indvars.iv
  store i32 %add, i32* %arrayidx, align 4, !tbaa !1
  %8 = load i32* %arrayidx7, align 4, !tbaa !1
  %add8 = add nsw i32 %8, %add
  store i32 %add8, i32* %arrayidx7, align 4, !tbaa !1
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %lftr.wideiv = trunc i64 %indvars.iv to i32
  %exitcond = icmp eq i32 %lftr.wideiv, %0
  br i1 %exitcond, label %for.inc11, label %for.body3

In the versioned loop the difference is visible: one store has been moved out.

 

Following are some high-level details about the current implementation:

- LoopVersioning
  LoopVersioning is the main class, which holds the multi-versioning functionality.

- LoopVersioning::isVersioningBeneficial
  A member of LoopVersioning. Performs the feasibility check for loop versioning:
  a) Checks the layout of the loop.
  b) Instruction-level check.
  c) Memory checks.

- LoopVersioning::versionizeLoop
  a) Clones the original loop.
  b) Creates a runtime memory check.
  c) Adds both loops as targets of the runtime check result.

- RuntimeMemoryCheck
  This class takes care of the runtime memory check.

- RuntimeMemoryCheck::createRuntimeCheck
  Creates the runtime memory check.

This patch uses a maximum loop nest threshold of 2 and a maximum of 5
pointers in the runtime memory check.

Later I would like to make this a utility so others can use it.

Please go through the patch for the detailed approach.
Patch available at http://reviews.llvm.org/D7900

Suggestions and comments are welcome.

Regards,
Ashutosh

_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
martin krastev | 26 Feb 09:54 2015

MCJIT generating loads of just-stored constants

Hello,

I end up with the following IR, exhibiting an apparent missed
optimisation opportunity, namely loading of just-stored constants:

...
  %5 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 0
  store i32 1, i32* %5, align 4
  %6 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 1
  store i32 1, i32* %6, align 4
  %7 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 2
  store i32 0, i32* %7, align 4
  %8 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 6
  store i32 2, i32* %8, align 4
  %9 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 8
  store i32 2, i32* %9, align 4
  %10 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 10
  store i32 16, i32* %10, align 4
  %11 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 11
  store i32 16, i32* %11, align 4
  %12 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 12
  store i32 0, i32* %12, align 4
  %13 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 13
  store i32 0, i32* %13, align 4
  %14 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 15
  store i32 8, i32* %14, align 4
  %15 = getelementptr inbounds %class.A* %self, i32 0, i32 9, i32 17
  store i32 0, i32* %15, align 4
  %16 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 0
  %17 = load i32* %16, align 4
  %18 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 3
  %19 = load float* %18, align 4
  %20 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 4
  %21 = load float* %20, align 4
  %22 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 5
  %23 = load float* %22, align 4
  %24 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 6
  %25 = load i32* %24, align 4
  %26 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 7
  %27 = load float* %26, align 4
  %28 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 8
  %29 = load i32* %28, align 4
  %30 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 9
  %31 = load float* %30, align 4
  %32 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 10
  %33 = load i32* %32, align 4
  %34 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 11
  %35 = load i32* %34, align 4
  %36 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 13
  %37 = load i32* %36, align 4
  %38 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 14
  %39 = load float* %38, align 4
  %40 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 15
  %41 = load i32* %40, align 4
  %42 = getelementptr inbounds %class.A* %self, i64 0, i32 9, i32 16
  %43 = load float* %42, align 4
...

The above happens after a callee gets inlined - all the stores are
from the caller, and the loads are from the inlined callee. Please
note the partial overlap between stored and loaded fields.
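
For readers who want the pattern at source level, here is a minimal sketch of the situation described above. The types `Params` and `A`, their field names, and the bodies of `foo` and `bar` are invented for illustration and are not the original code; only the shape (wrapper stores literals, inlined callee immediately loads some of the same fields) follows the description.

```cpp
#include <cassert>

// Hypothetical stand-in for the nested member at field index 9 of class A;
// the field names and types here are invented for illustration.
struct Params {
  int mode;    // stored by the wrapper, then loaded by foo()
  float scale; // loaded by foo() only
  int count;   // stored by the wrapper, then loaded by foo()
  int flags;   // stored by the wrapper only
};

struct A {
  Params p;
  // Starts by reading fields, like A::foo() in the description.
  int foo() const { return p.mode + p.count + static_cast<int>(p.scale); }
};

// Wrapper: stores literals to most fields, then calls foo() on the same object.
int bar(A &self) {
  self.p.mode = 1;
  self.p.count = 2;
  self.p.flags = 0;
  // Once foo() is inlined, the loads of mode and count read values that were
  // just stored, so they could in principle be replaced by 1 and 2;
  // scale was not stored here and still has to be loaded.
  return self.foo();
}
```

The partial overlap is the point: only the loads whose fields were stored by the wrapper are candidates for forwarding, while the remaining loads must stay.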

The general steps leading to the above:

1. Load a module containing a function A::foo(), which function starts
with reading fields from an object of class A.
2. Add to the module a wrapper function bar() which takes as an
argument an object of class A, stores literals to (most of the) fields
of the object, then calls A::foo() with the same object.
3. Update the original A::foo() with an AlwaysInline attribute.
4. Pass the module to MCJIT from clang 3.4.2, set up as:

...
                llvm::PassRegistry &registry = *llvm::PassRegistry::getPassRegistry();
                llvm::initializeCore(registry);
                llvm::initializeScalarOpts(registry);
                llvm::initializeObjCARCOpts(registry);
                llvm::initializeVectorization(registry);
                llvm::initializeIPO(registry);
                llvm::initializeAnalysis(registry);
                llvm::initializeIPA(registry);
                llvm::initializeTransformUtils(registry);
                llvm::initializeInstCombine(registry);
                llvm::initializeTarget(registry);
                llvm::initializeCodeGen(registry);
                llvm::initializeLoopStrengthReducePass(registry);
                llvm::initializeLowerIntrinsicsPass(registry);
                llvm::initializeUnreachableBlockElimPass(registry);

                llvm::TargetOptions opt;
                opt.PositionIndependentExecutable = false;

                const std::string& triple = llvm::sys::getProcessTriple();
                const std::string& hostcpu = llvm::sys::getHostCPUName();
                const std::string& features = "";
                std::string error;
                const llvm::Target *const target = llvm::TargetRegistry::lookupTarget(triple, error);

                llvm::TargetMachine *const tm = target->createTargetMachine(
                    triple, hostcpu, features, opt,
                    llvm::Reloc::Default,
                    llvm::CodeModel::JITDefault,
                    llvm::CodeGenOpt::Aggressive);

                // Set up IR pass management
                llvm::FunctionPassManager fpm(module);
                llvm::PassManager pm;

                tm->addAnalysisPasses(pm);
                tm->addAnalysisPasses(fpm);

                // Use a pass manager builder for C-style optimisations
                llvm::PassManagerBuilder passBuilder;
                passBuilder.OptLevel = 3;
                passBuilder.SizeLevel = 0;
                passBuilder.Inliner = llvm::createAlwaysInlinerPass(false); // suppress llvm.lifetime.* intrinsics
                passBuilder.BBVectorize = true;
                passBuilder.SLPVectorize = true;
                passBuilder.LoopVectorize = true;
                passBuilder.LateVectorize = true;

                passBuilder.populateFunctionPassManager(fpm);
                passBuilder.populateModulePassManager(pm);

                fpm.doInitialization();
                for (llvm::Module::iterator it = module->begin(), endit = module->end(); it != endit; ++it) {
                    fpm.run(*it);
                }
                fpm.doFinalization();
                pm.run(*module);

                execEngine = llvm::EngineBuilder(module).setEngineKind(llvm::EngineKind::JIT).setUseMCJIT(true).create(tm);
                execEngine->finalizeObject();
...

I guess there's something obvious I'm missing in the MCJIT setup that
leads to these results. Any hints are greatly appreciated.

Regards,
Martin
gamma_chen | 26 Feb 05:59 2015

llvm.eh.return?

Can someone tell me which C/C++ statement makes clang generate the LLVM IR intrinsic @llvm.eh.return.i32? Or is there any statement in another high-level language that produces @llvm.eh.return.i32?

Jonathan
 
Francois Pichet | 26 Feb 05:18 2015

Re: PSA: clang-cl can self host now!



On Wed, Feb 25, 2015 at 11:12 PM, Zachary Turner <zturner <at> google.com> wrote:
I believe this may actually be possible, but honestly I haven't tested it.

But maybe you can be a guinea pig :)  If you've installed the LLVM toolchain (so that it shows up as an option in Visual Studio's project settings), then you can run cmake -G "Visual Studio 12" -T LLVM-vs2013



OK, I hadn't installed the VS LLVM toolchain; that's why it was still using cl.exe. But it works with ninja!


Ahmed Bougacha | 26 Feb 01:57 2015

[RFC] AArch64: Should we disable GlobalMerge?

Hi all,

I've started looking at the GlobalMerge pass, enabled by default on
ARM and AArch64.  I think we should reconsider that, at least for
AArch64.

As is, the pass just merges all globals together, in groups of 4KB
(AArch64, 128B on ARM).

At the time it was enabled, the general thinking was "it's almost
free, it doesn't affect performance much, we might as well use it".
Now, it's preventing some link-time optimizations (as acknowledged in
one of the FIXMEs).

-- Performance impact
Overall, it isn't that profitable on the test-suite, and actually
degrades performance on a lot of other - "non-benchmark" - projects I
tried (where the main reason to use a global is file- or function-
static variables, only accessed through a single getter function).
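
To make the transformation concrete, here is a small conceptual sketch in C++ of what merging globals amounts to, together with the getter pattern mentioned above. The pass itself works on IR, not on source; every name below (`hits`, `limit`, `MergedGlobals`, `recordHit`, `counter`) is made up for illustration.

```cpp
#include <cassert>

// Before merging: two independent file-static globals, each with its own address.
static int hits = 0;
static int limit = 3;

bool recordHit() { return ++hits <= limit; }

// After merging (conceptually): the same variables become fields of one
// aggregate, addressed as a single base plus small constant offsets.
// The name MergedGlobals is illustrative only.
static struct {
  int hits;
  int limit;
} MergedGlobals = {0, 3};

bool recordHitMerged() { return ++MergedGlobals.hits <= MergedGlobals.limit; }

// The pattern mentioned above: a function-static variable reached only through
// a single getter, so it is never used together with any other global.
int &counter() {
  static int value = 0;
  return value;
}
```

In `recordHitMerged` both fields can share one base address, which is the case merging is meant to help; the variable behind `counter` is only ever accessed on its own, so sharing a base with other globals offers no such saving there.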

Across several runs on the entire test-suite, when disabling the pass,
I measured:
without LTO, a -0.19% geomean improvement
with LTO, a +0.11% geomean regression.

As for just SPEC2006, there are two big regressions: 400.perlbench
(10.6% w/ LTO, 2.7% w/o) and 471.omnetpp (2.3% w/, 3.9% w/o).

Numbers are attached.

-- A way forward
One obvious way to improve it is: look at uses of globals, and try to
form sets of globals commonly used together.  The tricky part is to
define heuristics for "commonly".  Also, the pass then becomes much
more expensive.  I'm currently looking into improving it, and will
report if I come up with a good solution.  But this shouldn't stop us
from disabling it, for now.

Also, the pass seems like a good candidate for
-O3/CodeGenOpt::Aggressive.  However, the latter is implied by LTO,
which IMO shouldn't include these not-always-profitable optimizations.
That's another problem though.

Right now, I think we should disable the pass by default, until it's
deemed profitable enough.

-Ahmed
Shankar Easwaran | 25 Feb 18:40 2015

[lld][PECOFF] assert from lld once in 5 test runs.

Hi Rui,

Not sure if you have seen this problem, but I have been running into
it when I run the lld tests. The failure occurs about once in every
5 runs.

lld: ../tools/lld/lib/Core/Resolver.cpp:402: void lld::Resolver::deadStripOptimize(): Assertion `symAtom' failed.
#0 0x4b05ae llvm::sys::PrintStackTrace(_IO_FILE*) /usr2/seaswara/work/llvmorg/llvm-build/../lib/Support/Unix/Signals.inc:422:15
#1 0x4b136b PrintStackTraceSignalHandler(void*) /usr2/seaswara/work/llvmorg/llvm-build/../lib/Support/Unix/Signals.inc:481:1
#2 0x4b2ee4 SignalHandler(int) /usr2/seaswara/work/llvmorg/llvm-build/../lib/Support/Unix/Signals.inc:198:60
#3 0x7fd12664bcb0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0xfcb0)
#4 0x7fd12587d0d5 gsignal /build/buildd/eglibc-2.15/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:64:0
#5 0x7fd12588083b abort /build/buildd/eglibc-2.15/stdlib/abort.c:93:0
#6 0x7fd125875d9e __assert_fail_base /build/buildd/eglibc-2.15/assert/assert.c:55:0
#7 0x7fd125875e42 (/lib/x86_64-linux-gnu/libc.so.6+0x2ee42)
#8 0x76ff7f lld::Resolver::deadStripOptimize() /usr2/seaswara/work/llvmorg/llvm-build/../tools/lld/lib/Core/Resolver.cpp:402:5
#9 0x770be5 lld::Resolver::resolve() /usr2/seaswara/work/llvmorg/llvm-build/../tools/lld/lib/Core/Resolver.cpp:481:7
#10 0x4538f4 lld::Driver::link(lld::LinkingContext&, llvm::raw_ostream&) /usr2/seaswara/work/llvmorg/llvm-build/../tools/lld/lib/Driver/Driver.cpp:108:8
#11 0x425c25 lld::WinLinkDriver::linkPECOFF(int, char const**, llvm::raw_ostream&) /usr2/seaswara/work/llvmorg/llvm-build/../tools/lld/lib/Driver/WinLinkDriver.cpp:879:10
#12 0x42238a lld::UniversalDriver::link(int, char const**, llvm::raw_ostream&) /usr2/seaswara/work/llvmorg/llvm-build/../tools/lld/lib/Driver/UniversalDriver.cpp:207:12
#13 0x421e76 main /usr2/seaswara/work/llvmorg/llvm-build/../tools/lld/tools/lld/lld.cpp:35:10
#14 0x7fd12586876d __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:258:0
#15 0x421d35 _start (/local/mnt/workspace/shankare/work/llvmorg/llvm-build/bin/lld+0x421d35)
Stack dump:
0.      Program arguments: lld /out:/local/mnt/workspace/shankare/work/llvmorg/llvm-build/tools/lld/test/pecoff/Output/export.test.tmp5.dll /dll /entry:init /export:exportfn7 -- /local/mnt/workspace/shankare/work/llvmorg/llvm-build/tools/lld/test/pecoff/Output/export.test.tmp.obj -flavor link
/local/mnt/workspace/shankare/work/llvmorg/llvm-build/tools/lld/test/pecoff/Output/export.test.script: line 18: 16942 Aborted                 lld -flavor link /out:/local/mnt/workspace/shankare/work/llvmorg/llvm-build/tools/lld/test/pecoff/Output/export.test.tmp5.dll /dll /entry:init /export:exportfn7 -- /local/mnt/workspace/shankare/work/llvmorg/llvm-build/tools/lld/test/pecoff/Output/export.test.tmp.obj

Shankar Easwaran

Sridhar G | 25 Feb 06:58 2015

Calculate LoopInfo again

Hello,
I am iterating over loops using the LoopInfo iterator. If a condition holds for some loop, I change the loop body and remove it from that particular loop. I also remove all incoming edges to the loop header, latch, and preheader, so these become unreachable blocks in the function.

Whenever that condition holds, I restart the iteration over the loops from the beginning, skipping the modified loop. So I need to recalculate the LoopInfo.

How do I build LoopInfo again for the updated CFG structure?

If I just call getAnalysis<LoopInfo>() again, the modified loop is still present in it.

--
Regards,
Sridhar
