jsrduck | 29 May 01:52 2015
Picon

Passing ConstantDataArray to GetElementPtrInst

Hi, I'm having a hard time finding an answer to what I assume is a very basic
idea. I'm trying to produce this llvm code using the LLVM api: 

  %myString = alloca [13 x i8], i32 13 
  store [13 x i8] c"Hello world.\00", [13 x i8]* %myString 
  %tmp1 = getelementptr [13 x i8]* %myString, i32 0, i32 0 
  %tmp2 = call i32 (i8*, ...)*  <at> printf( i8* %tmp1 ) nounwind 

A simple Hello World example. But I can't seem to figure out what I'm
supposed to pass to the GetElementPtrInst. For example, this snippet: 

   std::vector<llvm::Value*> args; 
   auto strValue =
llvm::ConstantDataArray::getString(llvm::getGlobalContext(),
llvm::StringRef("Hello world.")); 
   std::vector<llvm::Value*> index_vector; 
   index_vector.push_back(Builder.getInt32(0)); 
   index_vector.push_back(Builder.getInt32(0)); 
   auto valueAsPtr = Builder.CreateGEP(strValue, index_vector, "tmp1"); 
   args.push_back(valueAsPtr); 
   Builder.CreateCall(TheModule->getFunction("printf"), args, "callprintf"); 

Note that this is summarized, I already have the IRBuilder, etc initialized.
I've also tried this with a LoadInstr that loads the ConstantDataArray (I
just inserted strValue into the example code to simplify things, I get the
same error either way). 

If I run this code, LLVM asserts in the CreateGEP call with the error
message "Non-pointer type for constant GetElementPtr expression" since the
type I passed to CreateGEP is an Array type. Callstack: 
(Continue reading)

Matthias Braun | 29 May 02:05 2015
Picon

Pristine Register Semantics

This is just a short heads-up for out-of-tree targets relying on the concept of pristine registers. I slightly changed their semantics in r238524 (see http://reviews.llvm.org/D10101 for details). If you happened to not use the RegisterScavenger to get liveness past the PrologueEpilogue inserted and relied on MachineFrameInfo::getPristineRegs() without at same time relying on the machine basic blocks live-in sets then you may be affected by it.

- Matthias
_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Teresa Johnson | 28 May 23:10 2015
Picon

Updated RFC: ThinLTO Implementation Plan

As promised, here is an new version of the ThinLTO RFC, updated based
on some of the comments, questions and feedback from the first RFC.
Hopefully we have addressed many of these, and as noted below, will
fork some of the detailed discussion on particular aspects into
separate design doc threads. Please send any additional feedback and
questions on the overall design.
Thanks!
Teresa

Updated RFC to discuss plans for implementing ThinLTO upstream,
reflecting feedback and discussion from initial RFC
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-May/085557.html). As
discussed in the earlier thread and below, more detailed design
documents for several pieces (native object format, linkage type
changes and static promotions, etc) are in progress and will be sent
separately. This RFC covers the overall design and the breakdown of
work at a higher level.

Background on ThinLTO can be found in slides from EuroLLVM 2015:
  https://drive.google.com/open?id=0B036uwnWM6RWWER1ZEl5SUNENjQ&authuser=0
As described in the talk, we have a prototype implementation, and
would like to start staging patches upstream. This RFC describes a
breakdown of the major pieces. We would like to commit upstream
gradually in several stages, with all functionality off by default.
The core ThinLTO importing support and tuning will require frequent
change and iteration during testing and tuning, and for that part we
would like to commit rapidly (off by default). See the proposed staged
implementation described in the Implementation Plan section.

ThinLTO Overview
==================

See the talk slides linked above for more details. The following is a
high-level overview of the motivation.

Cross Module Optimization (CMO) is an effective means for improving
runtime performance, by extending the scope of optimizations across
source module boundaries. Without CMO, the compiler is limited to
optimizing within the scope of single source modules. Two solutions
for enabling CMO are Link-Time Optimization (LTO), which is currently
supported in LLVM and GCC, and Lightweight-Interprocedural
Optimization (LIPO). However, each of these solutions has limitations
that prevent it from being enabled by default. ThinLTO is a new
approach that attempts to address these limitations, with a goal of
being enabled more broadly. ThinLTO is designed with many of the same
principals as LIPO, and therefore its advantages, without any of its
inherent weakness. Unlike in LIPO where the module group decision is
made at profile training runtime, ThinLTO makes the decision at
compile time, but in a lazy mode that facilitates large scale
parallelism. LTO implementations all contain a serial IPA/IPO step
that is both memory intensive and slow, limiting usability on both
smaller workstations and huge applications. In contrast, the ThinLTO
serial linker plugin phase is designed to be razor thin and blazingly
fast. By default this step only does minimal preparation work to
enable the parallel lazy importing performed later. ThinLTO aims to be
scalable like a regular O2 build, enabling CMO on machines without
large memory configurations, while also integrating well with
distributed build systems. Results from early prototyping on SPEC
cpu2006 C++ benchmarks are in line with expectations that ThinLTO can
scale like O2 while enabling much of the CMO performed during a full
LTO build.

A ThinLTO build is divided into 3 phases, which are referred to in the
following implementation plan:
1. phase-1: IR and Function Summary Generation (-c compile)
2. phase-2: Thin Linker Plugin Layer (thin archive linker step)
3. phase-3: Parallel Backend with Demand-Driven Importing

Implementation Plan
====================

This section gives a high-level breakdown of the ThinLTO support that
will be added, in roughly the order that the patches would be staged.
The patches are divided into three stages. The first stage contains a
minimal amount of preparation work that is not ThinLTO-specific. The
second stage contains most of the infrastructure for ThinLTO, which
will be off by default. The third stage includes
enhancements/improvements/tunings that can be performed after the main
ThinLTO infrastructure is in.

The second and third implementation stages will initially be very
volatile, requiring a lot of iterations and tuning with large apps to
get stabilized. Therefore it will be important to do fast commits for
these implementation stages.

1. Stage 1: Preparation
------------------------------------

The first planned sets of patches are enablers for ThinLTO work:

a. LTO directory structure

Restructure the LTO directory to remove circular dependence when
ThinLTO pass added. Because ThinLTO is being implemented as a SCC pass
within Transforms/IPO, and leverages the LTOModule class for linking
in functions from modules, IPO then requires the LTO library. This
creates a circular dependence between LTO and IPO. To break that, we
need to split the lib/LTO directory/library into lib/LTO/CodeGen and
lib/LTO/Module, containing LTOCodeGenerator and LTOModule,
respectively. Only LTOCodeGenerator has a dependence on IPO, removing
the circular dependence.

Note that libLTO and llvm-lto use LTOModule/LTOCodeGenerator, whereas
the gold plugin uses lib/Object/IRObject and lib/Linker directly. The
use of LTOModule in the ThinLTO pass is a convenience, but could be
avoided by using the IRObject/Linker methods directly if that is
preferred.

b. Native object wrapper generation support

Implement native-object wrapped bitcode writer. The main goal is to
more easily interact with existing native tools such as $AR, $NM, “$LD
-r”, $OBJCOPY, and $RANLIB, without requiring the build system to find
and pass the plugin as an option. We plan to emit the phase-1 bitcode
wrapped in native object format via the .llvmbc section, along with a
symbol table. We will implement ELF first, but subsequently extend
support to COFF and Mach-O. Additionally, we also want to avoid doing
partial LTO/ThinLTO across files linked with “$LD -r” (i.e. the
resulting object file should still contain native object-wrapped
bitcode to enable ThinLTO at the full link step). I will send a
separate design document for these changes, including the format of
the symtab and function index/summary section, but the following is a
high-level motivation and overview.

Note that support for ThinLTO using bitcode can be added as a
follow-on under an option, so that bitcode-aware tools do not need to
use the wrapper. Under the bitcode-only option, the symbol table will
be replaced by the bitcode form of the function index and summary
section, which can be encoded as a new bitcode block type. Changes
should be made to the gold plugin to avoid partial link of bitcode
files under “$LD -r” (emitting bitcode rather than compiling all the
way down to native code, which is how ld64 behaves on Darwin as per
dexonsmith).

Advantages of using native object format:
* Out of the box interoperability with existing native build tools
($AR, $NM, “$LD -r”, $OBJCOPY, and $RANLIB) which may not currently
know how to locate/pass the appropriate plugin.
* There is precedence in using this format: other compilers also wrap
intermediate LTO files (probably related to the above advantage)[1].
* Tools that modify symbol linkage and visibility (e.g. $OBJCOPY and
“$LD -r”) can mark the change in the symbol table without needing to
parse/change/encode bitcode. The change can be propagated to bitcode
by the ThinLTO backend.
* Some tools only need to read/write the symtab and can avoid
parsing/encoding bitcode (e.g. $NM, $OBJCOPY).
* The second phase of ThinLTO does not need to parse the bitcode when
creating the combined function index.

Disadvantages of using native object format:
* Unnecessary when using plugins with plugin-aware native tools, or
LLVM’s custom tools.
* Slightly increase disk storage and I/O from symtab. However, with
our design the symtab is leveraged to hold function indexing info
required for ThinLTO. The I/O for some build tools and build steps can
actually be reduced as there is no need to read the bitcode, as
described above.

Support was added to LLVM for reading native object-wrapped bitcode
(http://reviews.llvm.org/rL218078), but there does not yet exist
support in LLVM/Clang for emitting bitcode wrapped in native object
format. I plan to add support for optionally generating bitcode in an
native object file containing a single .llvmbc section holding the
bitcode. Specifically, the patch would add new options
“emit-llvm-native-object” (object file) and corresponding
“emit-llvm-native-assembly” (textual assembly code equivalent).
Eventually these would be automatically triggered under “-fthinlto -c”
and “-fthinlto -S”, respectively.

Additionally, a symbol table will be generated in the native object
file, holding the function symbols within the bitcode. This
facilitates handling archives of the native object-wrapped bitcode
created with $AR, since the archive will have a symbol table as well.
The archive symbol table enables gold to extract and pass to the
plugin the constituent native object-wrapped bitcode files. To support
the concatenated llvmbc section generated by “$LD -r”, some handling
needs to be added to gold and to the backend driver to process each
original module’s bitcode.

The function index/summary will later be added as a special native
object section alongside the .llvmbc sections. The offset and size of
the corresponding function summary can be placed in the associated
symtab entry. As noted above, a separate design document will be sent
for the native object format changes.

2. Stage 2: ThinLTO Infrastructure
------------------------------------------------------

The next set of patches adds the base implementation of the ThinLTO
infrastructure, specifically those required to make ThinLTO functional
and generate correct but not necessarily high-performing binaries.

a. Clang/LLVM/gold linker options

An early set of clang/llvm patches is needed to provide options to
enable ThinLTO (off by default), so that the rest of the
implementation can be disabled by default as it is added.
Specifically, clang options -fthinlto (used instead of -flto) will
cause clang to invoke the phase-1 emission of LLVM bitcode and
function summary/index on a compile step, and pass the appropriate
option to the gold plugin on a link step. The -thinlto option will be
added to the gold plugin and llvm-lto tool to launch the phase-2 thin
archive step. The -thinlto-be option will also be added to clang to
invoke it as a phase-3 parallel backend instance with a bitcode file
as input.

b. Thin-archive linking support in Gold plugin and llvm-lto

Under the new plugin option (see above), the plugin needs to perform
the phase-2 (thin archive) link which simply emits a combined function
index from the linked modules, without actually performing the normal
link. Corresponding support should be added to the standalone llvm-lto
tool to enable testing/debugging without involving the linker and
plugin.

c. ThinLTO backend support

Support for invoking a phase-3 backend invocation (including
importing) on a module should be added to the clang driver under the
new option. The main change under the option is to instantiate a
Linker object used to manage the process of linking imported functions
into the module, efficient read of the combined function index, and
enable the ThinLTO import pass.

d. Function index/summary support

This includes infrastructure for writing and reading the function
index/summary section. As noted earlier this will be encoded in a
special section within the native object file for the module,
alongside the .llvmbc section containing the bitcode. The thin archive
(combined function index) generated by phase-2 of ThinLTO simply
contains all of the function index/summary sections across the linked
modules, organized for efficient function lookup. As mentioned earlier
when discussing the native object wrapper format, a separate design
document will be sent for this format.

Each function available for importing from the module contains an
entry in the module’s function index/summary section and in the
resulting combined function index. Each function entry contains that
function’s offset within the bitcode file, used to efficiently locate
and quickly import just that function (see below in 2e for more
details on the importing mechanics). The entry also contains summary
information (e.g. basic information determined during parsing such as
the number of instructions in the function), that will be used to help
guide later import decisions. Because the contents of this section
will change frequently during ThinLTO tuning, it should also be marked
with a version id for backwards compatibility or version checking.

e. ThinLTO importing support

Support for the mechanics of importing functions from other modules,
which can go in gradually as a set of patches since it will be off by
default (the ThinLTO pass itself discussed below in 2f).

Note that ThinLTO function importing is iterative, and we may import
from a number of modules in an interleaved fashion. For example,
assume we have hot call chains a()->b1()->c() and a()->b2()->d(),
where functions a(), b1()/b2(), c() and d() are from modules A, B, C
and D, respectively. When performing ThinLTO backend compilation of
module A, we may decide to import in the following order (based on
callsite and function summary info):
1. B::b1()  # exposes call to c()
2. C::c()
3. B::b2()  # exposes call to d()
4. D::d()
For this reason, ThinLTO importing is different than regular LTO
bitcode reading and linking, which reads and links in a module in its
entirety on a single pass through each module (notice in the above
example the imports of the two module B functions have an intervening
import from module C). As a result, for example, the existing support
for lazy metadata parsing that delays it until the first function is
materialized can’t be leveraged (metadata handling is discussed more
below in 2h). Therefore, the ThinLTO importing pass instantiates a new
BitcodeReader and LTOModule object for each function we decide to
import, parsing only what is needed and linking in just that function.
This is fast and efficient as found in the prototype results shown in
the linked EuroLLVM slides.

Separate patches can include:

* BitcodeReader changes to use function index to import/deserialize
single function of interest (small changes, leverages existing lazy
function streamer support). The declarations and other symbol table
info in the bitcode must be reloaded, but the bitcode parsing can stop
once the first function body is hit. We simply set up an entry in the
lazy streamer’s DeferredFunctionInfo function index map from the
bitcode index that was saved in the ThinLTO function summary (and
therefore don’t need to build up this function index structure through
repeated calls to RememberAndSkipFunctionBody via
FindFunctionInStream).
* Minor LTOModule changes to pass the ThinLTO function to import and
its index into bitcode reader (see 1a for discussion on LTOModule
use).
* Marking of imported functions. Most handling for ThinLTO imported
functions will simply rely on applying the appropriate linkage type.
But it is useful to know which functions were imported, both for
compiler debugging and and verification, and possibly to modify some
optimization heuristics along with the summary information. This can
be in-memory initially, but IR support may be required in order to
support streaming bitcode out and back in again after importing.
* ModuleLinker changes to do ThinLTO-specific symbol linking and
static promotion when necessary. The linkage type of imported
non-local functions and variables changes to
AvailableExternallyLinkage, for example. Statics must be promoted in
certain cases, and accordingly renamed in consistent ways. Read-write
or address-taken static variables must always be promoted. Other
discardable functions, i.e. link-once such as comdats, will be force
imported on reference by another imported function. We are working on
a separate design document describing these changes in more detail
with examples, as a more detailed discussion of these changes is
beyond the scope of this RFC.
* GlobalDCE changes to support removing imported non-local functions
that were not inlined and imported non-local variables, which are
marked AvailableExternallyLinkage (very small changes to existing pass
logic). As discussed in the original RFC threads, currently GlobalDCE
does not remove referenced AvailableExternallyLinkage functions.
Instead, these are suppressed later during code generation. It isn’t
clear that these functions are useful past the first call to
GlobalDCE, which is after inlining, GlobalOpt and IPSCCP (so
presumably after inter procedural constant prop, etc). Patch with
these changes in testing as discussed in this thread:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-May/085807.html.

f. ThinLTO Import Driver SCC pass

Adds Transforms/IPO/ThinLTO.cpp with framework for doing ThinLTO via
an SCC pass, enabled only under the -fthinlto-be option. The pass
includes utilizing the thin archive[2] (combined global function
index/summary), import decision heuristics, invocation of
LTOModule/ModuleLinker routines that perform the import, and any
necessary callgraph updates and verification.

g. Backend Driver

For a single node build, the gold plugin will initially exec the
backend processes directly, with the amount of parallelism controlled
via an option and/or env variable. It is also possible to leverage
existing single node build system task dispatching mechanisms such as
Unix Makefiles, Ninja, etc., where the plugin can simply write a build
file and fork the parallel backend instances directly under an
appropriate option. We will also initially add support for our
distributed build system as described below under 3c.

h. Lazy Debug Metadata Linking

The prototype implementation included lazy importing of module-level
metadata during the ThinLTO pass finalization (i.e. after all function
importing is complete). This actually applies to all module-level
metadata, not just debug, although it is the largest. This can be
added as a separate set of patches, and the detailed design will be
sent with those. Includes changes to BitcodeReader, ValueMapper, and
the ModuleLinker classes. As described in 2e, due to the
iterative/interleaved nature of ThinLTO importing, the bitcode parsing
is structured differently than LTO where a single pass over each
module can be performed to parse and materialize all functions and
metadata. Therefore, the lazy metadata parsing support in
BitcodeReader, which parses all the metadata once the first function
is materialized, are not applicable. We may instantiate a
BitcodeReader multiple times for a module, if multiple functions are
eventually imported, and we need a way to suture up the metadata to
the functions imported by an earlier BitcodeReader instantiation. The
high level summary is that during the initial import we leave the
temporary metadata on the instructions that were imported, but save
the index used by the bitcode reader used to correlate with the
metadata when it is ready (i.e. the MDValuePtrs index), and skip the
metadata parsing. During the ThinLTO pass finalization we parse just
the metadata, and suture it up during metadata value mapping using the
saved index. As mentioned earlier, this will be described in more
detail when the patches are ready.

3. Stage 3: ThinLTO Tuning and Enhancements
-------------------------------------------------------------------------

This refers to the patches that are not required for ThinLTO to work,
but rather to improve compile time, memory, run-time performance and
usability.

a. Import Tuning

Tuning the import strategy will be an iterative process that will
continue to be refined over time. It involves several different types
of changes: adding support for recording additional metrics in the
function summary, such as profile data and optional heavier-weight IPA
analyses, and tuning the import heuristics based on the summary and
callsite context.

b. Combined Function Index Pruning

The combined function index can be pruned of functions that are
unlikely to benefit from being imported. For example, during the
phase-2 thin archive plug step we can safely omit large and (with
profile data) cold functions, which are unlikely to benefit from being
inlined. Additionally, all but one copy of comdat functions can be
suppressed.

c. Distributed Build System Integration

For a distributed build system such as Bazel (http://bazel.io/), the
gold plugin should write the parallel backend invocations into a build
file, including the mapping from the IR file to the real object file
path, and exit. Additional work needs to be done in the distributed
build system itself to distribute and dispatch the parallel backend
jobs to the build cluster.

d. Dependence Tracking and Incremental Compiles

In order to support build systems that stage from local disks or
network storage, the plugin will optionally support computation of
dependent sets of IR files that each module may import from. This can
be computed from profile data, if it exists, or from the symbol table
and heuristics if not. These dependence sets also enable support for
incremental backend compiles.

________________
[1] The following compilers currently wrap intermediate LTO files in
native object format: GCC fat and non-fat objects (with a custom
symtab), Intel icc non-fat (IR-only) objects (with a full native
symtab), HP’s aCC non-fat objects (with full native symtab), IBM xlC
both fat and non-fat objects (with full native symtab).
[2] The “thin archive” here (also referred to as a combined function
index) has some similarities to the AR tool thin archive format, but
is not exactly the same. Both contain the symtab and not the code, but
the ThinLTO combined function index contains the summary sections as
well.

--

-- 
Teresa Johnson | Software Engineer | tejohnson <at> google.com | 408-460-2413

_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Christian Convey | 28 May 20:57 2015
Picon

Building poolalloc with current LLVM development branch?

I'm starting to play around with poolalloc for the first time.  I tried to build it as follows:
1) Clone llvm's git repo, and build it with cmake + ninja.
2) cd .../llvm/projects
3) git clone git <at> github.com:llvm-mirror/poolalloc.git

When I went to re-run ninja, (or even CMake by itself), I get errors like the following (truncated for brevity).  Any suggestions for whether I'm trying to use the build system incorrectly, vs. having an incompatible combo of LLVM and poolalloc?

CMake Error at cmake/modules/LLVMProcessSources.cmake:83 (message):
  Found unknown source file
  /opt/clang-llvm/llvm/projects/poolalloc/runtime/PreRT/strdup.c


  Please update
  /opt/clang-llvm/llvm/projects/poolalloc/runtime/PreRT/CMakeLists.txt


Call Stack (most recent call first):
  cmake/modules/LLVMProcessSources.cmake:54 (llvm_check_source_file_list)
  cmake/modules/AddLLVM.cmake:263 (llvm_process_sources)
  cmake/modules/AddLLVM.cmake:418 (llvm_add_library)
  projects/poolalloc/runtime/PreRT/CMakeLists.txt:2 (add_llvm_library)



Thanks,
Christian



_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Yuanfang Chen | 28 May 21:27 2015
Picon

PGO for macro expansion code

#define GET_BIT(lll)    \
    // blah blah

#define G(label1,label2)   \
{                                      \
  // decent amount code  \
  ...
  while (1) {                       \
  GET_BIT(label2);             \
  };                                    \
}

void f() {
   if (..)
     G('c', 'd');

   while ( .. )
       G('a','b');
}

After perf sampling, a lot of samples that should have landed in G and
GET_BIT is attributed to the two lines that have G expansion.
Discriminator does not work either for this case. However  the last
field of DW_TAG_lexical_block ("Unique ID to identify blocks from a
template function") is unique for all instances of scopes in G and
GET_BIT. Can we use that to get better profile of hot and cold paths
inside macro? If not, any workaround for now? Thank you.

Yuanfang
Robison, Arch | 28 May 20:20 2015
Picon

Naming TBAA nodes in LLVM >=3.6?

In Julia’s use of LLVM, we were naming TBAA nodes with logic like this for LLVM 3.3:

 

static MDNode *tbaa_make_child( const char *name, MDNode *parent, bool isConstant=false )

{

    MDNode *n = mbuilder->createTBAANode(name,parent,isConstant);

    n->setValueName( ValueName::Create(name, name+strlen(name)));

    return n;

}

 

The names would show up in dumps of the IR (e.g. “call F.dump()” from gdb) , which was really handy for debugging.  But with LLVM 3.6’s split of Value and MDNode, the naming facility seems to have disappeared.  Is there a way with LLVM 3.6 or later to name MDNodes in a way that will show up in IR dumps?

 

Arch Robison

_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Jyoti Rajendra Allur | 28 May 19:37 2015

[ARM backend] adding pattern for SMLALBB

Hi James/Tim,
I am trying to add a patterns for SMLALBB

I think these two assembly patterns can be reduced to SMLALBB using tablegen.

1) 
smulbb  r2, r3, r2
adds    r0, r2, r0 (RdLo)
asr     r3, r2, #31
adc     r1, r3, r1 (RdHi)   ==> smlalbb r0, r1, r3, r2 

I have added pattern in def SMLALBB : AMulxyI64< .....                  as below :-
[] modified to ---> [((set GPR:$RdLo, (ARMaddc (set GPR:$Rd, mul GPR:$Rn, GPR:$Rm), GPR:$RdLo)), (set
GPR:$RdHi, (ARMadde (sra GPR:$Rd, (i32 31)), GPR:$RdHi,  CPSR)))]
This results in 'expected identifier in dag init' error.

Obviously something is wrong in this pattern, I have not figured out what that is ?

2)
ldrb r2, [r2]
ldrb r3, [r3]
smlal r0, r1, r3, r2     ==> smlalbb r0, r1, r3, r2 
def : ARMV5MOPat<(smlal GPRnopc:$RdLo, GPRnopc:$RdHi,
                        (sra (shl GPR:$a, (i32 24)), (i32 24)),
                           (sra (shl GPR:$b, (i32 24)), (i32 24))),
                 (SMLALBB GPRnopc:$RdLo, GPRnopc:$RdHi, GPRnopc:$Rn, GPRnopc:Rm)>;

This throws "Variable not defined: 'smlal' 

Any pointers on these would be helpful.

Regards,
Jyoti Allur
Fabian Gruber | 28 May 17:11 2015

MVT (machine value type) for MachineOperand

Hi,

I am writing a small tool to gather some static statistics on machine code using LLVMs MachineInstr, etc., infrastructure.
My current goal is to count the number of bytes of integer and floating point data being processed per iteration of a loop.
Now I've ran into the problem that I can't find any obvious way to get the actual type of a MachineOperand.

After some digging I noticed that the selection DAG has some type information and I had a look at the TableGen records that contain this type info. I started looking into how I could generate some descriptor tables with that information but TableGen is quite a handfull, and I'd like to avoid this if I can.

Did I overlook some easy way to get at type information at the MachineOperand level?

Thanks.

_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Tomas Kalibera | 28 May 16:08 2015
Picon

http://llvm.org/releases/ built?

Hi,

I find the pre-built binaries at http://llvm.org/releases very useful. 
Is there a script available anywhere that does the build, so that I can 
get a binary built in the very same way, but for a platform not 
included? (I would like llvm+clang for FC20, something like 
http://llvm.org/releases/3.6.1/clang+llvm-3.6.1-x86_64-fedora21.tar.xz 
but for FC20)

Thanks!
Tomas
Chris Bieneman | 28 May 17:22 2015
Picon

[RFC] Late May Update: Progress report on CMake build system's ability to replace autoconf

Hi all,

Time for another update on the status of the CMake build system.

Completed:
* Bug 18496 - [cmake] .S assembly files not compiled by cmake in libclang_rt.ARCH
* Bug 22725 - lldb build with cmake fails with "Program error: Invalid parameters entered, -h for help. "
* Update GettingStarted to prefer CMake

Still Outstanding:

* Bug 14109 - CMake build for compiler-rt should use just-built clang
 - No Update: Still some small issues to resolve.

* Bug 19462 - Use the INSTALL(EXPORT ...) to export CMake definitions
 - I commented in the review for this today (http://reviews.llvm.org/D7623). I think the patches I have are
good to land, but Stephen Kelly made some other suggestions in the bug we should consider separately.

* Bug 19875 - libraries and executables need different rpaths
 - No Update: Still outstanding, I don't think this is a blocker.

* Bug 21561 - Update release scripts to use CMake
 - No Update: Still outstanding and blocking removal of autoconf

* Bug 21562 - Add a CMake equivalent for make/platform/clang_darwin.mk in compiler_rt
 - I've looked at this a bit on and off. It is unfortunately a hard problem. I think it would be easier if CMake
had better support for setting clang's sysroot and arch flags. To that end I've filed a bug against CMake (http://www.cmake.org/Bug/view.php?id=15591).

* Bug 21568 - Cannot add rpath
 - No Update: Not a blocker.

Other issues not tracked by bugs:

* FreeBSD seemed to have problems with CMake identifying itself as amd64 causing x86_64 tests to fail
* Migrating buildbots
* We need to make sure libc++ works properly on Darwin
* Put together a “cheat sheet” document for transitioning
 - If you have an autoconf workflow you’d like to see in the cheat sheet please send your commands my way and
I‘ll assemble the cheat sheet.

If there is anything I’m missing please let me know. Thanks,

Thanks,
-Chris
_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Alex | 28 May 16:36 2015
Picon

Generate .debug_info for asm files?

Dear LLVM devs,

I've noticed that LLVM's assembler (and hence clang) doesn't generate debug data when assembling, even when run with "-g".

Users can add their own CFI/line number data with assembler directives. But GDB ignores the line number data if you don't have a DIE in the .debug_info section which identifies the compilation unit. (I'm not an expert on GDB, but this is the conclusion I have drawn from single-stepping through GDB on some test binaries.)

Would you be interested in a patch which makes the assembler generate .debug_info?

Thanks,
Alex Dowad
_______________________________________________
LLVM Developers mailing list
LLVMdev <at> cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Gmane