Age | Commit message (Collapse) | Author |
In some situations, LLVM auto-vectorization generates
llvm.cttz.v2i64(), which does not correspond to any ARM instruction
and results in an unmatched selection DAG. This backported patch
works around the problem by refining the cost model to avoid this
intrinsic in this case.
Patch originally by James Molloy - james molloy arm com
[ARM] Teach the cost model that cross-class copies are costly.
Cross-class copies being expensive is actually a trait of the
microarchitecture, but as I haven't yet seen an example of a
microarchitecture where they're cheap it seems best to just enable this
by default, covering the non-mcpu build case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217674
91177308-0d34-0410-b5e6-96231b3b80d8
|
|
To match gcc behavior and avoid an ld warning like the following in
projects using libc++ from prebuilts/ndk:
.../ld: warning: creating a DT_TEXTREL in a shared object.
(*1)
switch_to_exception_section (const char * ARG_UNUSED (fnname))
{
  ....
  if (EH_TABLES_CAN_BE_READ_ONLY)
    {
      int tt_format =
        ASM_PREFERRED_EH_DATA_FORMAT (/*code=*/0, /*global=*/1); /* DW_EH_PE_indirect, which is 0x80 */
      flags = ((! flag_pic
                || ((tt_format & 0x70) != DW_EH_PE_absptr/*0x00*/
                    && (tt_format & 0x70) != DW_EH_PE_aligned/*0x50*/))
               ? 0 : SECTION_WRITE);
    }
  else
    flags = SECTION_WRITE;
  ...
}
Note that PIC is the default for the Android toolchain. For MIPS,
ASM_PREFERRED_EH_DATA_FORMAT returns 0x80, so (tt_format & 0x70) is 0x00
and the "!= DW_EH_PE_absptr" condition fails; gcc therefore marks the
section writable.
The exact reason why .gcc_except_table contains relocations isn't well
understood, though. When the section isn't writable, those relocations
require load-time fixups in a read-only section, which is a security
concern and triggers the ld warning above; -Wl,--fatal-warnings then
turns that warning into an error.
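The flag selection quoted above can be replayed as a small sketch. This is only an illustration of the condition, not GCC code: the function name is hypothetical, and SECTION_WRITE is a placeholder bit rather than GCC's real value. The DW_EH_PE_* constants are the ones given in the excerpt, plus the 0x10 pc-relative encoding.

```python
# Pointer-encoding constants as quoted in the excerpt above.
DW_EH_PE_absptr = 0x00
DW_EH_PE_aligned = 0x50
DW_EH_PE_indirect = 0x80

# Placeholder bit for illustration only; not GCC's real SECTION_WRITE value.
SECTION_WRITE = 0x1


def exception_section_flags(flag_pic, tt_format, tables_can_be_read_only=True):
    """Mirror the flag selection in the quoted switch_to_exception_section."""
    if not tables_can_be_read_only:
        return SECTION_WRITE
    application = tt_format & 0x70  # how the encoded value is applied
    read_only_ok = (not flag_pic) or (
        application != DW_EH_PE_absptr and application != DW_EH_PE_aligned
    )
    return 0 if read_only_ok else SECTION_WRITE


# MIPS with PIC: 0x80 & 0x70 == 0x00 (absptr), so the table is made writable.
mips_pic = exception_section_flags(flag_pic=True, tt_format=DW_EH_PE_indirect)
# An indirect pc-relative encoding (0x10 in the 0x70 field) can stay read-only.
pcrel_pic = exception_section_flags(flag_pic=True, tt_format=0x80 | 0x10)
# Without PIC the first clause is true, so the table stays read-only.
mips_nopic = exception_section_flags(flag_pic=False, tt_format=DW_EH_PE_indirect)
```

With PIC and tt_format 0x80 the function returns the writable flag, which is the gcc behavior this change matches.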
Change-Id: I5e05ee052af48f06e8c4e3a01c317f4010a2ddaf
|
|
Remove dynamic relocations of __gxx_personality_v0 from the .eh_frame.
The MIPS64 follow-up of the MIPS32 fix (rL209907).
Patch by Vladimir Stefanovic.
Differential Revision: http://reviews.llvm.org/D6141
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221408 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Change-Id: Ib572077475a44e387c4b1b5d0ecd688be4e94a4e
|
|
.. to python26 python2 python as some distros have python == python3
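A minimal sketch of that search order, assuming the first name found on PATH is used. The helper name find_python2 and the injectable lookup parameter are hypothetical and exist only to make the sketch testable; the actual build scripts are not shown here.

```python
import shutil

# Search order from the commit message: versioned names first, because a
# bare "python" may be Python 3 on some distros.
CANDIDATES = ("python26", "python2", "python")


def find_python2(lookup=shutil.which):
    """Return the path of the first candidate that lookup() can find."""
    for name in CANDIDATES:
        path = lookup(name)
        if path:
            return path
    return None
```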
Change-Id: I2e1fbb6aa23614abc6df1581097400a31b7badac
|
|
See previous CLs for details about the Cortex-A53 erratum (835769).
The flag aarch64-fix-cortex-a53-835769 is set to 1 or 0 by clang,
depending on the presence of -mfix-cortex-a53-835769 or
-mno-fix-cortex-a53-835769. If neither is given,
this CL changes the default for aarch64-fix-cortex-a53-835769 to "true".
It makes sense to enable this workaround by default in Android,
since apps built with NDK toolchains can potentially run on affected A53 cores.
Android platform clang can specify -mno-fix-cortex-a53-835769 to disable this
workaround for A53 cores known not to be affected.
This change also matches the behavior of GCC r216079 when it is configured with
--enable-fix-cortex-a53-835769, which is the case for NDK toolchains.
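The resulting defaulting rule can be sketched as follows. The function name is hypothetical, and "the last flag on the command line wins" is an assumption made for the sketch, not something this CL states.

```python
def fix_835769_enabled(args, default=True):
    """Effective workaround setting for a list of driver arguments.

    Assumption: if both flags appear, the last one wins.
    """
    enabled = default
    for arg in args:
        if arg == "-mfix-cortex-a53-835769":
            enabled = True
        elif arg == "-mno-fix-cortex-a53-835769":
            enabled = False
    return enabled
```

With no flag given the workaround is on, which is the new default described above; passing default=False models the previous behavior.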
Change-Id: I56d6c273f79c25d18e2cdd1b61015f35082bbb23
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219684 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result. The details are quite complex and hard to
determine statically, since branches in the code may exist in some
circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.
The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions and the
simplest way to do this is to insert a NOP.
This patch implements such work-around in the backend, enabled via the option
-aarch64-fix-cortex-a53-835769.
The work-around code generation is not enabled by default.
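The NOP insertion can be illustrated on a flat list of assembly strings. This is a simplified sketch, not the backend pass: the opcode sets are an illustrative subset, instructions are matched by text rather than as machine instructions, and branches and basic-block boundaries are ignored.

```python
# Illustrative subsets only; the real pass classifies machine instructions.
MEMORY_OPS = {"ldr", "str", "ldp", "stp", "prfm"}
MADD_OPS = {"madd", "msub", "smaddl", "umaddl", "smsubl", "umsubl"}


def is_memory(insn):
    return insn.split()[0] in MEMORY_OPS


def is_madd64(insn):
    parts = insn.split()
    # Treat a multiply-accumulate as 64-bit when it writes an x register.
    return parts[0] in MADD_OPS and parts[1].startswith("x")


def insert_nops(insns):
    """Separate each memory op from a directly following 64-bit madd."""
    out = []
    for insn in insns:
        if out and is_memory(out[-1]) and is_madd64(insn):
            out.append("nop")
        out.append(insn)
    return out


code = [
    "ldr x1, [x2]",
    "madd x0, x3, x4, x5",   # 64-bit, right after a load: gets a nop
    "madd w0, w3, w4, w5",   # 32-bit: left alone
    "str x0, [x2]",
    "add x0, x0, #1",
    "msub x6, x7, x8, x9",   # not directly after a memory op: left alone
]
patched = insert_nops(code)
```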
Change-Id: I3a748b7f4cf38d43e0702057a5bd41eab5c47c4d
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219603 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This assertion is valid for single-bitcode codegen.
However, if we use llvm-link to link many bitcode files and then run llc,
we may hit this assertion failure even though everything should be fine.
|
|
So $NDK/build/tools/build-llvm.sh can add -DDISABLE_FUTIMENS
to remove the dependency of LLVM binaries on futimens@GLIBC_2.6,
which doesn't exist in the libc.so of some old Linux systems.
|
|
Change-Id: Ie055ec1890350cd65cd496fe6416031a1248efb0
|
|
Add an option --enable-shrink-binary-size to reduce the size of
cross-compiled LLVM binaries.
This patch passes -fdata-sections and -ffunction-sections to $(CXX)
and --gc-sections to $(LD) when --enable-shrink-binary-size is
enabled. This option is disabled by default; otherwise, the LLVM tools
might not be able to load plugins (e.g. LLVMPolly.so).
This is ported from the release_33 and release_34 branches and squashes
the following patches:
[1bb7be3] (release_33) WenHan Gu <Wenhan.gu@mediatek.com>
Shrink binary sizes when cross-compiling.
[6230037] (release_33) Ray Donnelly <mingw.android@gmail.com>
Fixes for "Shrink binary sizes when cross-compiling."
[43153c9] (release_33) Ray Donnelly <mingw.android@gmail.com>
Fix config comment about -Wl,--gc-sections.
[d44086b] (release_34) Lai Wei-Chih <Robert.Lai@mediatek.com>
Add an option --enable-shrink-binary-size to configure.
|
|
1. Add two new flags, CFLAGS_FOR_BUILD and LDFLAGS_FOR_BUILD,
and use them in the $BUILD_CC test.
2. Unset $LDFLAGS in cross-compile-build-tools.
3. $ARCH can be overridden by an environment variable.
Cherry-picked from release_34 branch.
Regenerated with ./autoconf/AutoRegen.sh.
|
|
Author: Lai Wei-Chih <Robert.Lai@mediatek.com>
Author: WenHan Gu <Wenhan.gu@mediatek.com>
Author: Andrew Hsieh <andrewhsieh@google.com>
|
|
Author: Lai Wei-Chih <Robert.Lai@mediatek.com>;
Author: WenHan Gu <Wenhan.gu@mediatek.com>;
Author: Logan Chien <loganchien@google.com>;
|
|
* kAndroidBitcodeType
Generate this bitcode for shared object or executable.
* kAndroidLDFlags
Store linker flags for this bitcode.
|
|
source:
https://android.googlesource.com/platform/frameworks/compile/libbcc/
commit: 25b7205e16e422469da74f88e74ad79e7c284ac7
libbcc/bcinfo
libbcc/include/bcinfo
|
|
Change-Id: I1b09d34632e59997402b034e7e652998ef8a8faa
|
|
Add an option to store the stack protector cookie
in the global variable.
|
|
NDK armeabi uses -msoft-float by default, so it cannot truly use
fenv.h, which contains vmrs/vmsr inline asm.
Change-Id: If49045f2bfed0f33f7dc9156d5eea3bda89ad90a
|
|
The global variable optimization tries to evaluate
constant global variables and remove their constructors.
However, some legacy code performs const_cast<>()
and assigns to the "constant" to avoid the static initialization
order fiasco. To work around such old code, this commit
adds a new option to disable the promotion of global
constants with constructors.
|
|
------------------------------------------------------------------------
r217490 | delcypher | 2014-09-10 12:09:23 +0100 (Wed, 10 Sep 2014) | 4 lines
Don't attempt to run llvm-config in cmake/modules/Makefile when doing
``make clean`` because it won't be available.
This is an attempt to unbreak buildbots broken by r217484.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@217640 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r217484 | delcypher | 2014-09-10 11:18:59 +0100 (Wed, 10 Sep 2014) | 13 lines
Attempt to fix PR20884
This fixes the generation of broken LLVMExports.cmake file by
the Autoconf/Makefile build system when --enable-shared is passed to
configure.
When --enable-shared is passed, Makefile.rules does not set the
LLVMConfigLibs variable which cmake/modules/Makefile previously relied
on. Now it runs the llvm-config command itself to get the library names.
This still isn't perfect because the generated LLVM targets refer to the
static libraries and not the shared library, but that is a much larger
problem to fix.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@217638 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@217304 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216951 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216950 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216762 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216760 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r216064 | kongyi | 2014-08-20 03:40:20 -0700 (Wed, 20 Aug 2014) | 9 lines
ARM: Fix codegen for rbit intrinsic
LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic.
According to ARM ARM, rbit only takes register as argument, not immediate.
The correct instruction should be rbit <Rd>, <Rm>.
The bug was originally introduced in r211057.
Differential Revision: http://reviews.llvm.org/D4980
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216089 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r215711 | wschmidt | 2014-08-15 06:51:57 -0700 (Fri, 15 Aug 2014) | 8 lines
[PPC64] Add test case for r215685.
I had deferred adding this test case until I could get it down to a
reasonable size. That's done now.
Thanks,
Bill
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215879 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r215685 | wschmidt | 2014-08-14 18:25:26 -0700 (Thu, 14 Aug 2014) | 69 lines
[PPC64] Add missing dependency on X2 to LDinto_toc.
The LDinto_toc pattern has been part of 64-bit PowerPC for a long
time, and represents loading from a memory location into the TOC
register (X2). However, this pattern doesn't explicitly record that
it modifies that register. This patch adds the missing dependency.
It was very surprising to me that this has never shown up as a problem
in the past, and that we only saw this problem recently in a single
scenario when building a self-hosted clang. It turns out that in most
cases we have another dependency present that keeps the LDinto_toc
instruction tied in place. LDinto_toc is used for TOC restore
following a call site, so this is a typical sequence:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1
ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there
is a natural anti-dependency between the two that keeps it in place.
Therefore we don't usually see a problem. However, in one particular
case, one call is followed immediately by another call, and the second
call requires a parameter that is a TOC-relative address. This is the
code sequence:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1
ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
%vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
%vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
Note that the back-to-back stack adjustments are the same size! The
back end is smart enough to recognize this and optimize them away:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1
%vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
%vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
Now there is nothing to prevent the ADDIStocHA instruction from moving
ahead of the LDinto_toc instruction, and because of the longest-path
heuristic, this is what happens.
With the accompanying patch, %X2 is represented as an implicit def:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1, %X2<imp-def,dead>
ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
%vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
%vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
So now when the two stack adjustments are removed, ADDIStocHA is
prevented from being moved above LDinto_toc.
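The effect of the implicit def can be shown with a toy register-dependency check. This is a sketch under simplifying assumptions: instructions are plain dicts of defined and used register names, sub-register aliasing is not modeled, and the helper name must_stay_after is hypothetical.

```python
def must_stay_after(later, earlier):
    """True if a register dependency pins `later` behind `earlier`."""
    raw = earlier["defs"] & later["uses"]   # read after write
    war = earlier["uses"] & later["defs"]   # write after read
    waw = earlier["defs"] & later["defs"]   # write after write
    return bool(raw or war or waw)


# LDinto_toc as modeled before the patch: X2 is not recorded as defined.
ld_before = {"name": "LDinto_toc", "defs": set(), "uses": {"X1"}}
# LDinto_toc with the patch: X2 is an implicit def.
ld_after = {"name": "LDinto_toc", "defs": {"X2"}, "uses": {"X1"}}
# The TOC-relative address computation that reads X2.
addis = {"name": "ADDIStocHA", "defs": {"vreg39"}, "uses": {"X2"}}

free_to_move = not must_stay_after(addis, ld_before)
pinned = must_stay_after(addis, ld_after)
```

Without the def on X2 nothing orders ADDIStocHA after LDinto_toc; with it, the read-after-write dependency on X2 does.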
I have not yet created a test case for this, because the original
failure occurs on a relatively large function that needs reduction.
However, this is a fairly serious bug, despite its infrequency, and I
wanted to get this patch onto the list as soon as possible so that it
can be considered for a 3.5 backport. I'll work on whittling down a
test case.
Have we missed the boat for 3.5 at this point?
Thanks,
Bill
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215878 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215874 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r214679 | chandlerc | 2014-08-03 17:54:28 -0700 (Sun, 03 Aug 2014) | 10 lines
[x86] Fix the test case added in r214670 and tweaked in r214674 further.
Fundamentally, there isn't a really portable way to test the constant
pool contents. Instead, pin this test to the bare-metal triple. This
also makes it a 64-bit triple which allows us to only match a single
constant pool rather than two. It can also just hard code the '.' prefix
as the format should be stable now that it has a fixed triple. Finally,
I've switched it to use CHECK-NEXT to be more precise in the instruction
sequence expected and to use variables rather than hard coding decisions
by the register allocator.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215430 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r214674 | spatel | 2014-08-03 16:20:16 -0700 (Sun, 03 Aug 2014) | 2 lines
Account for possible leading '.' in label string.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215429 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r214670 | spatel | 2014-08-03 15:48:23 -0700 (Sun, 03 Aug 2014) | 8 lines
fix for PR20354 - Miscompile of fabs due to vectorization
This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation.
This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon.
There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215428 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215426 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215090 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r214481 | hfinkel | 2014-07-31 22:20:41 -0700 (Thu, 31 Jul 2014) | 38 lines
[PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads
Altivec vector loads on PowerPC have an interesting property: They always load
from an aligned address (by rounding down the address actually provided if
necessary). In order to generate an actual unaligned load, you can generate two
load instructions, one with the original address, one offset by one vector
length, and use a special permutation to extract the bytes desired.
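The two-load scheme can be emulated on a byte string. This is a sketch of the idea only: lvx here is a Python stand-in for the instruction's address rounding, and the "permutation" is modeled as a slice of the two concatenated vectors.

```python
VEC = 16  # Altivec vector length in bytes


def lvx(memory, address):
    """Emulate an Altivec load: round the address down to 16 bytes."""
    base = address & ~(VEC - 1)
    return memory[base:base + VEC]


def unaligned_load(memory, address):
    """Build an unaligned 16-byte load from two aligned loads."""
    low = lvx(memory, address)          # contains the first wanted bytes
    high = lvx(memory, address + VEC)   # contains the remaining bytes
    shift = address % VEC
    # Stand-in for the permute: pick 16 bytes starting at the misalignment.
    return (low + high)[shift:shift + VEC]


memory = bytes(range(64))
result = unaligned_load(memory, 5)
```

Here result holds bytes 5 through 20 of memory, even though both underlying loads were aligned.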
When this was originally implemented, I generated these two loads using regular
ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with
this:
The alignment of a load does not contribute to its identity, and SDNodes
are uniqued. So, imagine that we have some unaligned load, L1, that is not
aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned).
Further imagine that there had already existed a load (L1+16)(unaligned) with
the same chain operand as the load L1. When (L1+16)(aligned) is created as part
of the lowering of L1, this load *is* also the (L1+16)(unaligned) node, just
now marked as aligned (because the new alignment overwrites the old). But the
original users of (L1+16)(unaligned) now get the data intended for the
permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists
to get its own permutation-based expansion. This was PR19991.
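The uniquing hazard can be reproduced in a toy model. The classes below are hypothetical and are not the SelectionDAG API; the only property taken from the description is that alignment is not part of a node's identity and that a later request overwrites it.

```python
class LoadNode:
    def __init__(self, chain, address, alignment):
        self.chain = chain
        self.address = address
        self.alignment = alignment


class ToyDag:
    """Uniques load nodes by (chain, address); alignment is not in the key."""

    def __init__(self):
        self.nodes = {}

    def get_load(self, chain, address, alignment):
        key = (chain, address)
        node = self.nodes.get(key)
        if node is None:
            node = LoadNode(chain, address, alignment)
            self.nodes[key] = node
        else:
            node.alignment = alignment  # the new alignment overwrites the old
        return node


dag = ToyDag()
# A pre-existing unaligned load of L1+16 on the same chain as L1.
existing = dag.get_load("chain0", "L1+16", alignment=1)
# Lowering L1 asks for an "aligned" load of the same address.
created = dag.get_load("chain0", "L1+16", alignment=16)
```

Both names refer to one node, now marked aligned, so the original unaligned load has effectively disappeared.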
A second potential problem has to do with the MMOs on these loads, which can be
used by AA during instruction scheduling to break chain-based dependencies. If
the new "aligned" loads get the MMO from the original unaligned load, this does
not represent the fact that it will load data from below the original address.
Normally, this would not matter, but this load might be combined with another
load pair for a previous vector, and then the dependency on the otherwise-
ignored lower bytes can matter.
To fix both problems, the necessary loads are no longer generated using regular
ISD::LOAD instructions; ppc_altivec_lvx intrinsics are used instead. These are
provided with MMOs with a conservative address range.
Unfortunately, I no longer have a failing test case (since PR19991 was
reported, other changes in CodeGen have forced this bug back into hiding
again). Nevertheless, this should fix the underlying problem.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215058 91177308-0d34-0410-b5e6-96231b3b80d8
|