Age | Commit message | Author
2014-12-19 | Merge "[3.5] writable .gcc_except_table for mips[64]" into release_35 | Andrew Hsieh
2014-12-19 | Merge "Fix MIPS64 exception personality encoding" into release_35 | Andrew Hsieh
2014-12-18 | [ndk][backport] Workaround llvm.cttz.v2i64() problem. | Logan Chien
In some situations, LLVM auto-vectorization generates llvm.cttz.v2i64(), which does not correspond to any ARM instruction and results in an unmatched selection DAG. This backported patch works around the problem by refining the cost model to avoid this intrinsic in this case.
Patch originally by James Molloy - james molloy arm com
[ARM] Teach the cost model that cross-class copies are costly.
Cross-class copies being expensive is actually a trait of the microarchitecture, but as I haven't yet seen an example of a microarchitecture where they're cheap it seems best to just enable this by default, covering the non-mcpu build case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217674 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 | [3.5] writable .gcc_except_table for mips[64] | Andrew Hsieh
To match gcc behavior and avoid an ld warning like the following in projects using the libc++ from prebuilts/ndk:
  .../ld: warning: creating a DT_TEXTREL in a shared object.
(*1)
  switch_to_exception_section (const char * ARG_UNUSED (fnname))
  {
    ....
    if (EH_TABLES_CAN_BE_READ_ONLY)
      {
        int tt_format = ASM_PREFERRED_EH_DATA_FORMAT (/*code=*/0, /*global=*/1);  /* DW_EH_PE_indirect, which is 0x80 */
        flags = ((! flag_pic
                  || ((tt_format & 0x70) != DW_EH_PE_absptr/*0x00*/
                      && (tt_format & 0x70) != DW_EH_PE_aligned/*0x50*/))
                 ? 0 : SECTION_WRITE);
      }
    else
      flags = SECTION_WRITE;
    ...
  }
Note that PIC is the default for the Android toolchain. For MIPS, ASM_PREFERRED_EH_DATA_FORMAT returns 0x80, which fails the "DW_EH_PE_absptr" condition, so gcc makes the section writable.
The exact reason why .gcc_except_table contains relocations isn't well understood, though. When the section isn't writable, those relocations require load-time fixups in a read-only section, which is a security concern, and the resulting ld warning turns into an error thanks to -Wl,--fatal-warnings.
Change-Id: I5e05ee052af48f06e8c4e3a01c317f4010a2ddaf
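The flag selection quoted in this commit message can be traced with a small model. The sketch below is only an illustration in Python, not GCC source: the function name `except_table_flags` is hypothetical, the numeric value used for `SECTION_WRITE` is a placeholder (only zero versus non-zero matters here), and the pc-relative encoding in the contrast case is taken from the standard DWARF EH pointer encodings rather than from the commit.

```python
# Illustrative model of the flag selection quoted above (not GCC source).
DW_EH_PE_absptr = 0x00
DW_EH_PE_aligned = 0x50
DW_EH_PE_indirect = 0x80
SECTION_WRITE = 0x200  # placeholder value; only "zero vs non-zero" matters here


def except_table_flags(tt_format, flag_pic, tables_can_be_read_only=True):
    """Return the section flags for .gcc_except_table in this simplified model."""
    if not tables_can_be_read_only:
        return SECTION_WRITE
    application = tt_format & 0x70  # the "how is the value applied" bits
    read_only_ok = (not flag_pic) or (
        application != DW_EH_PE_absptr and application != DW_EH_PE_aligned
    )
    return 0 if read_only_ok else SECTION_WRITE


# MIPS case from the commit message: format 0x80 with PIC enabled.
mips_flags = except_table_flags(DW_EH_PE_indirect, flag_pic=True)

# Contrast case (assumption): indirect | pcrel | sdata4 = 0x80 | 0x10 | 0x0B.
pcrel_flags = except_table_flags(0x80 | 0x10 | 0x0B, flag_pic=True)
```

In this model the MIPS case ends up with `SECTION_WRITE`, because `0x80 & 0x70` is `0x00`, which equals `DW_EH_PE_absptr`, while the pc-relative contrast case stays read-only.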
2014-12-10 | Fix MIPS64 exception personality encoding | Petar Jovanovic
Remove dynamic relocations of __gxx_personality_v0 from the .eh_frame. The MIPS64 follow-up of the MIPS32 fix (rL209907). Patch by Vladimir Stefanovic. Differential Revision: http://reviews.llvm.org/D6141 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221408 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 | Rename 3.5.0 -> 3.5 | Andrew Hsieh
Change-Id: Ib572077475a44e387c4b1b5d0ecd688be4e94a4e
2014-10-16 | python: AC_PATH_PROG -> AC_PATH_PROGS and fix search order | Ray Donnelly
The search order becomes python26, python2, python, as some distros have python == python3.
Change-Id: I2e1fbb6aa23614abc6df1581097400a31b7badac
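The effect of the new search order can be sketched as a first-match lookup over the candidate names. This is a Python illustration of the idea only, not the autoconf macro itself; `find_python` and its `which` parameter are hypothetical names introduced here so the lookup can be exercised without touching the real PATH.

```python
import shutil


def find_python(candidates=("python26", "python2", "python"), which=shutil.which):
    """Return the path of the first candidate that resolves, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


# Example with a fake lookup table standing in for PATH: "python" exists
# (and might be Python 3), but "python2" is found first and wins.
fake_path = {"python": "/usr/bin/python", "python2": "/usr/bin/python2"}
chosen = find_python(which=fake_path.get)
```

Putting the bare `python` name last means it is only used when no explicitly versioned Python 2 interpreter is available.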
2014-10-16 | Set aarch64-fix-cortex-a53-835769 as default for AArch64 | Andrew Hsieh
See previous CLs for details about the Cortex-A53 erratum (835769). The flag aarch64-fix-cortex-a53-835769 is set to 1 or 0 by clang, depending on the presence of -mfix-cortex-a53-835769 or -mno-fix-cortex-a53-835769. If neither is given, this CL changes the default of aarch64-fix-cortex-a53-835769 to "true". It makes sense to enable this workaround by default in Android, since apps generated by NDK toolchains can potentially run on an affected A53. Android platform clang can specify -mno-fix-cortex-a53-835769 to disable the workaround for an A53 known not to be affected. This change also matches the behavior of GCC r216079 when it is configured with --enable-fix-cortex-a53-835769, which is the case for NDK toolchains.
Change-Id: I56d6c273f79c25d18e2cdd1b61015f35082bbb23
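The defaulting rule described here can be sketched as follows. This is a Python illustration, not clang's driver code; the function name is hypothetical, and the "last flag on the command line wins" behavior is an assumption that the commit message does not spell out.

```python
def fix_835769_enabled(args, default=True):
    """Resolve the workaround setting from driver-style arguments.

    Assumption: when both flags appear, the last one wins.
    """
    enabled = default
    for arg in args:
        if arg == "-mfix-cortex-a53-835769":
            enabled = True
        elif arg == "-mno-fix-cortex-a53-835769":
            enabled = False
    return enabled


ndk_default = fix_835769_enabled(["-O2"])                          # neither flag given
platform_opt_out = fix_835769_enabled(["-mno-fix-cortex-a53-835769"])
```

With this CL the `default` is effectively true for AArch64, so only an explicit `-mno-fix-cortex-a53-835769` turns the workaround off.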
2014-10-16 | Fix crash with empty/pseudo-only blocks in A53 erratum (835769) workaround | Bradley Smith
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219684 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 | Add workaround for Cortex-A53 erratum (835769) | Bradley Smith
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is possible for a 64-bit multiply-accumulate instruction in AArch64 state to generate an incorrect result. The details are quite complex and hard to determine statically, since branches in the code may exist in some circumstances, but all cases end with a memory (load, store, or prefetch) instruction followed immediately by the multiply-accumulate operation. The safest work-around for this issue is to make the compiler avoid emitting multiply-accumulate instructions immediately after memory instructions and the simplest way to do this is to insert a NOP. This patch implements such work-around in the backend, enabled via the option -aarch64-fix-cortex-a53-835769. The work-around code generation is not enabled by default. Change-Id: I3a748b7f4cf38d43e0702057a5bd41eab5c47c4d git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219603 91177308-0d34-0410-b5e6-96231b3b80d8
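The NOP-insertion idea can be sketched on a list of assembly strings. This is a deliberately simplified Python model, not the LLVM backend pass (which works on machine instructions and has to handle block boundaries): the mnemonic sets are an illustrative subset chosen for this example, and "64-bit" is approximated by checking for an X destination register.

```python
# Illustrative subsets; the real pass classifies machine instructions instead.
MEMORY_OPS = {"ldr", "str", "ldp", "stp", "prfm"}
MULTIPLY_ACCUMULATE_OPS = {"madd", "msub", "smaddl", "smsubl", "umaddl", "umsubl"}


def mnemonic(line):
    return line.split()[0].lower()


def is_64bit_multiply_accumulate(line):
    parts = line.replace(",", " ").split()
    if parts[0].lower() not in MULTIPLY_ACCUMULATE_OPS:
        return False
    # Approximation: an X destination register means a 64-bit result.
    return len(parts) > 1 and parts[1].lower().startswith("x")


def insert_nops(instructions):
    """Insert a NOP between a memory op and an immediately following
    64-bit multiply-accumulate."""
    result = []
    previous = None
    for line in instructions:
        if (previous is not None
                and mnemonic(previous) in MEMORY_OPS
                and is_64bit_multiply_accumulate(line)):
            result.append("nop")
        result.append(line)
        previous = line
    return result


patched = insert_nops(["ldr x1, [x2]", "madd x0, x3, x4, x5"])
```

A 32-bit `madd w0, ...` after a load, or a 64-bit `madd` separated from the load by another instruction, is left untouched in this model.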
2014-09-16 | [ndk] Fix inappropriate debug info assertion. | WenHan Gu
This assertion is valid for single-bitcode codegen. However, if we use llvm-link to link many bitcode files and then run llc, the assertion may fail even though everything is fine.
2014-09-16 | [ndk][pndk] Let ndk-link support -static. | WenHan Gu
2014-09-16 | [ndk][pndk] Add missing makefile filter for abcc plugin. | WenHan Gu
2014-09-16 | [ndk][mips] Old binutils does not support nan directive. | Logan Chien
2014-09-16 | [ndk][pndk] Update MemoryBuffer API usage. | Logan Chien
2014-09-16 | [ndk][pndk] Implement ndk-translate on unknown 64-bit abi. | WenHan Gu
2014-09-16 | [ndk][pndk] Fix ndk-link after rebasing to LLVM upstream 6/25. | Logan Chien
2014-09-16 | [ndk][pndk] Unknown arch support for 64bit. Also add PIE support. | WenHan Gu
2014-09-16 | [ndk] Add DISABLE_FUTIMENS to guard the use of futimens | Andrew Hsieh
So $NDK/build/tools/build-llvm.sh can add -DDISABLE_FUTIMENS to remove the dependency of LLVM binaries on futimens@GLIBC_2.6, which doesn't exist in the libc.so of some old Linux systems.
2014-09-16 | Accept but ignore -rpath-link= | Andrew Hsieh
Change-Id: Ie055ec1890350cd65cd496fe6416031a1248efb0
2014-09-16 | [ndk][conf] Add enable-shrink-binary-size to reduce size. | Logan Chien
Add an option --enable-shrink-binary-size to reduce the size of the cross-compiled LLVM binaries. This patch passes -fdata-sections and -ffunction-sections to $(CXX) and --gc-sections to $(LD) when --enable-shrink-binary-size is enabled. This option is disabled by default; otherwise, the LLVM tools might not be able to load plugins (e.g. LLVMPolly.so).
This is ported from the release_33 and release_34 branches, squashing the following patches:
[1bb7be3] (release_33) WenHan Gu <Wenhan.gu@mediatek.com> Shrink binary sizes when cross-compiling.
[6230037] (release_33) Ray Donnelly <mingw.android@gmail.com> Fixes for "Shrink binary sizes when cross-compiling."
[43153c9] (release_33) Ray Donnelly <mingw.android@gmail.com> Fix config comment about -Wl,--gc-sections.
[d44086b] (release_34) Lai Wei-Chih <Robert.Lai@mediatek.com> Add an option --enable-shrink-binary-size to configure.
2014-09-16 | [ndk][conf] Fix Canadian build. | Logan Chien
1. Add two new flags, CFLAGS_FOR_BUILD and LDFLAGS_FOR_BUILD, and use them in the $BUILD_CC test.
2. Unset $LDFLAGS in cross-compile-build-tools.
3. $ARCH can be overridden by an environment variable.
Cherry-picked from the release_34 branch. Regenerated with ./autoconf/AutoRegen.sh.
2014-09-16 | [ndk][conf] Patch for canadian cross build via NDK standalone. | WenHan Gu
2014-09-16 | [ndk][pndk] Script for bitcode to native binaries. | Lai Wei-Chih
Author: Lai Wei-Chih <Robert.Lai@mediatek.com> Author: WenHan Gu <Wenhan.gu@mediatek.com> Author: Andrew Hsieh <andrewhsieh@google.com>
2014-09-16 | [ndk][pndk] Bitcode translation tool for le32-none-ndk triple. | Lai Wei-Chih
2014-09-16 | [ndk][pndk] Bitcode strip tool for le32-none-ndk triple. | Lai Wei-Chih
2014-09-16 | [ndk][pndk] Bitcode link tool for le32-none-ndk triple. | Lai Wei-Chih
Author: Lai Wei-Chih <Robert.Lai@mediatek.com>; Author: WenHan Gu <Wenhan.gu@mediatek.com>; Author: Logan Chien <loganchien@google.com>;
2014-09-16 | [ndk][pndk] Add 2 tags in LLVMWrap. | Lai Wei-Chih
* kAndroidBitcodeType: Generate this bitcode for shared object or executable.
* kAndroidLDFlags: Store linker flags for this bitcode.
2014-09-16 | [ndk][pndk] Import Wrap library for bitcode wrapper. | Lai Wei-Chih
source: https://android.googlesource.com/platform/frameworks/compile/libbcc/
commit: 25b7205e16e422469da74f88e74ad79e7c284ac7
libbcc/bcinfo
libbcc/include/bcinfo
2014-09-16 | [ndk][pndk] Add ndk triple for Android bitcode. | Lai Wei-Chih
Change-Id: I1b09d34632e59997402b034e7e652998ef8a8faa
2014-09-16 | [ndk][x86] Add option to store sp cookie in var. | Logan Chien
Add an option to store the stack protector cookie in the global variable.
2014-09-16 | [ndk][arm] Conditional compile fenv.h for NDK headers. | WenHan Gu
NDK armeabi uses -msoft-float by default, so it cannot truly use fenv.h, which contains vmrs/vmsr inline asm.
Change-Id: If49045f2bfed0f33f7dc9156d5eea3bda89ad90a
2014-09-16 | [ndk] Add -disable-global-ctor-const-promotion option. | Logan Chien
The global variable optimization tries to evaluate constant global variables and remove their constructors. However, some legacy code uses const_cast<>() to assign to the "constant" in order to avoid the static initialization order fiasco. To work around such old code, this commit adds a new option to disable the promotion of global constants that have constructors.
2014-09-11 | Merging r217490: | Dan Liew
------------------------------------------------------------------------ r217490 | delcypher | 2014-09-10 12:09:23 +0100 (Wed, 10 Sep 2014) | 4 lines Don't attempt to run llvm-config in cmake/modules/Makefile when doing ``make clean`` because it won't be available. This is an attempt to unbreak buildbots broken by r217484. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@217640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-11 | Merging r217484: | Dan Liew
------------------------------------------------------------------------
r217484 | delcypher | 2014-09-10 11:18:59 +0100 (Wed, 10 Sep 2014) | 13 lines
Attempt to fix PR20884
This fixes the generation of a broken LLVMExports.cmake file by the Autoconf/Makefile build system when --enable-shared is passed to configure. When --enable-shared is passed, Makefile.rules does not set the LLVMConfigLibs variable which cmake/modules/Makefile previously relied on. Now it runs the llvm-config command itself to get the library names. This still isn't perfect because the generated LLVM targets refer to the static libraries and not the shared library, but that is a much larger problem to fix.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@217638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-06 | Update PowerPC target information. | Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@217304 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 | Update release notes. | Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 | Update lang ref. | Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216950 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 | Include blurb about Likely. By Josh Klontz. | Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216762 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 | Update to include ISPC. By Dmitry Babokin. | Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216760 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 | Merging r216064: | Bill Wendling
------------------------------------------------------------------------ r216064 | kongyi | 2014-08-20 03:40:20 -0700 (Wed, 20 Aug 2014) | 9 lines ARM: Fix codegen for rbit intrinsic LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic. According to ARM ARM, rbit only takes register as argument, not immediate. The correct instruction should be rbit <Rd>, <Rm>. The bug was originally introduced in r211057. Differential Revision: http://reviews.llvm.org/D4980 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@216089 91177308-0d34-0410-b5e6-96231b3b80d8
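For readers unfamiliar with the instruction, the operation that `rbit <Rd>, <Rm>` performs on a 32-bit register value can be modelled as below. This is a Python sketch of the bit reversal only; the function name `rbit32` is hypothetical, and nothing here reflects how LLVM selects or encodes the instruction.

```python
def rbit32(value):
    """Reverse the bit order of a 32-bit value (bit 0 swaps with bit 31, etc.)."""
    value &= 0xFFFFFFFF
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# 352 is the operand from the malformed instruction quoted above; as a register
# value its set bits 5, 6 and 8 move to bits 26, 25 and 23.
reversed_352 = rbit32(352)
```

Because the input has to live in a register, a constant such as 352 must first be materialized into one (or folded at compile time) rather than encoded as an immediate operand.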
2014-08-18 | Merging r215711: | Bill Wendling
------------------------------------------------------------------------ r215711 | wschmidt | 2014-08-15 06:51:57 -0700 (Fri, 15 Aug 2014) | 8 lines [PPC64] Add test case for r215685. I had deferred adding this test case until I could get it down to a reasonable size. That's done now. Thanks, Bill ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215879 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 | Merging r215685: | Bill Wendling
------------------------------------------------------------------------
r215685 | wschmidt | 2014-08-14 18:25:26 -0700 (Thu, 14 Aug 2014) | 69 lines
[PPC64] Add missing dependency on X2 to LDinto_toc.
The LDinto_toc pattern has been part of 64-bit PowerPC for a long time, and represents loading from a memory location into the TOC register (X2). However, this pattern doesn't explicitly record that it modifies that register. This patch adds the missing dependency.
It was very surprising to me that this has never shown up as a problem in the past, and that we only saw this problem recently in a single scenario when building a self-hosted clang. It turns out that in most cases we have another dependency present that keeps the LDinto_toc instruction tied in place. LDinto_toc is used for TOC restore following a call site, so this is a typical sequence:
  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there is a natural anti-dependency between the two that keeps it in place. Therefore we don't usually see a problem. However, in one particular case, one call is followed immediately by another call, and the second call requires a parameter that is a TOC-relative address. This is the code sequence:
  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
  ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
Note that the back-to-back stack adjustments are the same size! The back end is smart enough to recognize this and optimize them away:
  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
Now there is nothing to prevent the ADDIStocHA instruction from moving ahead of the LDinto_toc instruction, and because of the longest-path heuristic, this is what happens.
With the accompanying patch, %X2 is represented as an implicit def:
  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1, %X2<imp-def,dead>
  ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
  ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
So now when the two stack adjustments are removed, ADDIStocHA is prevented from being moved above LDinto_toc.
I have not yet created a test case for this, because the original failure occurs on a relatively large function that needs reduction. However, this is a fairly serious bug, despite its infrequency, and I wanted to get this patch onto the list as soon as possible so that it can be considered for a 3.5 backport. I'll work on whittling down a test case.
Have we missed the boat for 3.5 at this point?
Thanks,
Bill
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215878 91177308-0d34-0410-b5e6-96231b3b80d8
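The scheduling argument in this message comes down to whether two instructions share a register dependency. The following Python sketch models that check under simplifying assumptions: the `Instr` class and `must_stay_ordered` function are hypothetical names, registers are plain strings, and %R1 is folded into "X1" to stand for the aliasing between the 32-bit and 64-bit views of the stack pointer. It is not the LLVM scheduler.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    defs: frozenset = frozenset()   # registers written
    uses: frozenset = frozenset()   # registers read


def must_stay_ordered(first, second):
    """True if reordering the two instructions could change behaviour."""
    read_after_write = first.defs & second.uses
    write_after_read = first.uses & second.defs   # anti-dependency
    write_after_write = first.defs & second.defs
    return bool(read_after_write or write_after_read or write_after_write)


# Before the patch: LDinto_toc reads X1 but does not record that it writes X2.
ld_before = Instr("LDinto_toc", uses=frozenset({"X1"}))
# After the patch: X2 is an implicit def.
ld_after = Instr("LDinto_toc", defs=frozenset({"X2"}), uses=frozenset({"X1"}))

# R1 is modelled as "X1" here (assumption: sub-register aliasing).
adjust_up = Instr("ADJCALLSTACKUP", defs=frozenset({"X1"}), uses=frozenset({"X1"}))
addis = Instr("ADDIStocHA", defs=frozenset({"vreg39"}), uses=frozenset({"X2"}))

pinned_by_stack_adjust = must_stay_ordered(ld_before, adjust_up)   # anti-dependency on X1
free_to_reorder = not must_stay_ordered(ld_before, addis)          # the bug
fixed = must_stay_ordered(ld_after, addis)                         # X2 def then use
```

Once the stack adjustments are optimized away, only the last two checks matter, which is why recording the implicit def of X2 is what keeps ADDIStocHA below LDinto_toc.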
2014-08-18 | Merging r215806: | Bill Wendling
------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215874 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 | Merging r214679: | Bill Wendling
------------------------------------------------------------------------ r214679 | chandlerc | 2014-08-03 17:54:28 -0700 (Sun, 03 Aug 2014) | 10 lines [x86] Fix the test case added in r214670 and tweaked in r214674 further. Fundamentally, there isn't a really portable way to test the constant pool contents. Instead, pin this test to the bare-metal triple. This also makes it a 64-bit triple which allows us to only match a single constant pool rather than two. It can also just hard code the '.' prefix as the format should be stable now that it has a fixed triple. Finally, I've switched it to use CHECK-NEXT to be more precise in the instruction sequence expected and to use variables rather than hard coding decisions by the register allocator. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215430 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 | Merging r214674: | Bill Wendling
------------------------------------------------------------------------ r214674 | spatel | 2014-08-03 16:20:16 -0700 (Sun, 03 Aug 2014) | 2 lines Account for possible leading '.' in label string. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215429 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 | Merging r214670: | Bill Wendling
------------------------------------------------------------------------ r214670 | spatel | 2014-08-03 15:48:23 -0700 (Sun, 03 Aug 2014) | 8 lines fix for PR20354 - Miscompile of fabs due to vectorization This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation. This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon. There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215428 91177308-0d34-0410-b5e6-96231b3b80d8
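The "constant pool load and 'and' op" mentioned here corresponds to clearing the sign bit of each lane with a mask. The sketch below shows that idea in Python for 32-bit floats; it is an illustration of the bit trick, not the DAG combine itself, and the function names are hypothetical.

```python
import struct

SIGN_CLEAR_MASK = 0x7FFFFFFF  # every bit set except the IEEE-754 sign bit


def fabs_via_mask(x):
    """Compute |x| for a 32-bit float by ANDing away the sign bit."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits &= SIGN_CLEAR_MASK
    (result,) = struct.unpack("<f", struct.pack("<I", bits))
    return result


def vector_fabs(lanes):
    """Lane-wise version: the same mask applied to every element."""
    return [fabs_via_mask(lane) for lane in lanes]


abs_lanes = vector_fabs([-1.5, 2.0, -0.0, 0.25])
```

In the vector case the mask is the same for every lane, which is why it can live in a constant pool and be applied with a single 'and'.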
2014-08-12 | Revert r.215058. | Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215426 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 | Added pocl and TCE to the list of projects that work with Clang/LLVM 3.5. | Pekka Jaaskelainen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215090 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 | Merging r214481: | Bill Wendling
------------------------------------------------------------------------ r214481 | hfinkel | 2014-07-31 22:20:41 -0700 (Thu, 31 Jul 2014) | 38 lines [PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads Altivec vector loads on PowerPC have an interesting property: They always load from an aligned address (by rounding down the address actually provided if necessary). In order to generate an actual unaligned load, you can generate two load instructions, one with the original address, one offset by one vector length, and use a special permutation to extract the bytes desired. When this was originally implemented, I generated these two loads using regular ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with this: The alignment of a load does not contribute to its identity, and SDNodes are uniqued. So, imagine that we have some unaligned load, L1, that is not aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned). Further imagine that there had already existed a load (L1+16)(unaligned) with the same chain operand as the load L1. When (L1+16)(aligned) is created as part of the lowering of L1, this load *is* also the (L1+16)(unaligned) node, just now marked as aligned (because the new alignment overwrites the old). But the original users of (L1+16)(unaligned) now get the data intended for the permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists to get its own permutation-based expansion. This was PR19991. A second potential problem has to do with the MMOs on these loads, which can be used by AA during instruction scheduling to break chain-based dependencies. If the new "aligned" loads get the MMO from the original unaligned load, this does not represent the fact that it will load data from below the original address. 
Normally, this would not matter, but this load might be combined with another load pair for a previous vector, and then the dependency on the otherwise- ignored lower bytes can matter. To fix both problems, instead of generating the necessary loads using regular ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are provided with MMOs with a conservative address range. Unfortunately, I no longer have a failing test case (since PR19991 was reported, other changes in CodeGen have forced this bug back into hiding it again). Nevertheless, this should fix the underlying problem. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_35@215058 91177308-0d34-0410-b5e6-96231b3b80d8
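The two-loads-plus-permute scheme described in this message can be modelled on a byte string. This Python sketch is an illustration only: `lvx` here is a hypothetical stand-in for the aligned load behaviour (the address is rounded down to a 16-byte boundary), the permutation is modelled as a simple slice, and the second address follows the "offset by one vector length" wording of the message.

```python
VECTOR_BYTES = 16


def lvx(memory, address):
    """Model of an aligned vector load: the low address bits are ignored."""
    base = address & ~(VECTOR_BYTES - 1)
    return memory[base:base + VECTOR_BYTES]


def unaligned_load(memory, address):
    """Build an unaligned 16-byte load from two aligned loads and a 'permute'."""
    low = lvx(memory, address)
    high = lvx(memory, address + VECTOR_BYTES)
    shift = address % VECTOR_BYTES
    # The permutation selects 16 bytes starting at 'shift' from the 32 loaded.
    return (low + high)[shift:shift + VECTOR_BYTES]


memory = bytes(range(64))
loaded = unaligned_load(memory, 5)   # bytes 5..20
first_aligned = lvx(memory, 5)       # actually reads bytes 0..15
```

Note that `first_aligned` contains bytes 0 through 4, which lie below the requested address; that is the reason the message asks for memory operands with a conservative address range.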