Age Commit message (Author)
2022-12-09 pl/math: Add polynomial helpers (Joe Ramsay)
Add macros for simplifying polynomial evaluation using either Horner, pairwise Horner or Estrin. Several routines have been modified to use the new helpers. Readability is improved slightly, and we expect that this will make prototyping new routines simpler.
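The three evaluation schemes named above can be sketched as follows. This is a minimal illustration and not the macros added by this commit: the function names, the fixed degree of 7 and the coefficient layout (c[i] multiplies x^i) are all assumptions, and the pairwise Horner form shown is one common reading of the term.

```c
/* Illustrative degree-7 evaluators. Names and signatures are hypothetical,
   not the pl/math helper macros. c[i] is the coefficient of x^i. */

/* Horner: a single serial chain of multiply-adds. */
static inline double
horner_7 (double x, const double *c)
{
  double r = c[7];
  for (int i = 6; i >= 0; i--)
    r = c[i] + x * r;
  return r;
}

/* Pairwise Horner: form the pairs c[2i] + x * c[2i+1] independently,
   then run Horner over the pairs in x^2. */
static inline double
pairwise_horner_7 (double x, const double *c)
{
  double x2 = x * x;
  double p01 = c[0] + x * c[1], p23 = c[2] + x * c[3];
  double p45 = c[4] + x * c[5], p67 = c[6] + x * c[7];
  return p01 + x2 * (p23 + x2 * (p45 + x2 * p67));
}

/* Estrin: combine the same pairs as a balanced tree using x^2 and x^4,
   which shortens the dependency chain further. */
static inline double
estrin_7 (double x, const double *c)
{
  double x2 = x * x, x4 = x2 * x2;
  double p01 = c[0] + x * c[1], p23 = c[2] + x * c[3];
  double p45 = c[4] + x * c[5], p67 = c[6] + x * c[7];
  return (p01 + x2 * p23) + x4 * (p45 + x2 * p67);
}
```

All three compute the same polynomial. They differ in how much instruction-level parallelism they expose and in how rounding errors accumulate, so the results may differ in the last bits for general inputs.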
2022-12-09 pl/math/test: Simplify runulp.sh (Joe Ramsay)
Small simplification - pl routines do not support different rounding modes, so there is no need to support them in runulp.sh. As a result we can also remove Ldir.
2022-12-08 pl/math: Fix fenv in asinh (Joe Ramsay)
Special lanes were not being properly masked when a lane was tiny. This is now fixed.
2022-12-08 pl/math: Fix vector/SVE erf (Pierre Blanchard)
Fix a bug that could produce arbitrary results in the boring domain by saturating the index at an appropriate value.
2022-12-07 math: Set fenv exceptions for several Neon routines (Joe Ramsay)
In most cases, we mask lanes which should not trigger exceptions with a neutral value, then let the existing special-case handler fix them up later. For exp and exp2 we replace the more complex special-case handler with a simple scalar fallback. All new behaviour is tested in runulp.sh, with a new option to pass -f to the run line. We also extend the fenv testing to Neon log and logf, which already triggered exceptions correctly. New behaviour is mostly hidden behind a new config setting, WANT_SIMD_EXCEPT.
2022-12-07 string: arm: Fix cfi restore info for hot loop exit (Victor Do Nascimento)
The branch out of the core memchr loop to label 60 jumps over the popping of registers r4-r7. The restoration of the cfi state at 60 is adjusted to reflect this fact, avoiding restoring a state where r4-r7 have already been popped off the stack. Built w/ arm-none-linux-gnueabihf, ran make check-string w/ qemu-arm-static.
2022-12-07 string: arm: Ensure correct cfi state at strcmp entry (Victor Do Nascimento)
Move code fragment corresponding to L(fastpath_exit) to after function entry so that a .cfi_remember_state/.cfi_restore_state pair is not needed prior to strcmp start. The resulting reshuffle of code cleans up the entry part, fixing the .size directive calculation, which at present calculates the function size based on the address of __strcmp_arm and not L(strcmp_start_addr).
2022-12-06 pl/math: Set fenv flags in Neon asinhf (Joe Ramsay)
Routine no longer relies on vector log1pf, as this has to become more complex to deal with fenv itself. Instead we re-use a log1pf helper from Neon atanhf which does no special-case handling, instead leaving it all up to the main routine. We now just fall back to the scalar routine for special-case handling. This uncovered a mistake in asinhf's handling of NaNs, which has been fixed.
2022-12-05 pl/math: Avoid UB in scalar tanhf (Joe Ramsay)
The ldexp shortcut was left-shifting a signed value. We now bias the exponent first, which allows the shift to be done on an unsigned value.
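The kind of fix described here can be illustrated with a small sketch. It is not the code from tanhf: the helper name pow2i and its valid range are assumptions chosen to show why biasing before the shift avoids undefined behaviour.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 2^k as a float for k in [-126, 127]. */
static inline float
pow2i (int32_t k)
{
  /* Left-shifting a negative signed integer is undefined behaviour in C, so
     (k << 23) is unsafe for k < 0. Adding the exponent bias (127) first makes
     the value non-negative, and the shift is then done on a uint32_t. */
  uint32_t bits = (uint32_t) (k + 127) << 23;
  float f;
  memcpy (&f, &bits, sizeof f); /* Reinterpret the bits without aliasing UB. */
  return f;
}
```

For example, pow2i (-1) builds the bit pattern with biased exponent 126 and returns 0.5f without ever shifting a negative value.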
2022-11-30 pl/math: Add scalar and vector/Neon tanhf (Joe Ramsay)
Both routines use simplified inline versions of expm1f, and are accurate to 2.6 ULP.
2022-11-29 pl/math: Add vector/Neon asinh (Joe Ramsay)
New routine uses two separate algorithms for input greater and less than 1 (similar to the scalar routine). It is accurate to 2.5 ULP.
2022-11-24 pl/math: Update ULP threshold for vector atans (Joe Ramsay)
New max observed for both Neon and SVE.
2022-11-22 pl/math: Add scalar & vector/Neon cbrtf (Joe Ramsay)
Both routines use the same algorithm - one Newton iteration with the initial guess obtained by a low-order polynomial. Scalar is used as a fallback for subnormal and special cases for the vector routine, which allows vastly simplified argument reduction and reassembly. Both routines are accurate to 1.5 ULP.
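The overall shape of such an algorithm (reduce, polynomial guess, one Newton step, reassemble) can be sketched as below. Everything specific here is an assumption for illustration: the function name, the use of frexpf for the reduction, and above all the linear initial guess, which is a simple chord fit and not the polynomial used by the routine. As a result this sketch is only accurate to a relative error of roughly 1e-4, far from the 1.5 ULP quoted above.

```c
#include <math.h>

/* Hypothetical sketch: cube root via reduction, a low-order polynomial
   guess and a single Newton iteration. */
static float
cbrtf_sketch (float x)
{
  if (x == 0.0f || !isfinite (x))
    return x; /* Zero, infinity and NaN pass through unchanged. */

  int e;
  float m = frexpf (fabsf (x), &e); /* |x| = m * 2^e with m in [0.5, 1). */

  /* Initial guess for cbrt(m): the chord through (0.5, cbrt(0.5)) and (1, 1).
     Illustrative coefficients only. */
  float y = 0.5874f + 0.4126f * m;

  /* One Newton iteration on f(y) = y^3 - m. */
  y = (2.0f * y + m / (y * y)) / 3.0f;

  /* Reassemble: with e = 3q + r and r in [-2, 2] (C division truncates
     towards zero), cbrt(2^e) = 2^q * 2^(r/3). */
  static const float scale[5]
    = { 0.62996052f, 0.79370053f, 1.0f, 1.25992105f, 1.58740105f };
  int q = e / 3;
  int r = e % 3;
  return copysignf (ldexpf (y * scale[r + 2], q), x);
}
```

A better initial polynomial is what lets a single Newton step reach near-full single precision; the structure of the computation stays the same.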
2022-11-22 pl/math: Add scalar and vector/Neon atanhf (Joe Ramsay)
Both routines are based on a simplified version of log1pf, and are accurate to 3.1 ULP. Also enabled -c flag from runulp.sh - we need this for atanhf so that we can set the control lane to something other than 1, since atanh(1) is infinite.
2022-11-17 Build the optimized memset(). am: 7aea3e8806 am: b5227be158 am: b3a2c82c1f (Elliott Hughes)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572 Change-Id: I026f09de07f4abc5fb111ba3493bbee9bfc58ab7 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17 Build the optimized memset(). am: 7aea3e8806 am: b5227be158 (Elliott Hughes)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572 Change-Id: Ifdb9788e5c6758265ca04f50530c9656883bc222 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17 Build the optimized memset(). am: 7aea3e8806 (Elliott Hughes)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572 Change-Id: I1c3f88174e00550ab598f22638364dde4fd79c96 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17 Build the optimized memcpy() and memmove(). am: 0184f5179e am: 01348af37a am: 722a6d8de7 (Elliott Hughes)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835 Change-Id: Ia96ff6bbc0b2fb7c7478573c9585800fbb26a47e Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17 Build the optimized memcpy() and memmove(). am: 0184f5179e am: 01348af37a (Elliott Hughes)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835 Change-Id: I1cc3d0c50479d1f86b761ff7f09ebb99390cdc1c Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17 Build the optimized memcpy() and memmove(). am: 0184f5179e (Elliott Hughes)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835 Change-Id: Iafefe4c746d77725505fe364a96011b1c1c7bf7e Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17 Build the optimized memset(). (Elliott Hughes)
Test: treehugger Change-Id: Ibae0859e9683d10ba53113baeba26f720d44d674
2022-11-17 string: arm: Refactor ENTRY/END macros (Szabolcs Nagy)
The .fnstart/.fnend directives can be inlined now that asmdefs.h is arm specific.
2022-11-17 string: arm: Use /**/ comments in asmdefs.h (Szabolcs Nagy)
This is preprocessed asm code, so /**/ style comments are most appropriate.
2022-11-17 string: arm: Include asmdefs.h even into empty asm files (Szabolcs Nagy)
Currently this is not expected to change behaviour, but if global directives are added in asmdefs.h (like .thumb) those should be in all asm files in case the link ABI is affected.
2022-11-17 string: Add separate asmdefs.h per target (Szabolcs Nagy)
The definitions in this header are necessarily target specific, so better to have a separate version in each target directory.
2022-11-17 string: arm: Fix build failure (Szabolcs Nagy)
asmdefs.h ifdef logic was wrong: arm only macro definitions were outside of defined(__arm__). Added some ifdef indentation to make the code more readable.
2022-11-17 math/test: Fix ulp for non-AArch64 targets (Joe Ramsay)
fv and dv are only declared under __aarch64__ - for other targets the new -c option should be disabled.
2022-11-17 pl/math: Add scalar & vector/Neon cosh (Joe Ramsay)
New routines are based on double-precision exp, both accurate to 2 ULP.
2022-11-17 pl/math: Add scalar and vector/Neon sinh (Joe Ramsay)
New routines are based on the single-precision versions and are accurate to 3 ULP.
2022-11-17 Build the optimized memcpy() and memmove(). (Elliott Hughes)
Test: treehugger Change-Id: I545117b6b8283f0b5eca2f4591579393720c7960
2022-11-16 Merge "Add mem*, str* functions to baremetal static lib" am: 46490941f7 am: d85c7f4e12 am: dfd77cd57c (Treehugger Robot)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393 Change-Id: I1b1d024d51f30ad9f686e7581eb15ce76016dbac Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-16 Merge "Add mem*, str* functions to baremetal static lib" am: 46490941f7 am: d85c7f4e12 (Treehugger Robot)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393 Change-Id: Ifeae3e8c493fd232b15fbe2127c749faef337b1d Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-16 Merge "Add mem*, str* functions to baremetal static lib" am: 46490941f7 (Treehugger Robot)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393 Change-Id: Iaf6a1e6449a778895618ba47eabed3fde7087fe5 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-16 Merge "Add mem*, str* functions to baremetal static lib" (Treehugger Robot)
2022-11-15 Android.bp: Change file mode to non-executable am: 3d7d7fab1c am: b32a07c077 am: 892ed4034a (Pierre-Clément Tosi)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998 Change-Id: I3c887444b18412dd598a7f5ca5edacfb040e69a6 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-15 Android.bp: Change file mode to non-executable am: 3d7d7fab1c am: b32a07c077 (Pierre-Clément Tosi)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998 Change-Id: I7760d1660e6a5f0d966f0d579986c11337210dbe Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-15 Android.bp: Change file mode to non-executable am: 3d7d7fab1c (Pierre-Clément Tosi)
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998 Change-Id: Ia1b150c236a53799d031e1e45dc9c7dc001b748e Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-15 pl/math: Use order-6 polynomial in Vector/Neon log2 (Nicholas Dingle)
Reduce the order of the polynomial used in Neon log2 by one (from 7 to 6). In order to calculate the new coefficients required we rescale the coefficients from log_data.c by log2(e) in extended precision and round back. The maximum observed error is unchanged (2.59 ULPs) but the point at which it is observed has changed slightly.
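The rescaling step described above, multiplying each natural-log coefficient by log2(e) in a wider type and rounding once, might look like the following. This is a sketch under assumptions: the function name and array arguments are hypothetical, and long double stands in for "extended precision", which only helps on targets where long double is actually wider than double.

```c
#include <stddef.h>

/* log2(e) = 1 / ln(2), written with more digits than a double can hold. */
#define LOG2E_L 1.442695040888963407359924681001892137L

/* Hypothetical helper: scale coefficients of a natural-log polynomial so
   that the polynomial approximates log2 instead. Each product is formed in
   long double and rounded to double exactly once. */
static void
rescale_to_log2 (const double *ln_coeffs, double *log2_coeffs, size_t n)
{
  for (size_t i = 0; i < n; i++)
    log2_coeffs[i] = (double) ((long double) ln_coeffs[i] * LOG2E_L);
}
```

Forming the product in double would round log2(e) to double first and then round the product again; doing the multiplication in the wider type avoids most of that extra error.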
2022-11-15 math/test: Allow user to set control element of input vector (Joe Ramsay)
argf and argd have been designed such that non-special input is tested, optionally followed by a vector with one special lane. To be able to test that vector functions have correct behaviour w.r.t. fenv exceptions, we need to be able to choose a different value for the last lane, as using 1 leads to false negatives when testing a function for which 1 is a special value. We add an option, -c, for the user to provide a different control value.
2022-11-15 Android.bp: Change file mode to non-executable (Pierre-Clément Tosi)
The file is a build configuration and shouldn't be executable. Test: TH # No change intended Change-Id: I76ab86cd2971160b7f376bdda0d59da36c50a59b
2022-11-15 Add mem*, str* functions to baremetal static lib (Pierre-Clément Tosi)
Add more libc helper functions, to be used by baremetal Rust targets. Bug: 255521657 Test: atest vmbase_example.integration_test # used by aosp/2138640 Change-Id: I6ec50bc37d0851c5fd47902f34a25b6178e36ed3
2022-11-15 pl/math: Change conflicting variable names (Joe Ramsay)
There is a collision for math-tests and math-rtests between math/ and pl/math, which can lead to failures if both are run concurrently. We rename the pl-specific lists to avoid this.
2022-11-11 pl/math: Fix minus zero in vector expm1 (Joe Ramsay)
Extra special-case check.
2022-11-11 pl/math: Fix SVE mathbench wrappers (Joe Ramsay)
These were broken in the previous patch, now fixed.
2022-11-09 pl/math/test: Simplify ulp and bench macros (Joe Ramsay)
Reduces the amount of boilerplate developers need to write for new routines.
2022-11-09 pl/math: Add vector/Neon expm1 (Joe Ramsay)
New routine is a vector port of the scalar algorithm, with fallback to the scalar variant for large and special input. This enables us to simplify elements of the algorithm which were necessary for large input. It also means that, as long as we fall back to the scalar for tiny input as well (dependent on the value of WANT_ERRNO), the routine sets fenv flags correctly.
2022-11-09 pl/math: Add scalar expm1 (Joe Ramsay)
New routine uses the same algorithm as the single-precision routine, and is accurate to 2.5 ULP.
2022-11-09 pl/math: Add scalar and vector/Neon coshf (Add joeram01)
New routines use single-precision exp, which has been copied from math/. Scalar is accurate to 1.9 ULP, Neon to 2.4 ULP. Also use the new expf helper in scalar sinhf.
2022-11-09 Make fenv checking dependent on WANT_ERRNO (Joe Ramsay)
We want these tests to pass regardless of whether the user has enabled or disabled WANT_ERRNO - this is now supported by a WANT_ERRNO config option, which will be added to config.mk.dist in a follow-on.
2022-11-09 Fix tests for WANT_SVE_MATH=1 (Joe Ramsay)
Also skips the test line when D is not "fenv" or empty, i.e. when it is the SVE if statement. This used to work but was broken by adding the D variable, so the tests did not run properly when WANT_SVE_MATH was enabled. Now fixed.