external/arm-optimized-routines.git

Age	Commit message	Author
2022-12-09	pl/math: Add polynomial helpers	Joe Ramsay
	Add macros for simplifying polynomial evaluation using either Horner, pairwise Horner or Estrin. Several routines have been modified to use the new helpers. Readability is improved slightly, and we expect that this will make prototyping new routines simpler.
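As a rough illustration of the evaluation schemes named above, the following sketch compares Horner's scheme with Estrin's scheme for a cubic. The function names `horner` and `estrin_3` are hypothetical and are not the macros added by this commit; they only show how the two orderings differ.

```c
#include <stddef.h>

/* Horner's scheme: c[0] + x * (c[1] + x * (c[2] + ...)).
   n is the number of coefficients and must be at least 1.
   Every step depends on the previous one, so the chain is serial.  */
static double horner(const double *c, size_t n, double x)
{
  double acc = c[n - 1];
  for (size_t i = n - 1; i-- > 0;)
    acc = acc * x + c[i];
  return acc;
}

/* Estrin's scheme for a degree-3 polynomial:
   (c[0] + c[1] * x) + x^2 * (c[2] + c[3] * x).
   The two bracketed terms are independent and can be computed in parallel.  */
static double estrin_3(const double *c, double x)
{
  double x2 = x * x;
  double lo = c[0] + c[1] * x;
  double hi = c[2] + c[3] * x;
  return lo + x2 * hi;
}
```

Horner uses the fewest operations, while Estrin shortens the dependency chain at the cost of an extra multiplication, which tends to suit wide out-of-order cores.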
2022-12-09	pl/math/test: Simplify runulp.sh	Joe Ramsay
	Small simplification - pl routines do not support different rounding modes, so there is no need to support them in runulp.sh. As a result we can also remove Ldir.
2022-12-08	pl/math: Fix fenv in asinh	Joe Ramsay
	Special lanes were not being properly masked when a lane was tiny. This is now fixed.
2022-12-08	pl/math: Fix vector/SVE erf	Pierre Blanchard
	Fix a bug that could give effectively random results in the boring domain, by saturating the index at an appropriate value.
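A minimal sketch of the general idea of saturating a table index, assuming a lookup table of `table_len` entries indexed by a scaled input. The helper `saturate_index` is hypothetical and is not taken from the erf sources.

```c
#include <stddef.h>

/* Convert a scaled input to a table index, clamping it to the valid
   range [0, table_len - 1] so out-of-range input can never read past
   the end of the table. table_len must be at least 1.  */
static size_t saturate_index(double scaled, size_t table_len)
{
  if (!(scaled > 0.0))
    return 0; /* Negative, zero and NaN all map to the first entry.  */
  if (scaled >= (double) (table_len - 1))
    return table_len - 1; /* Large input saturates at the last entry.  */
  return (size_t) scaled; /* In range: truncate towards zero.  */
}
```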
2022-12-07	math: Set fenv exceptions for several Neon routines	Joe Ramsay
	In most cases, we mask lanes which should not trigger exceptions with a neutral value, then let the existing special-case handler fix them up later. For exp and exp2 we replace the more complex special-case handler with a simple scalar fallback. All new behaviour is tested in runulp.sh, with a new option to pass -f to the run line. We also extend the fenv testing to Neon log and logf, which already triggered exceptions correctly. New behaviour is mostly hidden behind a new config setting, WANT_SIMD_EXCEPT.
2022-12-07	string: arm: Fix cfi restore info for hot loop exit	Victor Do Nascimento
	The branch out of the core memchr loop to label 60 jumps over the popping of registers r4-r7. The restoration of the cfi state at 60 is adjusted to reflect this fact, avoiding restoring a state where r4-r7 have already been popped off the stack. Built w/ arm-none-linux-gnueabihf, ran make check-string w/ qemu-arm-static.
2022-12-07	string: arm: Ensure correct cfi state at strcmp entry	Victor Do Nascimento
	Move code fragment corresponding to L(fastpath_exit) to after function entry so that a .cfi_remember_state/.cfi_restore_state pair is not needed prior to strcmp start. The resulting reshuffle of code cleans up the entry part, fixing the .size directive calculation, which at present calculates the function size based on the address of __strcmp_arm and not L(strcmp_start_addr).
2022-12-06	pl/math: Set fenv flags in Neon asinhf	Joe Ramsay
	Routine no longer relies on vector log1pf, as this has to become more complex to deal with fenv itself. Instead we re-use a log1pf helper from Neon atanhf which does no special-case handling, instead leaving it all up to the main routine. We now just fall back to the scalar routine for special-case handling. This uncovered a mistake in asinhf's handling of NaNs, which has been fixed.
2022-12-05	pl/math: Avoid UB in scalar tanhf	Joe Ramsay
	The ldexp shortcut was left-shifting a signed value. We now bias the exponent first, which allows the shift to be done on an unsigned value.
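The pattern described above can be sketched as follows. The helper `pow2i` is hypothetical and only illustrates biasing before shifting; it is not the code from tanhf. Left-shifting a negative signed integer is undefined behaviour in C, whereas the biased exponent is non-negative and can be shifted as an unsigned value.

```c
#include <stdint.h>
#include <string.h>

/* Build the float 2^k directly from its bit pattern.
   Assumes k is in [-126, 127] so the result is a normal number.  */
static float pow2i(int32_t k)
{
  /* Bias first: k + 127 is in [1, 254], so the cast is value-preserving.  */
  uint32_t biased = (uint32_t) (k + 127);
  /* Shift the unsigned biased exponent into bits 23..30.  */
  uint32_t bits = biased << 23;
  float f;
  memcpy(&f, &bits, sizeof f); /* Type-pun without aliasing issues.  */
  return f;
}
```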
2022-11-30	pl/math: Add scalar and vector/Neon tanhf	Joe Ramsay
	Both routines use simplified inline versions of expm1f, and are accurate to 2.6 ULP.
2022-11-29	pl/math: Add vector/Neon asinh	Joe Ramsay
	New routine uses two separate algorithms for input greater and less than 1 (similar to the scalar routine). It is accurate to 2.5 ULP.
2022-11-24	pl/math: Update ULP threshold for vector atans	Joe Ramsay
	New max observed for both Neon and SVE.
2022-11-22	pl/math: Add scalar & vector/Neon cbrtf	Joe Ramsay
	Both routines use the same algorithm - one Newton iteration with the initial guess obtained by a low-order polynomial. Scalar is used as a fallback for subnormal and special cases for the vector routine, which allows vastly simplified argument reduction and reassembly. Both routines are accurate to 1.5 ULP.
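To make the shape of that algorithm concrete, here is a deliberately crude sketch of "polynomial guess plus one Newton step" for a cube root on the reduced interval [0.5, 1]. Everything in it is an assumption for illustration: the name `cbrt_sketch`, the use of double precision, and the linear initial guess (the tangent line of the cube root at 1), which is far less accurate than the fitted polynomial a real routine would use.

```c
/* Approximate the cube root of x for x in [0.5, 1].
   Illustrative only: accuracy is around 1e-3, nowhere near 1.5 ULP.  */
static double cbrt_sketch(double x)
{
  /* Initial guess: tangent line of cbrt at 1, i.e. 1 + (x - 1) / 3.  */
  double y = (x + 2.0) / 3.0;
  /* One Newton step for f(y) = y^3 - x:
     y - (y^3 - x) / (3 y^2) = (2 y + x / y^2) / 3.  */
  return (2.0 * y + x / (y * y)) / 3.0;
}
```

Argument reduction (scaling the input into the reduced interval and reassembling the exponent afterwards) is omitted here.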
2022-11-22	pl/math: Add scalar and vector/Neon atanhf	Joe Ramsay
	Both routines are based on a simplified version of log1pf, and are accurate to 3.1 ULP. Also enabled -c flag from runulp.sh - we need this for atanhf so that we can set the control lane to something other than 1, since atanh(1) is infinite.
2022-11-17	Build the optimized memset(). am: 7aea3e8806 am: b5227be158 am: b3a2c82c1f	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572 Change-Id: I026f09de07f4abc5fb111ba3493bbee9bfc58ab7 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17	Build the optimized memset(). am: 7aea3e8806 am: b5227be158	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572 Change-Id: Ifdb9788e5c6758265ca04f50530c9656883bc222 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17	Build the optimized memset(). am: 7aea3e8806	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572 Change-Id: I1c3f88174e00550ab598f22638364dde4fd79c96 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17	Build the optimized memcpy() and memmove(). am: 0184f5179e am: 01348af37a am: 722a6d8de7	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835 Change-Id: Ia96ff6bbc0b2fb7c7478573c9585800fbb26a47e Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17	Build the optimized memcpy() and memmove(). am: 0184f5179e am: 01348af37a	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835 Change-Id: I1cc3d0c50479d1f86b761ff7f09ebb99390cdc1c Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17	Build the optimized memcpy() and memmove(). am: 0184f5179e	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835 Change-Id: Iafefe4c746d77725505fe364a96011b1c1c7bf7e Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-17	Build the optimized memset().	Elliott Hughes
	Test: treehugger Change-Id: Ibae0859e9683d10ba53113baeba26f720d44d674
2022-11-17	string: arm: Refactor ENTRY/END macros	Szabolcs Nagy
	The .fnstart/.fnend directives can be inlined now that asmdefs.h is arm specific.
2022-11-17	string: arm: Use /**/ comments in asmdefs.h	Szabolcs Nagy
	This is preprocessed asm code, so /**/ style comments are most appropriate.
2022-11-17	string: arm: Include asmdefs.h even into empty asm files	Szabolcs Nagy
	Currently this is not expected to change behaviour, but if global directives are added in asmdefs.h (like .thumb) those should be in all asm files in case the link ABI is affected.
2022-11-17	string: Add separate asmdefs.h per target	Szabolcs Nagy
	The definitions in this header are necessarily target specific, so better to have a separate version in each target directory.
2022-11-17	string: arm: Fix build failure	Szabolcs Nagy
	asmdefs.h ifdef logic was wrong: arm only macro definitions were outside of defined(__arm__). Added some ifdef indentation to make the code more readable.
2022-11-17	math/test: Fix ulp for non-AArch64 targets	Joe Ramsay
	fv and dv are only declared under __aarch64__ - for other targets the new -c option should be disabled.
2022-11-17	pl/math: Add scalar & vector/Neon cosh	Joe Ramsay
	New routines are based on double-precision exp, both accurate to 2 ULP.
2022-11-17	pl/math: Add scalar and vector/Neon sinh	Joe Ramsay
	New routines are based on the single-precision versions and are accurate to 3 ULP.
2022-11-17	Build the optimized memcpy() and memmove().	Elliott Hughes
	Test: treehugger Change-Id: I545117b6b8283f0b5eca2f4591579393720c7960
2022-11-16	Merge "Add mem, str functions to baremetal static lib" am: 46490941f7 am: d85c7f4e12 am: dfd77cd57c	Treehugger Robot
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393 Change-Id: I1b1d024d51f30ad9f686e7581eb15ce76016dbac Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-16	Merge "Add mem, str functions to baremetal static lib" am: 46490941f7 am: d85c7f4e12	Treehugger Robot
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393 Change-Id: Ifeae3e8c493fd232b15fbe2127c749faef337b1d Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-16	Merge "Add mem, str functions to baremetal static lib" am: 46490941f7	Treehugger Robot
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393 Change-Id: Iaf6a1e6449a778895618ba47eabed3fde7087fe5 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-16	Merge "Add mem, str functions to baremetal static lib"	Treehugger Robot

2022-11-15	Android.bp: Change file mode to non-executable am: 3d7d7fab1c am: b32a07c077 am: 892ed4034a	Pierre-Clément Tosi
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998 Change-Id: I3c887444b18412dd598a7f5ca5edacfb040e69a6 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-15	Android.bp: Change file mode to non-executable am: 3d7d7fab1c am: b32a07c077	Pierre-Clément Tosi
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998 Change-Id: I7760d1660e6a5f0d966f0d579986c11337210dbe Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-15	Android.bp: Change file mode to non-executable am: 3d7d7fab1c	Pierre-Clément Tosi
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998 Change-Id: Ia1b150c236a53799d031e1e45dc9c7dc001b748e Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-11-15	pl/math: Use order-6 polynomial in Vector/Neon log2	Nicholas Dingle
	Reduce the order of the polynomial used in Neon log2 by one (from 7 to 6). In order to calculate the new coefficients required we rescale the coefficients from log_data.c by log2(e) in extended precision and round back. The maximum observed error is unchanged (2.59 ULPs) but the point at which it is observed has changed slightly.
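The rescaling step relies on the identity log2(x) = ln(x) * log2(e). A minimal sketch of "rescale in extended precision and round back" might look like the following; the function name `rescale_to_log2` and the use of `long double` as the extended type are assumptions, and the actual coefficients from log_data.c are not reproduced here.

```c
#include <stddef.h>

/* Multiply each natural-log polynomial coefficient by log2(e) in
   long double, then round the product once back to double.  */
static void rescale_to_log2(const double *ln_coeffs, double *log2_coeffs,
                            size_t n)
{
  /* log2(e) written with more digits than double can hold.  */
  const long double log2e = 1.442695040888963407359924681001892137L;
  for (size_t i = 0; i < n; i++)
    log2_coeffs[i] = (double) ((long double) ln_coeffs[i] * log2e);
}
```

Doing the multiplication in a wider type avoids stacking a second rounding error on top of each coefficient, provided `long double` is genuinely wider than `double` on the build machine.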
2022-11-15	math/test: Allow user to set control element of input vector	Joe Ramsay
	argf and argd have been designed such that non-special input is tested, optionally followed by a vector with one special lane. To be able to test that vector functions have correct behaviour w.r.t. fenv exceptions, we need to be able to choose a different value for the last lane, as using 1 leads to false negatives when testing a function for which 1 is a special value. We add an option, -c, for the user to provide a different control value.
2022-11-15	Android.bp: Change file mode to non-executable	Pierre-Clément Tosi
	The file is a build configuration and shouldn't be executable. Test: TH # No change intended Change-Id: I76ab86cd2971160b7f376bdda0d59da36c50a59b
2022-11-15	Add mem, str functions to baremetal static lib	Pierre-Clément Tosi
	Add more libc helper functions, to be used by baremetal Rust targets. Bug: 255521657 Test: atest vmbase_example.integration_test # used by aosp/2138640 Change-Id: I6ec50bc37d0851c5fd47902f34a25b6178e36ed3
2022-11-15	pl/math: Change conflicting variable names	Joe Ramsay
	The math-tests and math-rtests variable names collide between math/ and pl/math, which can lead to failures if both are run concurrently. We rename the pl-specific lists to avoid this.
2022-11-11	pl/math: Fix minus zero in vector expm1	Joe Ramsay
	Extra special-case check.
2022-11-11	pl/math: Fix SVE mathbench wrappers	Joe Ramsay
	These were broken in the previous patch, now fixed.
2022-11-09	pl/math/test: Simplify ulp and bench macros	Joe Ramsay
	Reduces the amount of boilerplate developers need to write for new routines.
2022-11-09	pl/math: Add vector/Neon expm1	Joe Ramsay
	New routine is a vector port of the scalar algorithm, with fallback to the scalar variant for large and special input. This enables us to simplify elements of the algorithm which were necessary for large input. It also means that, as long as we fall back to the scalar for tiny input as well (dependent on the value of WANT_ERRNO), the routine sets fenv flags correctly.
2022-11-09	pl/math: Add scalar expm1	Joe Ramsay
	New routine uses the same algorithm as the single-precision routine, and is accurate to 2.5 ULP.
2022-11-09	pl/math: Add scalar and vector/Neon coshf	Add joeram01
	New routines use single-precision exp, which has been copied from math/. Scalar is accurate to 1.9 ULP, Neon to 2.4 ULP. Also use the new expf helper in scalar sinhf.
2022-11-09	Make fenv checking dependent on WANT_ERRNO	Joe Ramsay
	We want these tests to pass regardless of whether the user has enabled or disabled WANT_ERRNO - this is now supported by a WANT_ERRNO config option, which will be added to config.mk.dist in a follow-on.
2022-11-09	Fix tests for WANT_SVE_MATH=1	Joe Ramsay
	Also skips the test line when D is not "fenv" or empty, i.e. when it is the SVE if statement. This used to work but was broken by adding the D variable, so the tests did not run properly when WANT_SVE_MATH was enabled. Now fixed.