external/arm-optimized-routines.git

Age	Commit message	Author
2023-01-27	Build SVE routines. am: ffea11cb14 am: 089cb05b99 am: d83151a6a0 (refs: android-14.0.0_r37 android-14.0.0_r36 android-14.0.0_r35 android-14.0.0_r34 android-14.0.0_r33 android-14.0.0_r32 android-14.0.0_r31 android-14.0.0_r30 android-14.0.0_r29 android-14.0.0_r27 android-14.0.0_r26 android-14.0.0_r25 android-14.0.0_r24 android-14.0.0_r23 android-14.0.0_r22 android-14.0.0_r21 android-14.0.0_r20 android-14.0.0_r19 android-14.0.0_r18 android-14.0.0_r17 android-14.0.0_r16 aml_rkp_341510000 aml_rkp_341311000 aml_rkp_341114000 aml_rkp_341015010 aml_rkp_341012000 aml_hef_341613000 aml_hef_341512030 aml_hef_341415040 aml_hef_341311010 aml_hef_341114030 aml_cfg_341510000 android14-qpr2-s5-release android14-qpr2-s4-release android14-qpr2-s3-release android14-qpr2-s2-release android14-qpr2-s1-release android14-qpr2-release android14-qpr1-s2-release android14-qpr1-release android14-mainline-healthfitness-release android14-dev)	Jake Weinstein
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2292660 Change-Id: I66ad991cf1fac06969f6ef33a633feb1c7b0f9b4 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2023-01-27	Build SVE routines. am: ffea11cb14 am: 089cb05b99	Jake Weinstein
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2292660 Change-Id: Ia9342d61d347c848df56818a754589a4e94bba9c Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2023-01-27	Build SVE routines. am: ffea11cb14 (refs: android-u-beta-1-gpl)	Jake Weinstein
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2292660 Change-Id: I426442ab3a92baaf5697cc7a95a27fec14121094 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2023-01-26	Build SVE routines.	Jake Weinstein
	Test: treehugger Change-Id: I64eb06a9d17c229abb026439d0cdd36ba646eaf4
2023-01-26	Upgrade ARM-software/optimized-routines to v23.01 am: 62662f115a am: 8783f524be am: 8a481eb48c	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2402215 Change-Id: Id535bcfef3f5b328f4666742a1623794a20f982d Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2023-01-25	Upgrade ARM-software/optimized-routines to v23.01 am: 62662f115a am: 8783f524be	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2402215 Change-Id: Ief5438cf89d86c43e5f52d5e48b882bacb945170 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2023-01-25	Upgrade ARM-software/optimized-routines to v23.01 am: 62662f115a	Elliott Hughes
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2402215 Change-Id: Ic29e1a6f8a0485f6317699bfda38fb79e7022ee1 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2023-01-25	Upgrade ARM-software/optimized-routines to v23.01	Elliott Hughes
	This project was upgraded with external_updater. Usage: tools/external_updater/updater.sh update arm-optimized-routines For more info, check https://cs.android.com/android/platform/superproject/+/master:tools/external_updater/README.md Test: TreeHugger Change-Id: I03de0ecb0ea89c2a77b1c411685af09fd480b63e
2023-01-24	v23.01 release	Szabolcs Nagy
	* Project changes
	  * All files are under a new dual license now (MIT OR Apache-2.0 WITH LLVM-exception at the election of the user).
	  * Added MAINTAINERS file describing who maintains the subdirectories.
	  * Added README.contributors files documenting contribution requirements.
	  * Added new pl/ subdirectory for Arm's Performance Library related routines.
	* String routine changes
	  * Added memset benchmark.
	  * Improved strlen and memcpy benchmarks.
	  * Added SVE memcpy.
	  * Updated arm string functions to support M-profile PACBTI.
	  * Merged the MTE and generic versions of strcmp, strncmp, strcpy and stpcpy into one implementation.
	  * Optimized memcmp, memchr-mte, memrchr, strchr-mte, strchrnul-mte, strrchr-mte, strlen, strlen-mte, strnlen, strcpy.
	* Math routine changes
	  * Fixed constants in sinf, cosf and sincosf to be compile time computed even with gcc-12 -frounding-math.
	  * Fixed an invalid shift in logf.
	  * Support floating-point exceptions in vector math routines when WANT_SIMD_EXCEPT is set.
2023-01-24	pl/math: Fix a copyright notice for consistency	Szabolcs Nagy
	The (c) is not strictly required, but it was only missing from one file.
2023-01-24	Update copyright years	Szabolcs Nagy
	Scripted copyright year updates based on git committer date.
2023-01-24	string: Improve SVE memcpy	Wilco Dijkstra
	Improve SVE memcpy by copying 2 vectors. This avoids a check on vector length and improves performance of random memcpy.
2023-01-23	pl/math: Reduce order of single-precision tan polynomial	Joe Ramsay
	For both vector and scalar routines we reduce the order from 6 to 5. For vector routines, this requires reducing RangeVal as for large values the tan polynomial is not quite accurate enough. However the cotan polynomial is used in the inaccurate region in the scalar routine, so this does not need to change. Accuracy of scalar routine is unchanged. Accuracy in both vector routines is now 3.45 ULP, with the same worst-case.
2023-01-19	pl/math: Add vector/Neon tan	Joe Ramsay
	New routine uses a similar technique to the single-precision Neon routine, but with an extra reduction to pi/8 using the double-angle formula. It is accurate to 3.5 ULP.
2023-01-10	string: Compile memcpy-sve.S for aarch64 if compiler supports it	Jake Weinstein
	This is a partial revert of b7e368fb. If SVE assembly is guarded by __ARM_FEATURE_SVE, it cannot build when SVE is not enabled by the build system. This is ok on AOR, but because Android (bionic) uses ifuncs to select the appropriate assembly at runtime, these need to compile regardless of if the target actually supports the instructions. Check for AArch64 and GCC >= 8 or Clang >= 5 so that SVE is not used on compilers that do not support it. This condition will always be true on future builds of Android for AArch64.
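	The guard described in this commit can be sketched as a preprocessor check on the target and compiler version rather than on `__ARM_FEATURE_SVE`. This is only an illustration under assumptions: the macro `HAVE_SVE_ASM_SKETCH` and the function `sve_asm_can_build` are hypothetical names, not identifiers from the repository, and the exact condition used in memcpy-sve.S may differ.

```c
/* HAVE_SVE_ASM_SKETCH is a hypothetical name used only for illustration.
   Clang also defines __GNUC__, so the two compilers are tested separately. */
#if defined(__aarch64__) \
    && ((defined(__clang__) && __clang_major__ >= 5) \
        || (!defined(__clang__) && defined(__GNUC__) && __GNUC__ >= 8))
# define HAVE_SVE_ASM_SKETCH 1
#else
# define HAVE_SVE_ASM_SKETCH 0
#endif

/* Returns 1 when the SVE source would be compiled in, 0 otherwise.
   This says nothing about whether the CPU supports SVE at run time;
   that decision is left to the ifunc resolver. */
int sve_asm_can_build(void)
{
  return HAVE_SVE_ASM_SKETCH;
}
```

	Keying the guard on compiler capability instead of the enabled feature set is what lets the routine be assembled even when the build does not turn SVE on by default.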
2023-01-10	string: Optimize strcpy	Wilco Dijkstra
	Optimize strcpy main loop - large strings are ~22% faster.
2023-01-10	string: Improve strrchr-mte	Wilco Dijkstra
	Use shrn for narrowing the mask which simplifies code. Unroll the strchr search loop which improves performance on large strings.
2023-01-09	pl/math: Fix benchmark entries for SVE bivariate functions	Pierre Blanchard
	Variant was wrongly set in structures used to benchmark SVE functions. Before this change only half of the lanes were set as expected. Also reformat for ease of reading.
2023-01-06	pl/math: Update copyright years	Joe Ramsay
	All files in pl/math updated to 2023.
2023-01-05	Rewrite two abs masks as literals	Joe Ramsay
	These were technically undefined behaviour - they have been rewritten without the shift so that their type is unsigned int by default.
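	As a brief illustration of the issue: with a 32-bit `int`, an expression such as `1 << 31` shifts into the sign bit of a signed value, which is undefined behaviour in C. The sketch below writes the masks as unsigned literals instead. The names are hypothetical, the explicit `u` suffix is a choice made here for clarity, and a 32-bit IEEE 754 `float` is assumed; the masks in the repository may be spelled differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names; not the identifiers used in the library. */
#define SIGN_MASK_SKETCH 0x80000000u /* unsigned literal, no shift involved */
#define ABS_MASK_SKETCH  0x7fffffffu /* every bit except the sign bit */

/* Clear the sign bit of a float by masking its bit pattern.
   Assumes float is a 32-bit IEEE 754 value. */
float fabsf_sketch(float x)
{
  uint32_t bits;
  memcpy(&bits, &x, sizeof bits); /* well-defined type punning */
  bits &= ABS_MASK_SKETCH;
  memcpy(&x, &bits, sizeof x);
  return x;
}
```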
2023-01-05	pl/math: Add vector/Neon acosh	Joe Ramsay
	New routine is based on a vector implementation from log1p, which has been reused (with some modification for improved accuracy close to 0) from Neon atanh. Accurate to 3.5 ULP.
2023-01-05	pl/math: Add vector/Neon acoshf	Joe Ramsay
	New routine uses inlined log1pf helper, and is accurate to 3.1 ULP (2.8 ULP if fp exceptions are enabled).
2023-01-05	pl/math: Add scalar & vector/Neon tanh	Joe Ramsay
	New routines use the same algorithm, reliant on a modified version of expm1, and are accurate to 3 ULP.
2022-12-30	pl/math: Add vector/SVE log2	Pierre Blanchard
	The new SVE implementation is a direct port of Neon log2, and is accurate to 2.58 ULPs. Update error threshold and comments for Neon log2 too, new approximate argmax but same threshold.
2022-12-30	pl/math: Add vector/SVE log2f	Pierre Blanchard
	New SVE routine is an SVE port of the Neon algorithm and is accurate to 2.48 ULPs.
2022-12-22	pl/math: Add scalar & vector/Neon atanh	Joe Ramsay
	New routines are both based on existing log1p routines. Scalar is accurate to 3 ULP, Neon to 3.5 ULP. Both set fp exceptions correctly regardless of build config.
2022-12-22	pl/math: Add scalar atan and set fenv in Neon atan	Joe Ramsay
	The simplest way to set fenv in Neon atan is by using a scalar fallback for under/overflow cases, however this routine did not have a scalar counterpart so we add a new one, based on the same algorithm and polynomial as the vector variants, and accurate to 2.5 ULP. This is now used as the fallback for all lanes, when any lane of the Neon input is special.
2022-12-22	pl/math: Fix fp exceptions in Neon sinhf and sinh	Joe Ramsay
	Both routines previously relied on the vector expm1(f) routine exposed by the library, which depended on WANT_SIMD_EXCEPT for its fenv behaviour, however both routines were expected to always trigger fp exceptions correctly. To remedy this, both routines now use an inlined helper for expm1 (reused from vector tanhf in the case of sinhf), and special-case small input as well as large when WANT_SIMD_EXCEPT is enabled.
2022-12-20	Correct exit code from runulp.sh	Joe Ramsay
	The pipe prevented FAILs and PASSs being counted properly - the while read loop has been rewritten without a pipe, as it was prior to the changes here. fenv checking is temporarily disabled in Neon sinh and sinhf, as they do not get it right. This will be re-enabled once they have been fixed.
2022-12-20	pl/math: Update ULP threshold for SVE erf	Pierre Blanchard
	Updated comment and test threshold.
2022-12-20	pl/math: Add scalar atanf and set fenv in Neon atanf	Joe Ramsay
	The simplest way to set fenv in Neon atanf is by using a scalar fallback for under/overflow cases, however this routine did not have a scalar counterpart so we add a new one, based on the same algorithm and polynomial as the vector variants, and accurate to 2.9 ULP. This is now used as the fallback for all lanes, when any lane of the Neon input is special.
2022-12-20	pl/math: Add scalar & vector/Neon cbrt	Joe Ramsay
	New routines use the same algorithm, with simplified argument reduction and recombination in the vector variant. Both are accurate to 2 ULP.
2022-12-19	pl/math: Update ULP threshold for Neon asinh	Joe Ramsay
	New max observed - updated filenames, comments and runulp threshold.
2022-12-19	pl/math: Replace WANT_ERRNO with WANT_SIMD_EXCEPT for Neon fenv	Joe Ramsay
	We were previously misusing the WANT_ERRNO build flag. This is now replaced everywhere appropriate with WANT_SIMD_EXCEPT. A small number of vector routines get fp exceptions right with no modification - the tests have been updated to track this.
2022-12-19	pl/math: Improve vector/Neon log2f	Pierre Blanchard
	A new implementation based on the same approach as Neon logf, that is accurate to 2.48 ULPs. Flags set correctly regardless of WANT_ERRNO.
2022-12-15	pl/math: Move test intervals to routine source files	Joe Ramsay
	To conclude the work on simplifying the runulp.sh script, a new macro has been introduced to specify the intervals in which a routine should be tested in the routine source. This is eventually consumed by runulp.sh.
2022-12-15	pl/math: Move fenv expectations out of runulp.sh	Joe Ramsay
	Introduces a new macro, similar to how ULP thresholds are now handled, that emits a list of routines which are expected to correctly trigger fenv exceptions, to be consumed by runulp.sh. All scalar routines are expected to do so. A small number of Neon routines are also expected to, dependent on WANT_ERRNO.
2022-12-15	pl/math: Move ULP limits to routine source files	Joe Ramsay
	Introduces a new set of macros and Make rules for mechanically generating a list of ULP limits for each routine, to be consumed by runulp.sh. This removes the need to maintain long lists of thresholds in runulp.sh.
2022-12-15	pl/math: Auto-generate mathbench and ulp headers	Joe Ramsay
	Instead of maintaining three separate lists of routines, which are cumbersome and prone to merge conflicts, we provide a new macro, PL_SIG, which by some preprocessor machinery outputs the lists in the required format (macro formats have been changed very slightly to make the generation simpler). Only routines with simple signatures are handled - binary functions still need mathbench wrappers defined manually. As well, routines with non-standard references (i.e. powi/powk) still need entries and wrappers manually defined.
2022-12-15	Merge "optimized-routines-mem: Share with bionic" am: 04d0bc466c am: edaf4d6088 am: 0bc581b0e4	Treehugger Robot
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2338622 Change-Id: Id7177c8b85844f9ce0b189a5b4a72cbc85073daf Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-12-15	Merge "optimized-routines-mem: Share with bionic" am: 04d0bc466c am: edaf4d6088	Treehugger Robot
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2338622 Change-Id: I1007ac06e0383b6b89cec2d0a3e73f001fb80a7c Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-12-15	Merge "optimized-routines-mem: Share with bionic" am: 04d0bc466c	Treehugger Robot
	Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2338622 Change-Id: Iac9d067e65a1452f7f3447795d80378af429e4a8 Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-12-15	Merge "optimized-routines-mem: Share with bionic" (refs: main-16k-with-phones)	Treehugger Robot

2022-12-13	pl/math: Set fenv flags in Neon log1p	Joe Ramsay
	New behaviour is hidden behind WANT_ERRNO config option.
2022-12-13	pl/math: Set fenv flags in Neon tanf	Joe Ramsay
	New behaviour is hidden behind WANT_ERRNO config option.
2022-12-13	pl/math: Set fenv flags in Neon log2f	Joe Ramsay
	Flags set correctly regardless of WANT_ERRNO.
2022-12-13	pl/math: Update ULP threshold for SVE atan2	Pierre Blanchard
	Test threshold fixed.
2022-12-13	pl/math: Set fenv flags in Neon log1pf	Joe Ramsay
	New behaviour is hidden behind WANT_ERRNO config option.
2022-12-09	optimized-routines-mem: Share with bionic	Pierre-Clément Tosi
	Migrate the libc dependencies for Rust to Bionic but keep using this library as a back-end for arm64 so update the visibility to point to libc instead of client code directly. Test: m pvmfw_bin && atest vmbase_example.integration_test Change-Id: I0e8b2ef862fcb47fcb66494aa180a7e66575a0a7
2022-12-09	pl/math: Add polynomial helpers	Joe Ramsay
	Add macros for simplifying polynomial evaluation using either Horner, pairwise Horner or Estrin. Several routines have been modified to use the new helpers. Readability is improved slightly, and we expect that this will make prototyping new routines simpler.
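	To make the difference between two of the evaluation schemes named above concrete, here is a minimal sketch written as plain C functions. The commit itself adds macros; the function names `horner` and `estrin_3` are hypothetical and do not correspond to the library's helpers. Horner forms one long chain of dependent multiply-adds, while Estrin splits the polynomial into independent halves that can be computed in parallel.

```c
#include <stddef.h>

/* Horner: c[0] + x*(c[1] + x*(c[2] + ...)).
   Each step depends on the previous one. */
double horner(const double *c, size_t n, double x)
{
  double acc = 0.0;
  for (size_t i = n; i-- > 0;)
    acc = acc * x + c[i];
  return acc;
}

/* Estrin for a degree-3 polynomial:
   (c0 + c1*x) + x^2 * (c2 + c3*x).
   The two halves do not depend on each other. */
double estrin_3(const double c[4], double x)
{
  double x2 = x * x;
  double lo = c[0] + c[1] * x;
  double hi = c[2] + c[3] * x;
  return lo + x2 * hi;
}
```

	Both functions compute the same polynomial in exact arithmetic; in floating point the results can differ slightly because the operations are ordered differently.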