Age | Commit message | Author |
|
Add macros for simplifying polynomial evaluation using either Horner,
pairwise Horner or Estrin. Several routines have been modified to use
the new helpers. Readability is improved slightly, and we expect that
this will make prototyping new routines simpler.
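
For illustration, the three evaluation schemes can be sketched as plain C functions. This is a minimal sketch rather than the macros added by the commit: the names horner, estrin_3 and pairwise_horner_5 are hypothetical, and the coefficient layout (c[0] is the constant term) is an assumption.

```c
#include <stddef.h>

/* Horner: c[0] + x*(c[1] + x*(c[2] + ...)).
   One dependent multiply-add per coefficient; n must be at least 1. */
static double horner(const double *c, size_t n, double x)
{
  double y = c[n - 1];
  for (size_t i = n - 1; i-- > 0;)
    y = y * x + c[i];
  return y;
}

/* Estrin, degree 3: (c0 + c1*x) + x^2 * (c2 + c3*x).
   The two inner terms are independent and can be computed in parallel. */
static double estrin_3(const double *c, double x)
{
  double x2 = x * x;
  double p01 = c[0] + c[1] * x;
  double p23 = c[2] + c[3] * x;
  return p01 + x2 * p23;
}

/* Pairwise Horner, degree 5: Horner in x^2 over the pairs (c[2k] + c[2k+1]*x). */
static double pairwise_horner_5(const double *c, double x)
{
  double x2 = x * x;
  double p01 = c[0] + c[1] * x;
  double p23 = c[2] + c[3] * x;
  double p45 = c[4] + c[5] * x;
  return (p45 * x2 + p23) * x2 + p01;
}
```

Horner has the longest dependency chain, while Estrin and pairwise Horner expose independent sub-expressions at the cost of an extra multiplication for x^2.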
|
|
Small simplification - pl routines do not support different rounding
modes, so there is no need to support them in runulp.sh. As a result
we can also remove Ldir.
|
|
Special lanes were not being properly masked when a lane was
tiny. This is now fixed.
|
|
Fix a bug that could produce arbitrary results in the boring domain
by saturating the index at an appropriate value.
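
A minimal sketch of index saturation, assuming the index is used for a table lookup. The table contents, TABLE_SIZE and lookup_saturated are hypothetical and not taken from the routine.

```c
#include <stddef.h>

#define TABLE_SIZE 4 /* hypothetical table length */

/* Hypothetical lookup table; the real routine's data is not shown here. */
static const float table[TABLE_SIZE] = { 1.0f, 0.5f, 0.25f, 0.125f };

/* Clamp the index to the last valid entry so that out-of-range input
   cannot read past the end of the table. */
static float lookup_saturated(size_t index)
{
  if (index > TABLE_SIZE - 1)
    index = TABLE_SIZE - 1;
  return table[index];
}
```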
|
|
In most cases, we mask lanes which should not trigger exceptions with
a neutral value, then let the existing special-case handler fix them
up later. For exp and exp2 we replace the more complex special-case
handler with a simple scalar fallback. All new behaviour is tested in
runulp.sh, with a new option to pass -f to the run line. We also
extend the fenv testing to Neon log and logf, which already triggered
exceptions correctly. New behaviour is mostly hidden behind a new
config setting, WANT_SIMD_EXCEPT.
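
The masking idea can be sketched with a toy function in scalar C, using an array to stand in for vector lanes. Everything here is an assumption for illustration: the function x / (1 + x*x), the LANES and BIG constants, and the helper names are hypothetical and are not the Neon routines themselves.

```c
#include <stddef.h>

#define LANES 4   /* hypothetical vector width */
#define BIG 1e18f /* hypothetical threshold: x*x overflows float near 1.8e19 */

/* Scalar fallback for special lanes: for huge |x|, x / (1 + x*x) ~= 1 / x,
   which can be computed without overflow. */
static float special_case(float x)
{
  return 1.0f / x;
}

static void toy_routine(const float *in, float *out)
{
  int special[LANES];
  float masked[LANES];

  /* Replace special lanes with a neutral value so that the fast path
     cannot raise a spurious overflow exception for them. */
  for (size_t i = 0; i < LANES; i++)
    {
      special[i] = in[i] > BIG || in[i] < -BIG;
      masked[i] = special[i] ? 0.0f : in[i];
    }

  /* Fast path, run on every lane. */
  for (size_t i = 0; i < LANES; i++)
    out[i] = masked[i] / (1.0f + masked[i] * masked[i]);

  /* Fix up the special lanes afterwards, using the original input. */
  for (size_t i = 0; i < LANES; i++)
    if (special[i])
      out[i] = special_case(in[i]);
}
```

The fast path only ever sees the neutral value in special lanes, so any exception raised for those lanes comes from the fallback alone.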
|
|
The branch out of the core memchr loop to label 60 jumps over the
popping of registers r4-r7. The restoration of the cfi state at 60 is
adjusted to reflect this fact, avoiding restoring a state where r4-r7
have already been popped off the stack.
Built w/ arm-none-linux-gnueabihf, ran make check-string w/ qemu-arm-static.
|
|
Move the code fragment corresponding to L(fastpath_exit) to after
function entry so that a .cfi_remember_state/.cfi_restore_state pair is
not needed prior to strcmp start.
The resulting reshuffle of code cleans up the entry part, fixing the
.size directive calculation, which at present calculates the function
size based on the address of __strcmp_arm and not L(strcmp_start_addr).
|
|
The routine no longer relies on vector log1pf, as that has to become
more complex to deal with fenv itself. Instead we re-use a log1pf helper
from Neon atanhf, which does no special-case handling and leaves it all
up to the main routine. We now just fall back to the scalar routine for
special-case handling. This uncovered a mistake in asinhf's handling of
NaNs, which has been fixed.
|
|
The ldexp shortcut was left-shifting a signed value. We now bias the
exponent first, which allows the shift to be done on an unsigned value.
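
A minimal sketch of the idea, assuming an IEEE 754 double and a normal result. The function name pow2_int and the use of double precision are assumptions, not taken from the routine.

```c
#include <stdint.h>
#include <string.h>

/* Return 2^k as a double, for k in [-1022, 1023] (normal results only).
   Adding the bias first makes the value non-negative, so the shift is
   performed on an unsigned type and is well defined. */
static double pow2_int(int k)
{
  uint64_t biased = (uint64_t) (k + 1023); /* in [1, 2046] */
  uint64_t bits = biased << 52;            /* move into the exponent field */
  double result;
  memcpy(&result, &bits, sizeof result);   /* bit cast without aliasing issues */
  return result;
}
```

Shifting a negative signed value left, as in (int64_t) k << 52 for negative k, is undefined behaviour in C, which is what biasing first avoids.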
|
|
Both routines use simplified inline versions of expm1f, and are
accurate to 2.6 ULP.
|
|
New routine uses two separate algorithms for input greater and less
than 1 (similar to the scalar routine). It is accurate to 2.5 ULP.
|
|
New max observed for both Neon and SVE.
|
|
Both routines use the same algorithm - one Newton iteration with the
initial guess obtained by a low-order polynomial. Scalar is used as a
fallback for subnormal and special cases for the vector routine, which
allows vastly simplified argument reduction and reassembly. Both
routines are accurate to 1.5 ULP.
|
|
Both routines are based on a simplified version of log1pf, and are
accurate to 3.1 ULP. Also enabled -c flag from runulp.sh - we need
this for atanhf so that we can set the control lane to something other
than 1, since atanh(1) is infinite.
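
Why 1 is unsuitable as a control value can be seen from one standard way of writing atanh in terms of log1p. This identity is given for illustration only; the exact reduction used by the routines is not stated here.

```latex
\operatorname{atanh}(x) = \tfrac{1}{2}\,\operatorname{log1p}\!\left(\frac{2x}{1-x}\right), \qquad |x| < 1
```

As x approaches 1 from below, the argument 2x/(1-x) grows without bound, so atanh(1) is infinite.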
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572
Change-Id: I026f09de07f4abc5fb111ba3493bbee9bfc58ab7
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572
Change-Id: Ifdb9788e5c6758265ca04f50530c9656883bc222
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304572
Change-Id: I1c3f88174e00550ab598f22638364dde4fd79c96
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
am: 722a6d8de7
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835
Change-Id: Ia96ff6bbc0b2fb7c7478573c9585800fbb26a47e
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835
Change-Id: I1cc3d0c50479d1f86b761ff7f09ebb99390cdc1c
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2304835
Change-Id: Iafefe4c746d77725505fe364a96011b1c1c7bf7e
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Test: treehugger
Change-Id: Ibae0859e9683d10ba53113baeba26f720d44d674
|
|
The .fnstart/.fnend directives can be inlined now that asmdefs.h is
arm specific.
|
|
This is preprocessed asm code, so /**/ style comments are most
appropriate.
|
|
Currently this is not expected to change behaviour, but if global
directives are added in asmdefs.h (like .thumb) those should be in
all asm files in case the link ABI is affected.
|
|
The definitions in this header are necessarily target specific, so
better to have a separate version in each target directory.
|
|
The asmdefs.h ifdef logic was wrong: arm-only macro definitions were
outside of defined(__arm__).
Added some ifdef indentation to make the code more readable.
|
|
fv and dv are only declared under __aarch64__ - for other targets the
new -c option should be disabled.
|
|
New routines are based on double-precision exp, both accurate to 2 ULP.
|
|
New routines are based on the single-precision versions and are
accurate to 3 ULP.
|
|
Test: treehugger
Change-Id: I545117b6b8283f0b5eca2f4591579393720c7960
|
|
d85c7f4e12 am: dfd77cd57c
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393
Change-Id: I1b1d024d51f30ad9f686e7581eb15ce76016dbac
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
d85c7f4e12
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393
Change-Id: Ifeae3e8c493fd232b15fbe2127c749faef337b1d
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2236393
Change-Id: Iaf6a1e6449a778895618ba47eabed3fde7087fe5
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
|
|
am: 892ed4034a
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998
Change-Id: I3c887444b18412dd598a7f5ca5edacfb040e69a6
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998
Change-Id: I7760d1660e6a5f0d966f0d579986c11337210dbe
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Original change: https://android-review.googlesource.com/c/platform/external/arm-optimized-routines/+/2302998
Change-Id: Ia1b150c236a53799d031e1e45dc9c7dc001b748e
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
|
|
Reduce the order of the polynomial used in Neon log2 by one (from 7 to
6). In order to calculate the new coefficients required we rescale the
coefficients from log_data.c by log2(e) in extended precision and
round back.
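
The rescaling step can be sketched as follows. This is a hedged illustration: rescale_coeffs is a hypothetical name, long double stands in for "extended precision" (on some targets it is no wider than double), and no actual coefficients from log_data.c are reproduced.

```c
#include <stddef.h>

/* Scale natural-log polynomial coefficients by log2(e), multiplying in
   extended precision and rounding the product back to double once. */
static void rescale_coeffs(const double *ln_coeffs, double *log2_coeffs,
                           size_t n)
{
  const long double log2e = 1.442695040888963407359924681001892137L;
  for (size_t i = 0; i < n; i++)
    log2_coeffs[i] = (double) ((long double) ln_coeffs[i] * log2e);
}
```

Doing the multiplication in the wider type avoids first rounding log2(e) to double and then rounding the product a second time.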
The maximum observed error is unchanged (2.59 ULPs) but the point at
which it is observed has changed slightly.
|
|
argf and argd have been designed such that non-special input is
tested, optionally followed by a vector with one special lane. To be
able to test that vector functions have correct behaviour w.r.t. fenv
exceptions, we need to be able to choose a different value for the
last lane, as using 1 leads to false negatives when testing a function
for which 1 is a special value. We add an option, -c, for the user to
provide a different control value.
|
|
The file is a build configuration and shouldn't be executable.
Test: TH # No change intended
Change-Id: I76ab86cd2971160b7f376bdda0d59da36c50a59b
|
|
Add more libc helper functions, to be used by baremetal Rust targets.
Bug: 255521657
Test: atest vmbase_example.integration_test # used by aosp/2138640
Change-Id: I6ec50bc37d0851c5fd47902f34a25b6178e36ed3
|
|
There is a collision for math-tests and math-rtests between math/ and
pl/math, which can lead to failures if both are run concurrently. We
rename the pl-specific lists to avoid this.
|
|
Extra special-case check.
|
|
These were broken in the previous patch, now fixed.
|
|
Reduces the amount of boilerplate developers need to write for new
routines.
|
|
New routine is a vector port of the scalar algorithm, with fallback to
the scalar variant for large and special input. This enables us to
simplify elements of the algorithm which were necessary for large
input. It also means that, as long as we fall back to the scalar for
tiny input as well (dependent on the value of WANT_ERRNO), the routine
sets fenv flags correctly.
|
|
New routine uses the same algorithm as the single-precision routine,
and is accurate to 2.5 ULP.
|
|
New routines use single-precision exp, which has been copied from
math/. Scalar is accurate to 1.9 ULP, Neon to 2.4 ULP.
Also use the new expf helper in scalar sinhf.
|
|
We want these tests to pass regardless of whether the user has enabled
or disabled WANT_ERRNO - this is now supported by a WANT_ERRNO config
option, which will be added to config.mk.dist in a follow-on.
|
|
Also skips the test line when D is not "fenv" or empty, i.e. when it
is the SVE if statement. This used to work but was broken by adding
the D variable, so the tests did not run properly when WANT_SVE_MATH
was enabled. Now fixed.
|