Age | Commit message | Author |
|
Scripted copyright year updates based on git committer date.
|
|
In most cases, we mask lanes which should not trigger exceptions with
a neutral value, then let the existing special-case handler fix them
up later. For exp and exp2 we replace the more complex special-case
handler with a simple scalar fallback. All new behaviour is tested in
runulp.sh, with a new option to pass -f to the run line. We also
extend the fenv testing to Neon log and logf, which already triggered
exceptions correctly. New behaviour is mostly hidden behind a new
config setting, WANT_SIMD_EXCEPT.
|
|
This makes it easier for users to toggle errno off and on, and also
makes it possible to toggle the behaviour of our tests depending on
whether we expect errno to be set properly or not.
|
|
This is required when running a `make check`, in order to avoid
running ulp tests on SVE routines when SVE is disabled.
Keep the definition of cflags for SVE in the config file to
allow user control over `-march`.
|
|
Enable SVE compilation by uncommenting the following line in config.mk:
math-cflags += -march=armv8.2-a+sve -DWANT_SVE_MATH=1
|
|
- pl/ is built from top-level Makefile by adding pl to SUBS
- PLSUBS lists all pl/ subdirectories that can be built,
it only contains math for now. Please modify this list in
the top-level config.mk.
- pl libraries and infrastructure are built in build/pl/
- As a result math/ and pl/math generate separate test and bench
binaries.
- Use infrastructure provided in math/test to test and profile
pl/math routines. The build system ensures the appropriate
header files are first copied to build/pl/include/test to define
wrappers and entries in ulp and mathbench.
- Copied scalar erff from math/ to pl/math/ to show build
system is functional.
- pl mathlib libraries are built separately to the main/portable
mathlib libraries and installed alongside.
|
|
The outgoing license was MIT only. The new dual license allows
using the code under Apache-2.0 WITH LLVM-exception license too.
|
|
Scripted copyright year updates based on git committer date.
|
|
Set tags for every test case so that boundaries are as narrow as
possible. There is no handling of tag faults, so the test will
crash if there is an MTE problem.
The implementations that are not compatible are excluded, including
the standard symbols that may come from an MTE-incompatible libc.
|
|
GNU Property Notes are only supported in recent tooling and older
tools may warn about them, so it makes sense to remove these notes
on a system where BTI is not supported anyway.
The actual BTI instructions should be kept in place to avoid
disturbing code layout.
-DWANT_GNU_PROPERTY=0 removes the .note.gnu.property section
from assembly files (ideally it would be based on the compiler
default setting, but there is no feature test macro for BTI and
PAC-RET).
|
|
Add scalar and NEON ones' complement checksumming implementations for
AArch64 and Armv7-A.
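
For readers unfamiliar with the technique, a minimal scalar sketch of a ones' complement checksum in the style of RFC 1071 follows. The name `checksum_sketch` and its signature are assumptions made for illustration and are not the interface added by this commit; the sketch treats the buffer as big-endian 16-bit words and returns the complemented, folded sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the buffer as big-endian 16-bit words with end-around carry,
   then return the ones' complement of the folded sum. */
uint16_t checksum_sketch(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint64_t sum = 0;
    size_t i = 0;

    for (; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero low byte. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Accumulating into a wide integer and folding once at the end is what makes the algorithm easy to vectorise: the per-word additions are independent and the carries can be deferred.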
|
|
Including multiple asm source files into a single top level file
can cause problems. This can be fixed by having one top level
file per target specific source file, but for maintenance and
clarity it's better to use the sub directory structure for selecting
which files to build.
This requires a new ARCH make variable setting in config.mk which
must be consistent with the target of CC.
Note: the __ARM_FEATURE_SVE checks are moved into the SVE asm code.
This is not entirely right: the feature test macro is for ACLE, not
asm support, but this patch is not supposed to change the produced
binaries and some toolchains (e.g. older clang) do not support SVE
instructions. The intention is to remove these checks eventually
and always build all asm code and only support new toolchains (the
test code will only test the SVE variants if there is target support
for it though).
|
|
Reorganise the makefiles so subprojects can be used and maintained more
independently. The single toplevel Makefile and config.mk are kept.
Subproject Dir.mk is expected to provide all-X, check-X, clean-X and
install-X targets where X is the subproject name and it may use generic
make variables set in config.mk, like CFLAGS_ALL and CC, or subproject
specific variables like X-cflags.
|
|
When defined as 0 the vector math code is not built and not tested.
|
|
Implicit function declaration is always a bug, but compilers don't
turn it into an error by default for historical reasons, so add it
to the default config.
|
|
fenv support is not reliable in clang so provide a mechanism to
disable fenv status checks and only check the result values.
|
|
Users may want different CFLAGS for math and string subprojects, expose
a mechanism for this in config.mk.
|
|
Allows optimizing the code in shared libraries differently.
Has significant effect on literal loads in simd code.
|
|
math/single contained code for systems without a double precision FPU.
rem_pio2 is not used currently and will likely be designed
differently when double precision trigonometric functions are added.
|
|
Check ULP error by random sampling and comparing against a higher
precision implementation.
This is similar to the randomized tests in the mathtest code, but it
runs on the target only and is much faster to allow exhaustive ULP
checks for single precision functions. It also supports non-nearest
rounding modes.
The ULP error is reported in an unconventional way: instead of the
difference between the observed rounded result and accurate result, the
minimum error is reported that makes the accurate result round to the
observed result. This is more useful for comparing errors across
different rounding modes. In nearest-rounding mode usually 0.5 has to
be added to the reported error to get the conventional ULP error.
The code optionally depends on mpfr. On targets where double has the
same format as long double, mpfr is required for testing double
precision functions. By default there is no dependency on mpfr on the
target, to use mpfr add -DUSE_MPFR to the CFLAGS and -lmpfr to LDLIBS.
ucheck is a new make target for running ulp error checks.
Typical usage and output of the new ulp tool:
$ build/bin/ulp -e .001 exp 1.0 2.0 12345
exp(0x1.79ef3658a63c9p+0) got 0x1.181caa32757a7p+2 want 0x1.181caa32757a6p+2 +0.499708 ulp err 0.000291756
exp(0x1.9c8a65340f80cp+0) got 0x1.40a8032e5f576p+2 want 0x1.40a8032e5f575p+2 +0.498903 ulp err 0.00109668
FAIL exp in [0x1p+0;0x1p+1] round n errlim 0.001 maxerr 0.00109668 +0.5 cnt 12345 cnt1 6 0.0486027% cnt2 0 0% cntfail 1 0.00810045%
Floating-point exceptions are not guaranteed to be reported accurately
and can be turned off by -f.
The implementation is generic over the argument types which complicates
the code, but at least the difficult inner loop logic is not repeated
many times this way.
To add support for a new function foo, the fun array needs to be updated
with an entry for foo. (This usually requires the functions foo, fool
and mpfr_foo to be defined.)
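
The error convention described in this message can be sketched for a single-precision result checked against a double-precision reference in nearest rounding mode. The helper names below (`ulp_of`, `ulp_err_conventional`, `ulp_err_reported`) are hypothetical and are not taken from the ulp tool, which is generic over types and rounding modes.

```c
#include <float.h>
#include <math.h>

/* Size of one float ulp near the reference value `want`. */
static double ulp_of(double want)
{
    int e;
    frexp(want, &e);            /* want = m * 2^e with 0.5 <= |m| < 1 */
    if (e < FLT_MIN_EXP)
        e = FLT_MIN_EXP;        /* subnormal range: spacing stays constant */
    return ldexp(1.0, e - FLT_MANT_DIG);
}

/* Conventional error: distance between observed and accurate result, in ulps. */
double ulp_err_conventional(float got, double want)
{
    return fabs((double)got - want) / ulp_of(want);
}

/* Reported error (nearest rounding only): the minimum perturbation of the
   accurate result that would make it round to `got`. A correctly rounded
   result is within 0.5 ulp, so it reports 0. */
double ulp_err_reported(float got, double want)
{
    double e = ulp_err_conventional(got, want) - 0.5;
    return e > 0 ? e : 0;
}
```

This matches the sample output above, where a result that is 0.500292 ulp away from the accurate value is reported as `ulp err 0.000291756`.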
|
|
|
|
Don't use target flags when building host tools. The example config.mk.dist
is updated accordingly.
|
|
Old code used pointer based type punning which is considered an aliasing
violation by most compilers, while union based type punning is intended
to work (both by ISO C and most compilers).
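
A minimal sketch of the union-based approach for single precision, assuming a 32-bit `float`. The helper names `asuint` and `asfloat` are used here for illustration; the pointer cast shown in the comment is the pattern being replaced.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Aliasing violation, avoid:  uint32_t i = *(uint32_t *)&f;  */

/* Reinterpret the bits of a float as an unsigned integer. */
static inline uint32_t asuint(float f)
{
    union { float f; uint32_t i; } u = { f };
    return u.i;
}

/* Reinterpret the bits of an unsigned integer as a float. */
static inline float asfloat(uint32_t i)
{
    union { uint32_t i; float f; } u = { i };
    return u.f;
}
```

Compilers typically turn these helpers into a single register move, so the safer form costs nothing at run time.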
|
|
Ideally a single instruction is used for rounding and conversion that
rounds to the nearest integer independently of the current rounding mode;
otherwise the argument reduction in expf does not reduce into the
optimal range in non-nearest rounding modes.
AArch64 has the necessary instructions, both for rounding ties away from
zero (round function in C) and rounding ties to even (roundeven function
in TS 18661). Originally the latter was used, but the former can be
expressed in portable C code without relying on ACLE intrinsics.
A bit of complication is that round and lround are not always inlined as
a single instruction:
- gcc inlines lround with -fno-math-errno, but fails to inline
(long)round as a single instruction (at least up to gcc-8).
- clang inlines (long)round, but not lround.
Portable code is still better than relying on arm_neon.h, so use the
round function when it works and keep the shift based rounding as
fallback (which only gives precise results in nearest rounding mode,
but is the optimal implementation for most targets).
HAVE_FAST_ROUND and HAVE_FAST_LROUND are set based on preprocessor
heuristics (user can override them with CFLAGS) for now, once there is
a configure script we can detect these at configure time.
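
The two strategies can be sketched as follows. `round_sketch` is a hypothetical helper, the default of 0 for `HAVE_FAST_ROUND` is an assumption made so the sketch compiles standalone, and the shift-based branch assumes `|x|` is well below 2^51 and that the compiler does not reassociate floating-point arithmetic (no `-ffast-math`).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

#ifndef HAVE_FAST_ROUND
#define HAVE_FAST_ROUND 0 /* assumption for this sketch: use the fallback */
#endif

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

static inline uint64_t asuint64(double f)
{
    union { double f; uint64_t i; } u = { f };
    return u.i;
}

/* Round x to an integer, returning it as a double and storing it in *ki. */
static inline double round_sketch(double x, int64_t *ki)
{
#if HAVE_FAST_ROUND
    /* Ties away from zero, independent of the current rounding mode. */
    double kd = round(x);
    *ki = (int64_t)kd;
    return kd;
#else
    /* Adding 1.5 * 2^52 pushes the fraction bits out of the mantissa, so
       the addition itself rounds x in the current rounding mode. The low
       mantissa bits of the sum then hold the integer result. */
    const double shift = 0x1.8p52;
    double kd = x + shift;
    *ki = (int64_t)(asuint64(kd) - asuint64(shift));
    kd -= shift;
    return kd;
#endif
}
```

The two branches agree except on ties and in non-nearest rounding modes, which is the precision trade-off the message describes for the fallback.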
|
|
Some parts of the test code only worked on arm and aarch64 targets.
This patch tries to fix most portability issues: fix fenv api usage,
remove non-portable endianness check, rename identifiers colliding
with reserved POSIX symbol names.
Fix some uninitialized warnings too.
|
|
Add a single makefile that should rarely need modification
with regular source code changes.
The usual configure step for build environment detection is done
manually by editing config.mk. This is expected to involve only simple
make variable changes (mostly CFLAGS). Later, a simple configure script
can be added to generate config.mk if necessary.
Update the README.
|