Age | Commit message | Author |
|
Scripted copyright year updates based on git committer date.
|
|
In most cases, we mask lanes which should not trigger exceptions with
a neutral value, then let the existing special-case handler fix them
up later. For exp and exp2 we replace the more complex special-case
handler with a simple scalar fallback. All new behaviour is tested in
runulp.sh, with a new option to pass -f to the run line. We also
extend the fenv testing to Neon log and logf, which already triggered
exceptions correctly. New behaviour is mostly hidden behind a new
config setting, WANT_SIMD_EXCEPT.
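
The mask-then-fix-up pattern described above can be sketched in portable C, with a plain 4-element array standing in for a Neon vector. Everything here is illustrative and not taken from the library: the function x*x / (1 + x*x), the names v_ratio and special_case, and the threshold BIG are all assumptions chosen so that the fast path would overflow spuriously on large inputs.

```c
#include <stddef.h>

#define LANES 4
/* Assumed threshold: for |x| <= BIG, x*x cannot overflow in float. */
#define BIG 0x1p63f

/* Hypothetical scalar fallback, called for one special lane at a time. */
static float special_case(float x)
{
  (void) x;
  /* For any |x| > BIG, x*x / (1 + x*x) rounds to 1. */
  return 1.0f;
}

/* Emulated vector routine computing x*x / (1 + x*x) per lane. */
static void v_ratio(const float x[LANES], float y[LANES])
{
  int special[LANES];
  float masked[LANES];

  /* Replace lanes that would overflow with a neutral input (0). */
  for (size_t i = 0; i < LANES; i++) {
    special[i] = x[i] > BIG || x[i] < -BIG;
    masked[i] = special[i] ? 0.0f : x[i];
  }

  /* Fast path runs on all lanes; masked lanes cannot raise exceptions. */
  for (size_t i = 0; i < LANES; i++) {
    float x2 = masked[i] * masked[i];
    y[i] = x2 / (1.0f + x2);
  }

  /* Fix up the special lanes afterwards, one scalar call each. */
  for (size_t i = 0; i < LANES; i++)
    if (special[i])
      y[i] = special_case(x[i]);
}
```

Calling v_ratio on {1.0f, 0x1p100f, -0x1p100f, 0.0f} never evaluates x*x for the two huge lanes, so no overflow is raised, while the other lanes take the fast path unchanged.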
|
|
The outgoing license was MIT only. The new dual license also allows
the code to be used under the Apache-2.0 WITH LLVM-exception license.
|
|
Same design as in expf. Worst-case error of __v_exp2f and __v_exp2f_1u
is 1.96 and 0.88 ulp respectively.
It is not clear whether the round/convert instructions or the +- Shift
approach is better. For expf the latter and for exp2f the former seems
consistently faster, but both options are kept in the code for now.
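
The two rounding options can be compared with a scalar sketch. This is an illustration only: the helper names round_convert and round_shift and the constant SHIFT are assumptions, and the convert variant is emulated with a C cast rather than the actual round/convert instructions. The shift variant relies on the default round-to-nearest mode and is only valid for |x| < 2^22.

```c
#include <stdint.h>

/* 1.5 * 2^23: adding it pushes the fraction bits of x out of the mantissa. */
#define SHIFT 0x1.8p23f

/* Round/convert style: round to nearest, ties away from zero. */
static int32_t round_convert(float x)
{
  return (int32_t) (x < 0.0f ? x - 0.5f : x + 0.5f);
}

/* +- Shift style: the addition itself rounds x to an integer
   (ties to even); subtracting SHIFT recovers that integer. */
static int32_t round_shift(float x)
{
  /* volatile stops the compiler from folding the add and subtract. */
  volatile float t = x + SHIFT;
  return (int32_t) (t - SHIFT);
}
```

The two helpers agree everywhere except on exact ties: round_convert(2.5f) gives 3, while round_shift(2.5f) gives 2.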
|
|
Vector math routines are added to the same libmathlib library as the
scalar ones. The difficulty is that they are not always available: the
external ABI depends on the compiler version used for the build. Currently
only aarch64 AdvSIMD is supported. There are 4 new sets of symbols:
__s_foo is a scalar function with identical result to the vector one,
__v_foo is a vector function using the base PCS,
__vn_foo uses the vector PCS and
_ZGV*_foo is the vector ABI symbol alias of __vn_foo
for a scalar math function foo.
The test and benchmark code got extended to handle vector functions.
Vector functions aim for < 5 ulp worst case error, only support nearest
rounding mode and don't support floating-point exceptions. Vector
functions may call scalar functions to handle special cases, but for a
single value they should return the same result independently of values
in other vector lanes or the position of the value in the vector.
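
The lane-independence requirement can be checked mechanically. The sketch below is a hypothetical test helper, not part of the library's test code: v_foo is a stand-in for a 4-lane vector routine emulated with arrays, and lane_independent is an assumed name.

```c
#include <string.h>

#define LANES 4

/* Hypothetical stand-in for a vector routine, emulated lane by lane. */
static void v_foo(const float x[LANES], float y[LANES])
{
  for (int i = 0; i < LANES; i++)
    y[i] = x[i] * x[i] + 1.0f;
}

/* Returns 1 if the result for x is bitwise identical in every lane
   position, whatever value 'filler' occupies the other lanes. */
static int lane_independent(float x, float filler)
{
  float ref_in[LANES], ref_out[LANES];

  /* Reference result: x in all lanes. */
  for (int i = 0; i < LANES; i++)
    ref_in[i] = x;
  v_foo(ref_in, ref_out);

  for (int pos = 0; pos < LANES; pos++) {
    float in[LANES], out[LANES];
    for (int i = 0; i < LANES; i++)
      in[i] = (i == pos) ? x : filler;
    v_foo(in, out);
    /* Compare bit patterns so that -0.0 and 0.0 are told apart. */
    if (memcmp(&out[pos], &ref_out[0], sizeof(float)) != 0)
      return 0;
  }
  return 1;
}
```

Using a filler that forces a special case in the real routine (for example a huge value or a NaN) makes the check more demanding, since that is where a fallback path could disturb neighbouring lanes.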
The __v_expf and __v_expf_1u polynomials were produced by searching the
coefficient space with some heuristics and ideas from
https://arxiv.org/abs/1508.03211
Their worst case error is 1.95 and 0.866 ulp respectively.
The exp polynomial was produced by Sollya. The exp routine uses a
128-element (1KB) lookup table and has 2.38 ulp worst-case error.
|