external/arm-optimized-routines.git

Age	Commit message (Collapse)	Author
2020-11-13	math: fix spurious underflow in erff and erf	Szabolcs Nagy
	The code relied on the final x + cx to be done via an fma, otherwise the intermediate cx could underflow for tiny (almost subnormal) x. Use explicit fmaf like elsewhere (this code is not expected to be fast when fma is not inlined, but at least it should be correct).
2020-11-13	math: fix erf tests in directed rounding modes	Szabolcs Nagy
	erf has larger than 1 ULP errors in directed rounding modes, increase the error threshold to 1.4 ULP in the test script.
2020-11-13	math: add scalar erf	Pierre Blanchard
	Only tested in round-to-nearest mode. The expected worst case error is 1.01 ULP near x=1.25. Benchmarked over random x in [-6,6] and can increase performance by > 2x (> 3.5x for throughput) on big ooo cores compared to the implementation in glibc 2.28. Includes data for erfc too, but this patch only adds erf.
2020-11-05	math: Fix copyright header in erff.tst	Szabolcs Nagy
	This was incorrect in the previous commit.
2020-10-29	math: add scalar erff.	Pierre Blanchard
	In round-to-nearest mode the maximum error is 1.09 ULP. Compared to glibc-2.28 erff: throughput is about 2.2x better, latency is about 1.5x better on some AArch64 cores (on random input in [-4,4]). There are further optimization and quality improvement opportunities.
2020-02-27	math: Improve comments, disable errno handling by default	Wilco Dijkstra
	Improve comments in math_config.h. Set WANT_ERRNO to 0 by default.
2020-01-14	math: Add more ulp tests	Szabolcs Nagy
	Some functions were not tested with the statistical ulp error check tool, this commit adds tests for the current math symbols.
2020-01-14	math: add vector pow	Szabolcs Nagy
	This implementation is a wrapper around the scalar pow with appropriate call abi. As such it is not expected to be faster than scalar calls, the new double prec vector pow symbols are provided for completeness.
2020-01-09	math: fix spurious overflow in pow with clang	Szabolcs Nagy
	clang does not support c99 fenv_access and may move fp operations out of conditional blocks causing unconditional fenv side-effects. Here if (cond) ix = f (x * 0x1p52); was transformed to ix_ = f (x * 0x1p52); ix = cond ? ix_ : ix; where x can be a huge negative value so the mul overflows. The added barrier should prevent such transformation by significantly increasing the cost of doing the mul unconditionally. Found by enh from google on android arm and aarch64 targets. Fixes github issue #16.
2019-11-22	Build system refactoring	Szabolcs Nagy
	Reorganise the makefiles so subprojects can be more separately used and maintained. Still kept the single toplevel Makefile and config.mk. Subproject Dir.mk is expected to provide all-X, check-X, clean-X and install-X targets where X is the subproject name and it may use generic make variables set in config.mk, like CFLAGS_ALL and CC, or subproject specific variables like X-cflags.
2019-11-06	math: add WANT_VMATH feature macro	Szabolcs Nagy
	When defined as 0 the vector math code is not built and not tested.
2019-11-06	math: allow errno setting even if !(math_errhandling&MATH_ERRNO)	Szabolcs Nagy
	The math_errhandling checks are incorrect in general: it is defined by the libc math.h which is not appropriate for optimized-routines provided functions that we are testing. However even if we want to test a libc implementation, ISO C allows the setting of errno even if !(math_errhandling&MATH_ERRNO), so relax the checks.
2019-11-06	math: fix unused function warnings in mathbench.c	Szabolcs Nagy
	Vector functions are only used on aarch64, so only define them there. math/test/mathbench.c:95:1: warning: '__v_dummyf' defined but not used [-Wunused-function]
2019-11-06	math: fix missing attributes warnings	Szabolcs Nagy
	gcc-9 started warning if alias symbols have different attributes: math/expf.c: At top level: math/expf.c:89:21: warning: '__expf_finite' specifies less restrictive attributes than its target 'expf': 'leaf', 'nothrow', 'pure' [-Wmissing-attributes] so copy the attributes when creating the aliases.
2019-11-06	math: fix unused variable warnings	Szabolcs Nagy
	Compilers (incorrectly) warn about unused volatile variables: math/math_config.h: In function 'force_eval_float': math/math_config.h:188:18: warning: unused variable 'y' [-Wunused-variable] silence them.
2019-11-06	math: move definitions in internal header	Szabolcs Nagy
	Compiler checks and realated macros need to be done earlier so they are usable for the static inline functions.
2019-11-05	Add vector exp2f	Szabolcs Nagy
	Same design as in expf. Worst-case error of __v_exp2f and __v_exp2f_1u is 1.96 and 0.88 ulp respectively. It is not clear if round/convert instructions are better or +- Shift. For expf the latter, for exp2f the former seems more consistently faster, but both options are kept in the code for now.
2019-11-05	math: fix runulp.sh	Szabolcs Nagy
	Use heredoc instead of pipe when iterating over test cases to avoid creating a subshell that would break the PASS/FAIL accounting.
2019-10-17	fix the build of s_powf.o on non-aarch64 targets	Szabolcs Nagy
	This is a simple fix to the v_powf code, but in general the vector code may not work on arbitrary targets even when compiled with scalar types (s_powf.c), so in the long term may be all s_* should be disabled for non-aarch64 targets (requires test system and header changes too).
2019-10-14	Add vector log	Szabolcs Nagy
	Worst-case error is 1.67 ulp, the polynomial was generated by sollya. Uses a 128 entry (2KB) lookup table. Special cases fall back to scalar log call.
2019-10-14	Add vector sin and cos	Szabolcs Nagy
	Worst-case error is 3.5 ulp, the polynomial was generated by sollya. For large (>2^23) and special inputs the code falls back to scalar sin and cos.
2019-10-14	Add vector powf	Szabolcs Nagy
	Essentially the scalar powf algorithm is used for each element in the vector just inlined for better scheduling and simpler special case handling. The log polynomial is smaller as less accuracy is enough. Worst-case error is 2.6 ulp.
2019-10-14	Add vector sinf and cosf	Szabolcs Nagy
	The polynomials were produced by searching the coefficient space using heuristics and ideas from https://arxiv.org/abs/1508.03211 The worst-case error is 1.886 ulp, large inputs (> 2^20) and other special cases use scalar sinf and cosf.
2019-10-14	Add vector logf	Szabolcs Nagy
	The polynomial was produced by searching the coefficient space using heuristics and ideas from https://arxiv.org/abs/1508.03211 The worst-case error is 3.34 ulp, subnormal range inputs and other special cases use scalar logf.
2019-10-14	Add vector exp, expf and related vector math support code	Szabolcs Nagy
	Vector math routines are added to the same libmathlib library as scalar ones. The difficulty is that they are not always available, the external abi depends on the compiler version used for the build. Currently only aarch64 AdvSIMD is supported, there are 4 new sets of symbols: __s_foo is a scalar function with identical result to the vector one, __v_foo is a vector function using the base PCS, __vn_foo uses the vector PCS and _ZGV*_foo is the vector ABI symbol alias of vn_foo for a scalar math function foo. The test and benchmark code got extended to handle vector functions. Vector functions aim for < 5 ulp worst case error, only support nearest rounding mode and don't support floating-point exceptions. Vector functions may call scalar functions to handle special cases, but for a single value they should return the same result independently of values in other vector lanes or the position of the value in the vector. The __v_expf and __v_expf_1u polynomials were produced by searching the coefficient space with some heuristics and ideas from https://arxiv.org/abs/1508.03211 Their worst case error is 1.95 and 0.866 ulp respectively. The exp polynomial was produced by sollya, it uses a 128 element (1KB) lookup table and has 2.38 ulp worst case error.
2019-10-14	math: more robust mathbench_libc	Szabolcs Nagy
	Not all symbols referenced by mathbench may be available in libc so link to libmathlib too to resolve the missing symbols.
2019-10-14	math: update the plot script	Szabolcs Nagy
	Fix it to be python3 compatible and plot the exact and approximated values too.
2019-10-14	Prevent fenv access breaking optimizations of the ulp tool	Szabolcs Nagy
	The ulp tool compares output of a math function to a larger precision implementation of the same function. But when the input argument is converted to a larger precision number the signaling nan property is lost, so ensure that the conversion happens inside the critical region where fenv exceptions are checked and then the conversion itself will raise the invalid exception, which is the correct behaviour in most cases. The volatile barrier is not perfect and the snan behaviour is not always signaling, but this should give more reliable results in most cases than before.
2019-10-08	Support running the math tests without fenv checks	Szabolcs Nagy
	fenv support is not reliable in clang so provide a mechanism to disable fenv status checks and only check the result values.
2019-10-08	Support separate CFLAGS per subproject	Szabolcs Nagy
	Users may want different CFLAGS for math and string subprojects, expose a mechanism for this in config.mk.
2019-10-08	fix exit status of ulp and runulp.sh	Szabolcs Nagy
	Make ulp and runulp.sh fail on error.
2019-10-08	fix mathtest lineno accounting	Szabolcs Nagy
	Only increment once per fgets.
2019-10-08	fix the exit status of mathtest on failure	Szabolcs Nagy
	Make mathtest fail on error so make check fails too.
2019-08-28	math: add long double symbols when long double is same as double	Szabolcs Nagy
	To allow libmathlib.a to be a drop-in replacement for libc functions on 32bit arm, we should provide long double symbols otherwise the libc long double implementation may pull in double symbols that can conflict with libmathlib.a in case of static linking. Using wrappers instead of alias to avoid type declaration conflicts.
2019-08-27	math: fix duplicated declaration in mathlib.h	Szabolcs Nagy
	Removed tanf declaration since the implementation got removed too.
2019-08-08	Do not return failure if MPFR is not supported	Adhemerval Zanella
	It avoids the 'make check' to abort when building on architectures that has LDBL_MANT_DIG == DBL_MANT_DIG without the mpfr installed (arm cross-compiling for instance).
2019-07-22	Add ULP error plot script	Szabolcs Nagy
	Simple script to visualize the output of the ulp test tool, e.g. $ build/bin/ulp -e .0001 log 0.5 2.0 2345678 \| math/tools/plot.py
2019-07-22	Reorganize the directory layout	Szabolcs Nagy
	To allow subprojects other than math, the build system and directory layout is changed: all math related code, tools and tests are under the math directory now, new subprojects should be similarly self- contained. The top level Makefile design is still kept, but the subproject build directories provide their own Dir.mk with the build rules for the subproject. The user interface of config.mk is kept for now, in the future subproject specific flags and make variables may be added for finer grained control.
2019-07-18	Remove math/single and rem_pio2	Szabolcs Nagy
	math/single contained code for systems without double precision fpu and rem_pio2 is not used currently and likely will be designed differently when double precision trigonometric functions are added.
2019-07-01	__ieee754_rem_pio2: store 32 more bits of 2/pi	Simon Tatham
	When range-reducing a value of the maximum double precision exponent, it was possible to read one word past the end of the const data array. The error introduced was negligible, but of course overrunning an array is a bug regardless.
2018-12-10	Change the powf overflow handling	Szabolcs Nagy
	This fix is slightly more correct than the previous one (which introduced a 1ulp error when rounding toward zero near the overflow limit) and has slightly smaller code size.
2018-12-07	More consistent excess precision handling	Szabolcs Nagy
	The current code aims to support FLT_EVAL_METHOD!=0 targets (such as i386 with x87 fpu or m68k) assuming appropriate narrowing eval functions are defined for them. But the narrowing eval functions were not used consistently: the return statement may not guarantee narrowing (e.g. that was the C99 behaviour which got changed in C11 annex F) so we should use the narrowing eval_as_ functions at return statements too. Results should be correct if narrowing only happens at eval_as_ calls. On most targets this change has no effect because eval_as_ is a noop. Most math implementations that care about excess precision already compile in a mode that narrows at returns so this change is not necessary for them, just better documents the assumptions.
2018-12-07	Fix powf overflow handling in non-nearest rounding mode	Szabolcs Nagy
	The threshold value at which powf overflows depends on the rounding mode and the current check did not take this into account. So when the result was rounded away from zero it could become infinity without setting errno to ERANGE. Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set, which is fine in nearest rounding mode, but powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode and the previous implementation did not set errno in these cases. This special case is fixed without affecting the common code path by checking the rounding mode and using appropriate threshold value. Arithmetics is used to check the rounding mode to avoid introducing a stack frame when calling fegetround. Unfortunately the current test system does not support rounding modes. (Found by comparing against musl powf behaviour on the libc-test test suite, unfortunately that test system does not support errno.)
2018-11-22	Relicence the project under the MIT License	Szabolcs Nagy

2018-09-18	Fix the documentation comment of checkint	Szabolcs Nagy
	checkint in pow is not supposed to be used with 0, inf or nan inputs.
2018-09-05	Document the log table generation method	Szabolcs Nagy
	Add comments with enough detail so the log lookup tables can be recreated.
2018-08-17	Ensure HAVE_FAST_FMA is set on AArch64	Wilco Dijkstra
	If math.h doesn't set FP_FAST_FMA correctly, ensure HAVE_FAST_FMA is set on AArch64.
2018-08-08	Improve sincosf comments	Wilco Dijkstra
	Improve comments. Use TOINT_INTRINSICS rather than HAVE_FAST_ROUND.
2018-07-30	Don't build tanf rredf and funder by default	Szabolcs Nagy
	These are no longer maintained and only kept for WANT_SINGLEPREC build, which is useful for microcontrollers with single precision fpu only. Removed all tests that are not testing code in the default build.
2018-07-10	Remove float compare option from sincosf	Szabolcs Nagy
	PREFER_FLOAT_COMPARISON setting was not correct as it could raise spurious exceptions. Fixing it is easy: just use ISLESS(x, y) instead of abstop12(x) < abstop12(y) with appropriate non-signaling definition for ISLESS. However it seems this setting is not very useful (there is only minor performance difference on various architectures), so remove this option for now.