external/arm-optimized-routines.git

Age	Commit message	Author
2022-02-10	Update license to MIT OR Apache-2.0 WITH LLVM-exception	Szabolcs Nagy
	The outgoing license was MIT only. The new dual license also allows using the code under the Apache-2.0 WITH LLVM-exception license.
2018-11-22	Relicence the project under the MIT License	Szabolcs Nagy

2018-06-15	Fix spurious underflow in exp without fma	Szabolcs Nagy
	The last multiplication in exp and exp2 could underflow when it was not contracted into an fma. The thresholds were changed so that the problematic cases end up in the specialcase code path (which handles underflow correctly). The initial check now only looks at the exponent bits, which performs slightly better on aarch64. The overflow threshold can be tight for exp2 but was left loose in exp, so the specialcase handling was updated accordingly. Comments were added about this issue and about the assumptions exp_inline makes in pow.
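
The exponent-only initial check described above can be illustrated with a minimal sketch. The helper names and the two thresholds (2^-54 and 512) are assumptions chosen for illustration, not values confirmed by this log. The idea is to compare only the top 12 bits of the double (sign and biased exponent) and let unsigned wraparound cover both the tiny and the huge range in a single comparison.

```c
#include <stdint.h>
#include <string.h>

/* Top 12 bits of a double: sign bit plus 11-bit biased exponent. */
static uint32_t top12(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);   /* well-defined way to read the bit pattern */
    return (uint32_t)(u >> 52);
}

/* Returns nonzero when x should take the slow special-case path.
   Both thresholds below are hypothetical, picked for illustration. */
static int needs_special_case(double x)
{
    uint32_t abstop = top12(x) & 0x7ff;   /* drop the sign bit */
    uint32_t lo = top12(0x1p-54);         /* assumed "tiny input" bound */
    uint32_t hi = top12(512.0);           /* assumed "large input" bound */

    /* Unsigned subtraction wraps around for abstop < lo, so one compare
       catches |x| < 2^-54 as well as |x| >= 512 (including inf and nan). */
    return abstop - lo >= hi - lo;
}
```

Because only the exponent field is inspected, the bound is coarse: every input whose exponent matches the threshold goes to the slow path, which is why the special-case code has to handle inputs that would not actually overflow or underflow.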
2018-06-04	Add new exp and exp2 implementations	Szabolcs Nagy
	Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants (see exp_data.c); they can be selected by modifying math_config.h, allowing different tradeoffs. The default selection should be acceptable as generic libm code.
	Worst-case error is 0.509 ULP for exp and 0.507 ULP for exp2; the read-only global data size is 2160 bytes. The non-nearest rounding error is less than 1 ULP even on targets without an efficient round implementation (although the error rate is higher in that case).
	Another extern API symbol was added: __exp_dd takes two doubles, the top and bottom part of the input. I expect it to be useful for implementing pow, but it might get removed or moved elsewhere later.
	New double precision error handling code was added, following the style of the single precision error handling code.
	Improvements on Cortex-A72 compared to current glibc master:
	exp latency: 1.5x
	exp throughput: 2.3x
	exp2 latency: 1.6x
	exp2 throughput: 3.2x