path: root/math/pow_log_data.c
Age  Commit message  Author
2018-11-22  Relicence the project under the MIT License  Szabolcs Nagy
2018-09-05  Document the log table generation method  Szabolcs Nagy
Add comments with enough detail so the log lookup tables can be recreated.
2018-06-29  Fix GNU style issues  Szabolcs Nagy
Whitespace changes only.
2018-06-22  Improve pow implementation  Szabolcs Nagy
The log part of pow was rewritten to use a slightly different algorithm. This improves precision and throughput while keeping the same table size. Cases near 1 are no longer special-cased, which causes a slight performance regression there. When the fma instruction is not available, this algorithm is expected to perform slightly worse.
Worst-case error improved from 0.67 ULP to 0.57 ULP.
On Cortex-A72 I see:
throughput near 1: 7% worse
latency near 1: 2% worse
throughput general: 8% better
latency general: 2% better
2018-06-11  Add new pow implementation  Szabolcs Nagy
The algorithm is exp(y * log(x)), where log(x) is computed with about 1.8*2^-66 relative error and returned as two doubles. The exp part uses the same algorithm (and lookup tables) as exp, but takes its input as two doubles and a sign (to handle negative bases with odd integer exponents).
There is a separate code path for when fma is not available, but the worst-case error is about 0.67 ULP in both cases. The lookup table and constants for log take 4224 bytes; the code takes 1196 bytes. The error in non-nearest rounding modes is less than 1 ULP.
Improvements on Cortex-A72 compared to current glibc master:
latency: 1.8x
throughput: 2.5x