Age | Commit message | Author |
|
The previous license was MIT only. The new dual license also allows
using the code under the Apache-2.0 WITH LLVM-exception license.
|
|
The code relied on the final x + c*x being computed with an fma; otherwise
the intermediate c*x could underflow for tiny (almost subnormal) x.
Use an explicit fmaf as elsewhere (this code is not expected to be
fast when fma is not inlined, but at least it should be correct).
|
|
Only tested in round-to-nearest mode. The expected worst-case error
is 1.01 ULP near x=1.25. Benchmarked over random x in [-6,6], this
can improve performance by > 2x (> 3.5x for throughput) on big
out-of-order cores compared to the implementation in glibc 2.28.
Includes data for erfc too, but this patch only adds erf.
|