author | Szabolcs Nagy <szabolcs.nagy@arm.com> | 2018-05-10 15:35:06 +0100 |
---|---|---|
committer | Szabolcs Nagy <szabolcs.nagy@arm.com> | 2018-05-16 13:52:13 +0100 |
commit | 39b0191da32e494ca7d1e9991614d03f3126fd9f (patch) | |
tree | 93b7ae9ee8fa1ae936521da32fe0a982ab09b971 /config.mk.dist | |
parent | 0d51c0453a46a76dcf3bfd1820de7e2ffbd4b751 (diff) | |
download | arm-optimized-routines-39b0191da32e494ca7d1e9991614d03f3126fd9f.tar.gz |
Clean up roundtoint and converttoint
Ideally a single instruction is used for rounding and conversion, one that
rounds to the nearest integer independently of the current rounding mode;
otherwise the argument reduction in expf does not reduce into the
optimal range in non-nearest rounding modes.
AArch64 has the necessary instructions, both for rounding ties away from
zero (round function in C) and rounding ties to even (roundeven function
in TS 18661). Originally the latter was used, but the former can be
expressed in portable C code without relying on ACLE intrinsics.
A complication is that round and lround are not always inlined as
a single instruction:
- gcc inlines lround with -fno-math-errno, but fails to inline
(long)round as a single instruction (at least up to gcc-8).
- clang inlines (long)round, but not lround.
Portable code is still better than relying on arm_neon.h, so use the
round function when it works and keep the shift-based rounding as a
fallback (which only gives precise results in nearest rounding mode,
but is the optimal implementation for most targets).
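The shift-based fallback mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the commit's code: the names `SHIFT`, `as_u64`, `shift_roundtoint` and `shift_converttoint` are hypothetical, and the shift constant 0x1.8p52 is assumed. Adding the constant forces the sum to be rounded to an integer in the current rounding mode, which is why the result is only round-to-nearest (ties to even) when the mode is nearest.

```c
#include <stdint.h>
#include <string.h>

/* Assumed shift constant: 1.5 * 2^52. Doubles near this value have ulp 1. */
#define SHIFT 0x1.8p52

/* Reinterpret the bits of a double as a 64-bit unsigned integer. */
static inline uint64_t as_u64(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Round x to an integer-valued double; intended for small |x|. */
static double shift_roundtoint(double x)
{
    volatile double kd = x + SHIFT; /* the addition rounds in the current mode */
    return kd - SHIFT;              /* exact: removes the shift again */
}

/* Same rounding, but read the integer from the low mantissa bits,
   which hold it in two's complement; intended for |x| < 2^31.
   The narrowing conversion is modular on gcc and clang. */
static int32_t shift_converttoint(double x)
{
    volatile double kd = x + SHIFT;
    return (int32_t) as_u64(kd);
}
```

In nearest mode `shift_roundtoint(2.5)` gives 2.0 and `shift_roundtoint(3.5)` gives 4.0 (ties to even), whereas `round` would give 3.0 and 4.0. In a directed rounding mode the addition rounds up or down instead, so the result can be off by one from the nearest integer.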
HAVE_FAST_ROUND and HAVE_FAST_LROUND are set based on preprocessor
heuristics for now (the user can override them with CFLAGS); once there
is a configure script, these can be detected at configure time.
Diffstat (limited to 'config.mk.dist')
-rw-r--r-- | config.mk.dist | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/config.mk.dist b/config.mk.dist
index a17fa9d..b616a65 100644
--- a/config.mk.dist
+++ b/config.mk.dist
@@ -25,7 +25,7 @@ CFLAGS += -Wall -Wno-missing-braces -Wno-strict-aliasing -Wno-unused-function
 
 # Use with gcc.
 CFLAGS += -frounding-math -fexcess-precision=standard -fno-stack-protector
-CFLAGS += -ffp-contract=fast
+CFLAGS += -ffp-contract=fast -fno-math-errno
 
 # Use with clang.
 #CFLAGS += -DCLANG_EXCEPTIONS