author | Benoit Jacob <benoitjacob@google.com> | 2019-08-20 10:25:05 -0400 |
---|---|---|
committer | Benoit Jacob <benoitjacob@google.com> | 2020-03-10 16:36:41 -0400 |
commit | fa69a4bbdf3b676156668842b5d2e042cd4cd1f7 (patch) | |
tree | 2f19d08f4e70d5ab400aa64d0eaf341fa299dc01 /BUILD | |
parent | 9a8ac17ea97b04776c6c0ab9f90ee4f9c3636afe (diff) | |
download | ruy-fa69a4bbdf3b676156668842b5d2e042cd4cd1f7.tar.gz |
Some more fixes to arm32 asm:
- Use vld1.8, not vld1.32, to load 8-bit values. Especially in packing code,
the source pointers are not guaranteed to have any alignment. In kernels,
they are more or less guaranteed to be aligned, but .8 is more idiomatic. If
we ever notice a performance benefit of .32 (news to me) justifying that
choice, we could then use .32 in kernels only, with a comment recording
the performance rationale.
- One vld1 was passing a single d-register without enclosing it in {} to make
it a register-list.
- Pack8bitNeonOutOfOrder{LHS,RHS} renamed to Pack8bitNeonOutOfOrder{4Cols,2Cols} because that's more descriptive of the actual difference between these functions.
PiperOrigin-RevId: 264378751
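
The first two fixes in the message above can be illustrated with a minimal sketch. This is not code from this commit: `Load8Bytes` is a hypothetical helper, and the register choice (`d0`) is arbitrary. On 32-bit ARM with NEON it loads 8 bytes using the `.8` element size, with the single d-register wrapped in `{}` so that it forms a register list. On any other target it falls back to `memcpy` so that the sketch still builds.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, for illustration only (not part of ruy).
// Copies 8 bytes from a possibly unaligned source pointer to dst.
inline void Load8Bytes(const std::uint8_t* src, std::uint8_t* dst) {
#if defined(__arm__) && defined(__ARM_NEON)
  asm volatile(
      // .8 element size: the address only needs byte alignment.
      // {d0} is a register list containing a single d-register.
      "vld1.8 {d0}, [%[src]]\n"
      "vst1.8 {d0}, [%[dst]]\n"
      :
      : [src] "r"(src), [dst] "r"(dst)
      : "d0", "memory");
#else
  // Portable fallback so the sketch compiles on non-ARM32 targets.
  std::memcpy(dst, src, 8);
#endif
}
```

Writing the operand as a bare `d0` instead of `{d0}` is the kind of mistake the second bullet describes.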
Diffstat (limited to 'BUILD')
0 files changed, 0 insertions, 0 deletions