author | Jonathan Wright <jonathan.wright@arm.com> | 2023-01-18 00:32:45 +0000 |
---|---|---|
committer | Wan-Teh Chang <wtc@google.com> | 2023-01-20 01:40:42 +0000 |
commit | 042e90c941026cd14d49ef853d3373d569ff6e36 (patch) | |
tree | abf68f87ca58c8bbde4364243abc925fc0f8e02b | |
parent | dd1659e492c36fdee7433abf4e0a823609643cd2 (diff) | |
download | libaom-042e90c941026cd14d49ef853d3373d569ff6e36.tar.gz |
Fix buffer overrun in dist_wtd_convolve_2d_horiz_neon
When introducing the 6-tap specialization for
dist_wtd_convolve_2d_vert_neon[1], we attempted to reduce the number
of rows processed in the horizontal convolution (by 2) if the
subsequent vertical convolution would be using a 6-tap filter instead
of an 8-tap filter. This logic was faulty and caused accesses to
memory outside of the (padded) source buffer: the horizontal
convolution processes 4 rows of data per iteration, but the number of
rows to be processed is not necessarily a multiple of 4.
This patch restores the previous logic and src pointer starting
position for dist_wtd_convolve_2d_horiz_neon.
[1] Commit hash: be1d80024928684b1c9eebc648ed92d8ea70d166
Bug: aomedia:3367
Change-Id: Id11d4ae7c123957d3336b61f4d4cc8da09131b68
(cherry picked from commit 81da208fb6e0a1fa6a3714a7b7dbe3f613e8bc64)
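The row arithmetic behind the overrun can be sketched as follows. This is a rough model, not the NEON code: it assumes the horizontal pass simply rounds the row count up to the next multiple of 4, and `round_up_4` and `last_row_touched` are hypothetical helper names. Only the `im_h` and `vert_offset` formulas are taken from the diff.

```c
#include <assert.h>

/* Hypothetical helper: round n up to a multiple of 4, modelling a loop that
 * always handles 4 rows per iteration (an assumption about the loop shape). */
int round_up_4(int n) { return (n + 3) & ~3; }

/* Last source row, relative to the first row of the block, touched by a
 * horizontal pass that starts vert_offset rows above the block and is asked
 * to produce im_h rows. `taps` is the tap count used to size the window. */
int last_row_touched(int h, int taps) {
  assert(h > 0 && taps >= 2);
  const int im_h = h + taps - 1;         /* as in the diff */
  const int vert_offset = taps / 2 - 1;  /* as in the diff */
  return -vert_offset + round_up_4(im_h) - 1;
}
```

Under this model, with h = 8, `last_row_touched(8, 8)` is 12 and `last_row_touched(8, 6)` is 13. Sizing the window by 6 taps moves the start down one row while the rounded row count stays at 16, so the pass reaches one row further below the block than the restored 8-tap sizing does.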
-rw-r--r-- | av1/common/arm/jnt_convolve_neon.c | 11 |
1 files changed, 6 insertions, 5 deletions
diff --git a/av1/common/arm/jnt_convolve_neon.c b/av1/common/arm/jnt_convolve_neon.c
index 700dc546d..36c8f9cb8 100644
--- a/av1/common/arm/jnt_convolve_neon.c
+++ b/av1/common/arm/jnt_convolve_neon.c
@@ -1163,9 +1163,9 @@ void av1_dist_wtd_convolve_2d_neon(const uint8_t *src, int src_stride,
   const int y_filter_taps = get_filter_tap(filter_params_y, subpel_y_qn);
   const int clamped_y_taps = y_filter_taps < 6 ? 6 : y_filter_taps;
 
-  const int im_h = h + clamped_y_taps - 1;
+  const int im_h = h + filter_params_y->taps - 1;
   const int im_stride = MAX_SB_SIZE;
-  const int vert_offset = clamped_y_taps / 2 - 1;
+  const int vert_offset = filter_params_y->taps / 2 - 1;
   const int horiz_offset = filter_params_x->taps / 2 - 1;
   const int round_0 = conv_params->round_0 - 1;
   const uint8_t *src_ptr = src - vert_offset * src_stride - horiz_offset;
@@ -1182,9 +1182,10 @@ void av1_dist_wtd_convolve_2d_neon(const uint8_t *src, int src_stride,
   dist_wtd_convolve_2d_horiz_neon(src_ptr, src_stride, im_block, im_stride,
                                   x_filter, im_h, w, round_0);
 
-  if (clamped_y_taps <= 6) {
-    dist_wtd_convolve_2d_vert_6tap_neon(im_block, im_stride, dst8, dst8_stride,
-                                        conv_params, y_filter, h, w);
+  if (clamped_y_taps == 6) {
+    dist_wtd_convolve_2d_vert_6tap_neon(im_block + im_stride, im_stride, dst8,
+                                        dst8_stride, conv_params, y_filter, h,
+                                        w);
   } else {
     dist_wtd_convolve_2d_vert_8tap_neon(im_block, im_stride, dst8, dst8_stride,
                                         conv_params, y_filter, h, w);
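The second hunk passes `im_block + im_stride` to the 6-tap vertical pass. A minimal scalar sketch of why a one-row skip lines things up: it assumes a 6-tap kernel is stored in 8 slots with zero outer taps, which is an assumption about the filter layout and not something the diff shows. `filter_column`, `six_tap_matches_eight_tap`, the toy sizes and the tap values are all made up for illustration.

```c
#include <stdint.h>

enum { kStride = 4, kRows = 8 }; /* hypothetical toy sizes */

/* Apply an ntaps-tap vertical filter to one column, starting at `col`. */
int32_t filter_column(const int16_t *col, int stride, const int16_t *taps,
                      int ntaps) {
  int32_t sum = 0;
  for (int k = 0; k < ntaps; ++k) sum += (int32_t)taps[k] * col[k * stride];
  return sum;
}

/* Returns 1 when the 8-tap result computed from row 0 equals the 6-tap
 * result computed from row 1, i.e. from im_block + im_stride. */
int six_tap_matches_eight_tap(void) {
  int16_t im_block[kRows * kStride];
  for (int i = 0; i < kRows * kStride; ++i) im_block[i] = (int16_t)(3 * i + 1);

  /* Assumed layout: six non-zero taps padded with a zero at each end. */
  static const int16_t taps8[8] = { 0, 2, -10, 72, 72, -10, 2, 0 };
  const int16_t *taps6 = taps8 + 1;

  const int32_t full = filter_column(im_block, kStride, taps8, 8);
  const int32_t inner = filter_column(im_block + kStride, kStride, taps6, 6);
  return full == inner;
}
```

Because the intermediate buffer is again sized and positioned for the full 8-tap window, its first row only ever meets the zero outer tap, so starting the 6-tap pass one row down gives the same sums.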