author | Adenilson Cavalcanti <adenilson.cavalcanti@arm.com> | 2019-08-05 23:18:29 +0000 |
---|---|---|
committer | Commit Bot <commit-bot@chromium.org> | 2019-08-05 23:18:29 +0000 |
commit | 5cb718c7fcd502ab38c38ed0cebe5fdf3cdd3a47 (patch) | |
tree | 84e99feacf9bb2cbbad87dad7e3d2ffd54b596ba /compress.c | |
parent | 2b4888a46ae73eb1261efc924ac13fe2faa6c480 (diff) | |
Increasing the expected compressed output size
Using CRC32 Castagnoli (CRC32C) as the hash function for the symbol hash
table used during compression has a side effect: at compression levels 4
and 5, high-entropy input (i.e. random data) requires an output buffer
about 0.1% larger than before.
To keep client code from failing while compressing such data, this patch
increases compressBound by about 0.8% (sourceLen >> 7, i.e. 8x the
worst-case scenario).
Validated using zlib_bench with the data from the 'Random Compression
Challenge' on both Intel and Arm.
Bug: 990489
Change-Id: I86c6ab09fce6ab09b45c502221864400b86a7d80
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1733790
Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
Reviewed-by: Chris Blume <cblume@chromium.org>
Reviewed-by: Adenilson Cavalcanti <cavalcantii@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#684170}
Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
Cr-Mirrored-Commit: 6c37c10f93e9d3ac37fd3018f15ec51340d58369
Diffstat (limited to 'compress.c')
-rw-r--r-- | compress.c | 14 |
1 files changed, 12 insertions, 2 deletions
@@ -81,6 +81,16 @@ int ZEXPORT compress (dest, destLen, source, sourceLen)
 uLong ZEXPORT compressBound (sourceLen)
     uLong sourceLen;
 {
-    return sourceLen + (sourceLen >> 12) + (sourceLen >> 14) +
-           (sourceLen >> 25) + 13;
+    sourceLen = sourceLen + (sourceLen >> 12) + (sourceLen >> 14) +
+                (sourceLen >> 25) + 13;
+    /* FIXME(cavalcantii): usage of CRC32 Castagnoli as a hash function
+     * for the hash table of symbols used for compression has a side effect
+     * where for compression level [4, 5] it will increase the output buffer size
+     * by 0.1% (i.e. less than 1%) for a high entropy input (i.e. random data).
+     * To avoid a scenario where client code would fail, for safety we increase
+     * the expected output size by 0.8% (i.e. 8x more than the worst scenario).
+     * See: http://crbug.com/990489
+     */
+    sourceLen += sourceLen >> 7; // Equivalent to 1.0078125
+    return sourceLen;
 }
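To make the size of the added margin concrete, the two formulas from the diff above can be restated as standalone functions. This is a minimal sketch, not code from the patch: the names `bound_before` and `bound_after` are hypothetical, and plain `unsigned long` stands in for zlib's `uLong`.

```c
/* Bound as computed by the removed lines of the diff.
 * Hypothetical helper name; not part of zlib. */
static unsigned long bound_before(unsigned long source_len)
{
    return source_len + (source_len >> 12) + (source_len >> 14) +
           (source_len >> 25) + 13;
}

/* Bound as computed by the added lines: the previous result plus
 * 1/128 of itself (x >> 7), i.e. a factor of about 1.0078125.
 * Hypothetical helper name; not part of zlib. */
static unsigned long bound_after(unsigned long source_len)
{
    unsigned long bound = bound_before(source_len);
    bound += bound >> 7;
    return bound;
}
```

For a 1 MiB input (1048576 bytes), `bound_before` yields 1048909 and `bound_after` yields 1057103, so the patch reserves 8194 extra bytes. Note that the shift is applied to the already padded bound, not to the raw input length, which matches the order of operations in the patched `compressBound`.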