external/arm-optimized-routines.git

Age	Commit message	Author
2021-02-17	Update copyright years	Szabolcs Nagy
	Scripted copyright year updates based on git committer date.
2021-02-12	string: add __mtag_tag_zero_region	Szabolcs Nagy
	Add optimized __mtag_tag_zero_region(dst, len) operation to AOR. It tags the memory according to the tag of the dst pointer then memsets it to 0 and returns dst. It requires MTE support. The memory remains untagged if tagging is not enabled for it. The dst must be 16 bytes aligned and len must be a multiple of 16. Similar to __mtag_tag_region, but uses the zeroing instructions.
2021-02-12	string: add __mtag_tag_region	Szabolcs Nagy
	Add optimized __mtag_tag_region(dst, len) operation to AOR. It tags the given memory region according to the tag of the dst pointer and returns dst. It requires MTE support. The memory remains untagged if tagging is not enabled for it. The dst must be 16 bytes aligned and len must be a multiple of 16.
2021-01-04	string/test: Fix strrchr '\0' error report	Richard Henderson
	The error report was copied from the seekchar test above, and needs adjustment to match the gating IF.
2020-05-28	string: Add MTE support to string tests.	Branislav Rankov
	Set tags for every test case so that boundaries are as narrow as possible. There is no handling of tag faults, so the test will crash if there is an MTE problem. The implementations that are not compatible are excluded, including the standard symbols that may come from an MTE-incompatible libc.
2020-05-22	string: Cleanup strchrnul test	Wilco Dijkstra
	Clean up code and improve test coverage.
2020-05-22	string: Cleanup strchr test	Wilco Dijkstra
	Clean up code and improve test coverage.
2020-05-22	string: Cleanup strrchr test	Wilco Dijkstra
	Clean up code and improve test coverage.
2020-05-22	string: Cleanup stpcpy test	Wilco Dijkstra
	Cleanup stpcpy test and improve test coverage.
2020-05-22	string: Cleanup strcpy test	Wilco Dijkstra
	Cleanup strcpy test and improve test coverage.
2020-05-22	string: Cleanup strnlen test	Wilco Dijkstra
	Cleanup strnlen test and improve test coverage.
2020-05-22	string: Cleanup strlen test	Wilco Dijkstra
	Cleanup strlen test and improve test coverage.
2020-05-20	string: Add optimized strcpy-mte and stpcpy-mte	Wilco Dijkstra
	Add optimized MTE-compatible strcpy-mte and stpcpy-mte. On various microarchitectures the speedup over the non-MTE version is 53% on large strings and 20-60% on small strings.
2020-05-12	string: Add memrchr test	Wilco Dijkstra
	Add new memrchr test.
2020-05-12	string: Cleanup memchr test	Wilco Dijkstra
	Improve memchr test coverage and cleanup code.
2020-05-12	string: Cleanup strnlen test	Wilco Dijkstra
	Improve strnlen test coverage and cleanup code.
2020-05-12	string: format tests according to GNU style	Szabolcs Nagy
	Use the GNU style consistently in the string test code. Added clang-format guard comments where necessary so the code can be reformatted using the clang-format tool and GNU style settings from gcc contrib/clang-format.
2020-04-30	string: ARMv8.5 MTE: Add MTE compatible version of strncmp.	Branislav Rankov
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/strncmp.S. Change the case when strings are misaligned: align the pointers down, and ignore bytes before the start of the string. Carry the part that is not compared to the next comparison. Testing done: string/test/strncmp.c on big endian, little endian and with MTE support. Booted nanodroid with MTE enabled. Benchmarked on Pixel4.
2020-04-30	string: ARMv8.5 MTE: Add MTE compatible version of strcmp.	Branislav Rankov
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/strcmp.S. Change the case when strings are misaligned: align the pointers down, and ignore bytes before the start of the string. Carry the part that is not compared to the next comparison. Testing done: optimized-routines/string/test/strcmp.c on big and little endian. Booted nanodroid with MTE enabled. Bionic string tests with MTE enabled. Benchmark results: ran both bionic benchmarks and glibc benchmarks on Pixel4. Cores A76 and A55.
2020-04-30	ARMv8.5 MTE: Add MTE compatible version of strrchr.	Gabor Kertesz
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/strrchr.S. Testing done: optimized-routines/string/test/strrchr.c. Booted nanodroid with MTE enabled. Bionic string tests with MTE enabled. Big endian with Qemu: qemu-aarch64_be.
2020-04-29	string: more strchr, strrchr, strchrnul test improvements	Szabolcs Nagy
	Use matching and null characters in the padding area around the string. Remove large input tests.
2020-04-24	string: test cleanups	Szabolcs Nagy
	Tests printed too much output when a string function was broken and the output was not entirely useful. Added a new header file with some common logic for printing buffers nicely. In str* tests len now means string length (not buffer size, which was confusing).
2020-04-08	ARMv8.5 MTE: Add MTE compatible version of strchrnul.	Gabor Kertesz
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/strchr-mte.S and string/aarch64/strchrnul.S. Testing done: optimized-routines/string/test/strchrnul.c. Booted nanodroid with MTE enabled. Bionic string tests with MTE enabled. Big endian with Qemu: qemu-aarch64_be.
2020-04-08	ARMv8.5 MTE: Add MTE compatible version of memchr.	Gabor Kertesz
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/memchr.S. The 64-bit syndrome value is changed to contain only 16 bytes of data. The 32 byte loop is unrolled into two 16 byte reads. Testing done: optimized-routines/string/test/memchr.c. Booted nanodroid with MTE enabled. Bionic string tests with MTE enabled.
2020-04-08	string: Add new testcase for memchr.	Gabor Kertesz
	The length parameter of memchr is unsigned and is allowed to be huge, so any algorithm that treats it as a signed number should fail the test. This patch adds cases where the length is bigger than the inspected array, but the sought character is within the valid range.
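The failure mode this test targets can be sketched in C. This is a minimal illustration, not the code from the repository: memchr_signed_len and memchr_unsigned_len are hypothetical names, and the signed variant assumes the usual behaviour where converting SIZE_MAX to ptrdiff_t yields a negative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buggy variant: the length is compared as a signed value,
   so a huge size_t length becomes negative (implementation-defined
   conversion, -1 on common targets) and the loop never runs.  */
static const void *
memchr_signed_len (const void *s, int c, size_t n)
{
  const unsigned char *p = s;
  for (ptrdiff_t i = 0; i < (ptrdiff_t) n; i++)
    if (p[i] == (unsigned char) c)
      return p + i;
  return NULL;
}

/* Reference variant: the length stays unsigned and the scan stops at the
   first match, so bytes past the match are never read.  */
static const void *
memchr_unsigned_len (const void *s, int c, size_t n)
{
  const unsigned char *p = s;
  for (size_t i = 0; i < n; i++)
    if (p[i] == (unsigned char) c)
      return p + i;
  return NULL;
}

/* Illustrative check: a huge length with the match inside the buffer.  */
static void
check_huge_length (void)
{
  static const char buf[] = "abcdef";
  assert (memchr_unsigned_len (buf, 'd', SIZE_MAX) == buf + 3);
  assert (memchr_signed_len (buf, 'd', SIZE_MAX) == NULL);
}
```

With a length of SIZE_MAX the unsigned variant still finds the match, while the signed variant reports no match at all.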
2020-02-25	ARMv8.5 MTE: Add MTE compatible version of strchr.	Gabor Kertesz
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/strchr.S. The 64-bit syndrome value is changed to contain only 16 bytes of data. The 32 byte loop is unrolled into two 16 byte reads.
2020-02-25	string: Add stpcpy	Wilco Dijkstra
	Add support for stpcpy on AArch64.
2020-02-18	ARMv8.5 MTE: Add MTE compatible version of strlen.	Branislav Rankov
	Reading outside the range of the string is only allowed within 16 byte aligned granules when MTE is enabled. This implementation is based on string/aarch64/strlen.S. Merged the page cross code into the main path and optimized it. Modified the zeroones mask to ignore the bytes that are loaded but are not part of the string. Made a special case for when there are 8 bytes or less to check before the alignment boundary.
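The idea of ignoring loaded bytes that precede the string can be sketched in portable C. This is an assumption-laden illustration rather than the assembly from the commit: it uses 8-byte words instead of 16-byte granules, the name strlen_in_aligned_buf is hypothetical, and the caller is assumed to pass a buffer whose size is a multiple of 8 so that every whole-word read stays inside it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Nonzero if and only if some byte of x is zero.  */
static inline uint64_t
has_zero (uint64_t x)
{
  return (x - ONES) & ~x & HIGHS;
}

/* Length of the NUL-terminated string starting at buf[off].  Reads whole
   8-byte words starting at the word that contains buf[off]; bytes before
   off are forced to 0xff so they cannot be mistaken for the terminator.  */
static size_t
strlen_in_aligned_buf (const unsigned char *buf, size_t off)
{
  size_t base = off & ~(size_t) 7; /* start of the word holding buf[off] */
  size_t skip = off & 7;           /* bytes in that word before the string */
  unsigned char m[8] = { 0 };
  uint64_t w, mask;

  /* Build the mask through memory so it is endian-independent.  */
  for (size_t i = 0; i < skip; i++)
    m[i] = 0xff;
  memcpy (&mask, m, 8);

  memcpy (&w, buf + base, 8);
  w |= mask;
  while (!has_zero (w))
    {
      base += 8;
      memcpy (&w, buf + base, 8);
    }

  /* The current word holds the terminator; locate it byte by byte.  */
  size_t i = base > off ? base : off;
  while (buf[i] != 0)
    i++;
  return i - off;
}
```

For example, with a zero-filled 32-byte buffer and "hello world" copied to offset 5, strlen_in_aligned_buf (buf, 5) returns 11 even though the five bytes before the string are zero.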
2020-01-14	string: Remove memcpy_bytewise	Wilco Dijkstra
	This was a placeholder for testing the build system before we added optimized string code and is thus no longer needed.
2020-01-06	string: Add strrchr	Wilco Dijkstra
	Add strrchr for AArch64. Originally written by Richard Earnshaw; the same code is present in newlib. This copy has minor edits for inclusion into the optimized-routines repo.
2019-12-10	aarch64: Combine memcpy and memmove implementations	Krzysztof Koch
	Modify integer and SIMD versions of memcpy to handle overlaps correctly. Make __memmove_aarch64 and __memmove_aarch64_simd alias to __memcpy_aarch64 and __memcpy_aarch64_simd respectively. Complete sharing of code between memcpy and memmove implementations is possible without noticeable performance penalty. This is thanks to moving the source and destination buffer overlap detection after the code for handling small and medium copies which are overlap-safe anyway. Benchmarking shows that keeping two versions of memcpy is necessary because newer platforms favor aligning src over destination for large copies. Using NEON registers also gives a small speedup. However, aligning dst and using general-purpose registers works best for older platforms. Consequently, memcpy.S and memcpy_simd.S contain memcpy code which is identical except for the registers used and src vs dst alignment.
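The structure described above, where small copies are overlap-safe by construction and the overlap check is deferred to the large-copy path, can be sketched in C. This is a simplified illustration, not the assembly from the commit: copy_overlap_safe is a hypothetical name, the 32-byte threshold is an assumption, and the byte loops stand in for the wide loads and stores of the real routines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes from src to dst, handling overlapping buffers.  */
static void *
copy_overlap_safe (void *dstv, const void *srcv, size_t n)
{
  unsigned char *dst = dstv;
  const unsigned char *src = srcv;

  /* Small copies: load everything before storing anything, so overlap
     cannot matter and no check is needed.  */
  if (n <= 32)
    {
      unsigned char tmp[32];
      memcpy (tmp, src, n);
      memcpy (dst, tmp, n);
      return dstv;
    }

  /* Large copies: a single unsigned comparison detects the only
     dangerous case, dst inside [src, src + n).  */
  if ((uintptr_t) dst - (uintptr_t) src < n)
    {
      /* Copy backwards so source bytes are read before being overwritten.  */
      for (size_t i = n; i-- > 0;)
        dst[i] = src[i];
    }
  else
    {
      /* A forward copy is safe when dst is below src or beyond src + n.  */
      for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
    }
  return dstv;
}
```

Because the small path never inspects the pointers, the common short-copy case pays nothing for memmove semantics, which is consistent with the commit's observation that sharing the code has no noticeable performance penalty.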
2019-11-26	aarch64: Add SIMD version of memcpy	Krzysztof Koch
	Create a new memcpy implementation for targets with the NEON extension. __memcpy_aarch64_simd has been tested on a range of modern microarchitectures. It turned out to be faster than __memcpy_aarch64 on all of them, with a performance improvement of 3-11% depending on the platform.
2019-08-29	string: print passing test cases	Szabolcs Nagy
	Without printing anything on success it is unclear if the right set of functions got hooked up in the test code.
2019-08-28	Import aarch64 sve strrchr	Adhemerval Zanella
	The only difference is changing the symbol name from strrchr to __strrchr_aarch64_sve.
2019-08-28	Add strrchr test	Adhemerval Zanella

2019-08-28	Import aarch64 sve strnlen	Adhemerval Zanella
	The only difference is changing the symbol name from strnlen to __strnlen_aarch64_sve.
2019-08-28	Import aarch64 sve strncmp	Adhemerval Zanella
	The only difference is changing the symbol name from strncmp to __strncmp_aarch64_sve.
2019-08-28	Import aarch64 sve strlen	Adhemerval Zanella
	The only difference is changing the symbol name from strlen to __strlen_aarch64_sve.
2019-08-28	Import aarch64 sve strcpy	Adhemerval Zanella
	The only difference is changing the symbol name from strcpy to __strcpy_aarch64_sve.
2019-08-28	Import aarch64 sve strcmp	Adhemerval Zanella
	The only difference is changing the symbol name from strcmp to __strcmp_aarch64_sve.
2019-08-28	Import aarch64 sve strchr and strchrnul	Adhemerval Zanella
	The only difference is changing the symbol name from strchr/strchrnul to __strchr_aarch64_sve and __strchrnul_aarch64_sve.
2019-08-28	Import aarch64 sve memcmp	Adhemerval Zanella
	The only difference is changing the symbol name from memcmp to __memcmp_aarch64_sve.
2019-08-28	Import aarch64 sve memchr	Adhemerval Zanella
	The only difference is changing the symbol name from memchr to __memchr_aarch64_sve.
2019-08-23	Import arm strlen armv6t2	Adhemerval Zanella
	The only difference is changing the symbol name from strlen to __strlen_armv6t2.
2019-08-23	Import arm strcmp armv6-m	Adhemerval Zanella
	The only difference is changing the symbol name from strcmp to __strcmp_armv6m.
2019-08-23	Import arm strcmp	Adhemerval Zanella
	The only difference is changing the symbol name from strcmp to __strcmp_arm.
2019-08-23	Import arm strcpy	Adhemerval Zanella
	The differences from cortex-strings are: - Simplify the thumb-2/thumb selection by removing the usage of PREFER_SIZE_OVER_SPEED and __OPTIMIZE_SIZE__. - Removed the dumb byte-per-byte loops.
2019-08-23	Import arm memchr	Adhemerval Zanella
	The only difference is changing the symbol name from memchr to __memchr_arm and the final .size directive.
2019-08-23	Import arm memset	Adhemerval Zanella
	The only difference is changing the symbol name from memset to __memset_arm and the final .size directive.
2019-08-23	Import arm memcpy	Adhemerval Zanella
	The only difference is changing the symbol name from memcpy to __memcpy_arm.