external/XNNPACK.git - [no description]

Age	Commit message	Author
2022-09-01	Harmonize filenames of SAMPLES=1 BFLY4 microkernels	Marat Dukhan
	PiperOrigin-RevId: 471584691
2022-09-01	Harmonize naming of specialized BFLY4 microkernels	Marat Dukhan
	- Rename m1 to samples1
	- Separate samples1 from microkernel name by underscore
	PiperOrigin-RevId: 471574436
2022-08-30	Add primary_tile as an argument to DWCONV packing functions	Zhi An Ng
	Add specific unit tests for packing routines. Microkernel tests are updated to simply pass the primary_tile as kernel height, so this is a no-op change. Tests for kernel sizes smaller than the primary tile will be added in a follow-up.
	PiperOrigin-RevId: 471142575
2022-08-29	Filterbank Accumulate in ARM assembly	Frank Barchard
	PiperOrigin-RevId: 470890302
2022-08-26	bfly4m1 NEON microkernel	Frank Barchard
	PiperOrigin-RevId: 470331404
2022-08-24	FILTERBANK-ACCUMULATE Neon microkernels with unweights accumulator set to 0	Frank Barchard
	PiperOrigin-RevId: 469816851
2022-08-24	Add test for FP16 rewrite when an external output happens to be an input to another node	Zhi An Ng
	This can happen when a subgraph is split because an operation is not supported by XNNPACK. The FP16 rewriting code does not account for this: when counting the number of external inputs, we assert that a node's input value which has a fp32_id is an external input. This is not always true, as it could be an external output.
	PiperOrigin-RevId: 469766360
2022-08-23	Fix FILTERBANK-ACCUMULATE microkernels	Marat Dukhan
	Make FILTERBANK-ACCUMULATE microkernels match TFLM audio_frontend semantics
	PiperOrigin-RevId: 469635373
2022-08-23	Space to Depth operator	Alan Kelly
	PiperOrigin-RevId: 469514081
2022-08-23	Aarch32 filterbank-accumulate assembly	Frank Barchard
	PiperOrigin-RevId: 469506037
2022-08-23	Space to Depth operator	XNNPACK Team
	PiperOrigin-RevId: 469480735
2022-08-23	Space to Depth operator	Alan Kelly
	PiperOrigin-RevId: 469456395
2022-08-22	filterbank-accumulate use uint8 for weight count table	Frank Barchard
	- Remove input and weight offsets which are sequential
	- Reduce weight count table from uint16 to uint8. Maximum value is 13.
	PiperOrigin-RevId: 469371512
2022-08-22	bfly4m1 remove multiplies by 1 and 0.	Frank Barchard
	PiperOrigin-RevId: 469254069
2022-08-22	U64->U32 VSQRTSHIFT microkernel	Marat Dukhan
	PiperOrigin-RevId: 469114060
2022-08-21	U64 SQRT evaluation stubs	Marat Dukhan
	PiperOrigin-RevId: 469079271
2022-08-21	Evaluation stubs for U32 SQRT using F32 SQRT	Marat Dukhan
	PiperOrigin-RevId: 469077131
2022-08-20	Specialized M1 variant of bfly4 scalar	Frank Barchard
	PiperOrigin-RevId: 468906177
2022-08-19	Add inplace tests for fftr, filterbank-subtract, vlog, vlshift and window microkernels	Frank Barchard
	- sort applied to build files and headers.
	PiperOrigin-RevId: 468770448
2022-08-18	Specialized WINDOW microkernels for shift by constants	Frank Barchard
	- Shift by 12 replaces vshlq + vqmovn with a vqshrn_n.
	- Shift by 15 replaces vmull + vshlq + vqmovn with vqdmulhq
	PiperOrigin-RevId: 468510420
2022-08-17	u32 filterbank-subtract scalar microkernel	Frank Barchard
	- Scalar microkernel, test and benchmark
	- Input and output are uint32
	PiperOrigin-RevId: 468313409
2022-08-17	Use mulext for filterbank-accumulate to multiply 2 uint32 values to produce a uint64 value.	Frank Barchard
	PiperOrigin-RevId: 468301603
2022-08-17	BF16 GEMM microkernels for NEON & NEON-BF16	Marat Dukhan
	PiperOrigin-RevId: 468263623
2022-08-17	Fix CMake build	Marat Dukhan
	normalization target needs to include "src" directory as it depends on xnnpack/math.h
	PiperOrigin-RevId: 468229416
2022-08-17	Depth to space nhwc uses transpose.	Alan Kelly
	PiperOrigin-RevId: 468220740
2022-08-17	Depth to space nchw2nhwc uses transpose	Alan Kelly
	PiperOrigin-RevId: 468186161
2022-08-16	Enable ARM SIMD32 microkernels for pre-NEON AArch32 processors	Marat Dukhan
	PiperOrigin-RevId: 468105144
2022-08-16	Rename ARMV6SIMD to ARMSIMD32	Marat Dukhan
	ARMv6 SIMD is deceptive because M-profile ARM cores don't support these instructions until ARMv7
	PiperOrigin-RevId: 468071398
2022-08-16	Move post-operation structs into separate file and lib	Zhi An Ng
	This ensures that our microkernel tests (to be added) can depend on post-operation without depending on :operators.
	PiperOrigin-RevId: 468002526
2022-08-13	u32 filterbank-accumulate NEON and scalar microkernels	Frank Barchard
	- input is uint32 and output is uint64
	- multiply inputs by uint16 weights
	PiperOrigin-RevId: 467435458
2022-08-09	CS16 fftr scalar microkernel	Frank Barchard
	- Scalar C microkernel, test and benchmark
	PiperOrigin-RevId: 466519212
2022-08-08	Remove unused FFT tables	Frank Barchard
	PiperOrigin-RevId: 466191721
2022-08-06	bfly4 template generate SAMPLE_TILE up to x4	Frank Barchard
	- Add fft size of 1024 benchmark
	- Sort filenames in BUILD files
	PiperOrigin-RevId: 465786067
2022-08-06	Fix indent for vsquareabs template	Frank Barchard
	- Apply sort to BUILD and CMakeLists.txt
	PiperOrigin-RevId: 465785238
2022-08-05	CS16 bfly4 microkernel scalar implementation	Frank Barchard
	- scalar microkernel, benchmark and unittest
	PiperOrigin-RevId: 465672474
2022-08-03	Rename table loglut to vlog	Frank Barchard
	- Change size from 130 to 129.
	- Format to 16 entries per row (was 12).
	- Rename to vlog to match the microkernel name (was log).
	- Remove lut from names: "table" and "lookup table" are redundant.
	PiperOrigin-RevId: 465120515
2022-08-02	U32 VLOG microkernel to compute natural log	Frank Barchard
	- scalar microkernel and benchmark
	- read uint32 values, write uint16 log of values
	PiperOrigin-RevId: 464941876
2022-08-01	Convert AArch32 and AArch64 F32 GEMM and IGEMM microkernels used for default CPU case and use them when JIT is enabled	Zhi An Ng
	This allows us to actually test and run JIT-generated code on emulators. Previously no generators were configured, so no JIT code was generated, and when creating convolution operators we would fall back on the assembly microkernels.
	- Add an assertion in convolution tests that specify use_jit(true) to ensure that code is generated, by checking the code cache size.
	- Guard tests that specify use_jit(true) behind XNN_ENABLE_JIT: we really want those tests to run with JIT code, and we can only have JIT code if it is enabled.
	- Update the script that converts assembly to JIT code.
	- Convert both GEMM and IGEMM ld128 microkernels to JIT. We don't convert the 1x8 microkernels because they are C code and we don't need them yet (no JIT tests exercise this path); they can be added later.
	- The JIT code does not specialize on max_mr, which is the same as the current behavior. We don't yet enable microkernels for mr=4 (though we have them); this can also be fixed later.
	- Disable JIT depthwise convolution tests. We don't have JIT dwconv microkernels yet, so those tests were incorrectly passing because they were using the assembly microkernels.
	PiperOrigin-RevId: 464637714
2022-08-01	Fix CMake build	Marat Dukhan
	PiperOrigin-RevId: 464549548
2022-07-29	S16 Front End Microkernel improve naming consistency	Frank Barchard
	- Change channels to batch
	- Add _ before lo and hi
	- Assert unsigned parameters != 0
	- Fix typo in copyright
	- Sort filenames in BUILD files
	PiperOrigin-RevId: 464129540
2022-07-28	CS16 squareabs microkernel for NEON	Frank Barchard
	- Rename channels to batch for C version
	PiperOrigin-RevId: 463978900
2022-07-28	CS16 squareabs microkernel	Frank Barchard
	- Scalar implementation
	PiperOrigin-RevId: 463921828
2022-07-27	Refactor declarations of microkernel parameters	Marat Dukhan
	- Extract declarations of microkernel parameters into microparams.h
	- Group and document microkernel parameters
	- Rename params-init accordingly
	- Make microkernels depend only on microparams.h and not params.h
	PiperOrigin-RevId: 463747649
2022-07-27	S16 vlshift microkernel	Frank Barchard
	- Shifts input to the left by specified amount.
	- Scalar and Neon implementations
	PiperOrigin-RevId: 463734229
2022-07-26	Rename absmax to maxabs	Frank Barchard
	- Remove duplicate channels variable.
	- Post increment pointers.
	- Sort BUILD file names.
	PiperOrigin-RevId: 463470697
2022-07-26	U32 SQRT evaluation stub for the Hashemian algorithm	Marat Dukhan
	PiperOrigin-RevId: 463282575
2022-07-25	s16 rabsmax C and NEON microkernels	Frank Barchard
	- Returns a single maximum absolute value in an array of int16_t
	PiperOrigin-RevId: 463225292
2022-07-25	U32 SQRT evaluation stub for multiplication-/division-free algorithm	Marat Dukhan
	PiperOrigin-RevId: 463085532
2022-07-24	Additional scalar SQRT U32 evaluation stubs	Marat Dukhan
	PiperOrigin-RevId: 462985602
2022-07-22	S16 WINDOW NEON microkernels	Frank Barchard
	- Multiplies input by weights, shifts and clamps
	PiperOrigin-RevId: 462683877