path: root/CMakeLists.txt
Age | Commit message | Author
2022-09-01 | Harmonize filenames of SAMPLES=1 BFLY4 microkernels | Marat Dukhan
PiperOrigin-RevId: 471584691
2022-09-01 | Harmonize naming of specialized BFLY4 microkernels | Marat Dukhan
- Rename m1 to samples1
- Separate samples1 from microkernel name by underscore
PiperOrigin-RevId: 471574436
2022-08-30 | Add primary_tile as an argument to DWCONV packing functions | Zhi An Ng
Add specific unit tests for packing routines. Microkernel tests are updated to simply pass primary_tile as the kernel height, so this is a no-op change; tests for kernel sizes smaller than the primary tile will be added in a follow-up.
PiperOrigin-RevId: 471142575
2022-08-29 | Filterbank Accumulate in ARM assembly | Frank Barchard
PiperOrigin-RevId: 470890302
2022-08-26 | bfly4m1 NEON microkernel | Frank Barchard
PiperOrigin-RevId: 470331404
2022-08-24 | FILTERBANK-ACCUMULATE Neon microkernels with unweights accumulator set to 0 | Frank Barchard
PiperOrigin-RevId: 469816851
2022-08-24 | Add test for FP16 rewrite when an external output happens to be an input to another node | Zhi An Ng
This can happen when a subgraph is split because an operation is not supported by XNNPACK. The FP16 rewriting code does not account for this: when counting the number of external inputs, we assert that a node's input value which has a fp32_id is an external input. This is not always true, as it could be an external output.
PiperOrigin-RevId: 469766360
2022-08-23 | Fix FILTERBANK-ACCUMULATE microkernels | Marat Dukhan
Make FILTERBANK-ACCUMULATE microkernels match TFLM audio_frontend semantics
PiperOrigin-RevId: 469635373
2022-08-23 | Space to Depth operator | Alan Kelly
PiperOrigin-RevId: 469514081
2022-08-23 | Aarch32 filterbank-accumulate assembly | Frank Barchard
PiperOrigin-RevId: 469506037
2022-08-23 | Space to Depth operator | XNNPACK Team
PiperOrigin-RevId: 469480735
2022-08-23 | Space to Depth operator | Alan Kelly
PiperOrigin-RevId: 469456395
2022-08-22 | filterbank-accumulate use uint8 for weight count table | Frank Barchard
- Remove input and weight offsets which are sequential
- Reduce weight count table from uint16 to uint8. Maximum value is 13.
PiperOrigin-RevId: 469371512
2022-08-22 | bfly4m1 remove multiplies by 1 and 0. | Frank Barchard
PiperOrigin-RevId: 469254069
2022-08-22 | U64->U32 VSQRTSHIFT microkernel | Marat Dukhan
PiperOrigin-RevId: 469114060
2022-08-21 | U64 SQRT evaluation stubs | Marat Dukhan
PiperOrigin-RevId: 469079271
2022-08-21 | Evaluation stubs for U32 SQRT using F32 SQRT | Marat Dukhan
PiperOrigin-RevId: 469077131
2022-08-20 | Specialized M1 variant of bfly4 scalar | Frank Barchard
PiperOrigin-RevId: 468906177
2022-08-19 | Add inplace tests for fftr, filterbank-subtract, vlog, vlshift and window microkernels | Frank Barchard
- sort applied to build files and headers.
PiperOrigin-RevId: 468770448
2022-08-18 | Specialized WINDOW microkernels for shift by constants | Frank Barchard
- Shift by 12 replaces vshlq + vqmovn with a vqshrn_n.
- Shift by 15 replaces vmull + vshlq + vqmovn with vqdmulhq
PiperOrigin-RevId: 468510420
2022-08-17 | u32 filterbank-subtract scalar microkernel | Frank Barchard
- Scalar microkernel, test and benchmark
- Input and output are uint32
PiperOrigin-RevId: 468313409
2022-08-17 | Use mulext for filterbank-accumulate to multiply 2 uint32 values to produce a uint64 value. | Frank Barchard
PiperOrigin-RevId: 468301603
2022-08-17 | BF16 GEMM microkernels for NEON & NEON-BF16 | Marat Dukhan
PiperOrigin-RevId: 468263623
2022-08-17 | Fix CMake build | Marat Dukhan
normalization target needs to include "src" directory as it depends on xnnpack/math.h
PiperOrigin-RevId: 468229416
2022-08-17 | Depth to space nhwc uses transpose. | Alan Kelly
PiperOrigin-RevId: 468220740
2022-08-17 | Depth to space nchw2nhwc uses transpose | Alan Kelly
PiperOrigin-RevId: 468186161
2022-08-16 | Enable ARM SIMD32 microkernels for pre-NEON AArch32 processors | Marat Dukhan
PiperOrigin-RevId: 468105144
2022-08-16 | Rename ARMV6SIMD to ARMSIMD32 | Marat Dukhan
ARMv6 SIMD is deceptive because M-profile ARM cores don't support these instructions until ARMv7
PiperOrigin-RevId: 468071398
2022-08-16 | Move post-operation structs into separate file and lib | Zhi An Ng
This ensures that our microkernel tests (to be added) can depend on post-operation without depending on :operators.
PiperOrigin-RevId: 468002526
2022-08-13 | u32 filterbank-accumulate NEON and scalar microkernels | Frank Barchard
- input is uint32 and output is uint64
- multiply inputs by uint16 weights
PiperOrigin-RevId: 467435458
2022-08-09 | CS16 fftr scalar microkernel | Frank Barchard
- Scalar C microkernel, test and benchmark
PiperOrigin-RevId: 466519212
2022-08-08 | Remove unused FFT tables | Frank Barchard
PiperOrigin-RevId: 466191721
2022-08-06 | bfly4 template generate SAMPLE_TILE up to x4 | Frank Barchard
- Add fft size of 1024 benchmark
- Sort filenames in BUILD files
PiperOrigin-RevId: 465786067
2022-08-06 | Fix indent for vsquareabs template | Frank Barchard
- Apply sort to BUILD and CMakeLists.txt
PiperOrigin-RevId: 465785238
2022-08-05 | CS16 bfly4 microkernel scalar implementation | Frank Barchard
- scalar microkernel, benchmark and unittest
PiperOrigin-RevId: 465672474
2022-08-03 | Rename table loglut to vlog | Frank Barchard
- Change size from 130 to 129.
- Format to 16 entries per row (was 12).
- Rename to vlog to match the microkernel name (was log).
- Remove lut from names; "table" and "look up table" are redundant.
PiperOrigin-RevId: 465120515
2022-08-02 | U32 VLOG microkernel to compute natural log | Frank Barchard
- scalar microkernel and benchmark
- read uint32 values, write uint16 log of values
PiperOrigin-RevId: 464941876
2022-08-01 | Convert AArch32 and AArch64 F32 GEMM and IGEMM microkernels used for default CPU case and use them when JIT is enabled | Zhi An Ng
This allows us to actually test and run JIT-generated code on emulators. Previously no generators were configured, so no JIT code was generated, and when creating convolution operators we would fall back on the assembly microkernels.
- Add an assertion in convolution tests that specify use_jit(true) to ensure that code is generated, by checking the code cache size
- Guard tests that specify use_jit(true) behind XNN_ENABLE_JIT: we want those tests to run with JIT code, and JIT code is only available if it is enabled
- Update the script that converts assembly to JIT code
- Convert both GEMM and IGEMM ld128 microkernels to JIT; the 1x8 microkernels are not converted because they are C code and are not needed yet (no JIT tests exercise this path), so this can be added later
- The JIT code does not specialize on max_mr, which matches the current behavior; microkernels for mr=4 are not yet enabled (though they exist), so this can also be fixed later
- Disable JIT depthwise convolution tests: there are no JIT dwconv microkernels yet, so those tests were incorrectly passing because they used the assembly microkernels
PiperOrigin-RevId: 464637714
2022-08-01 | Fix CMake build | Marat Dukhan
PiperOrigin-RevId: 464549548
2022-07-29 | S16 Front End Microkernel improve naming consistency | Frank Barchard
- Change channels to batch
- Add _ before lo and hi
- Assert unsigned parameters != 0
- Fix typo in copyright
- Sort filenames in BUILD files
PiperOrigin-RevId: 464129540
2022-07-28 | CS16 squareabs microkernel for NEON | Frank Barchard
- Rename channels to batch for C version
PiperOrigin-RevId: 463978900
2022-07-28 | CS16 squareabs microkernel | Frank Barchard
- Scalar implementation
PiperOrigin-RevId: 463921828
2022-07-27 | Refactor declarations of microkernel parameters | Marat Dukhan
- Extract declarations of microkernel parameters into microparams.h
- Group and document microkernel parameters
- Rename params-init accordingly
- Make microkernels depend only on microparams.h and not params.h
PiperOrigin-RevId: 463747649
2022-07-27 | S16 vlshift microkernel | Frank Barchard
- Shifts input to the left by specified amount.
- Scalar and Neon implementations
PiperOrigin-RevId: 463734229
2022-07-26 | Rename absmax to maxabs | Frank Barchard
- Remove duplicate channels variable.
- Post increment pointers.
- Sort BUILD file names.
PiperOrigin-RevId: 463470697
2022-07-26 | U32 SQRT evaluation stub for the Hashemian algorithm | Marat Dukhan
PiperOrigin-RevId: 463282575
2022-07-25 | s16 rabsmax C and NEON microkernels | Frank Barchard
- Returns a single maximum absolute value in an array of int16_t
PiperOrigin-RevId: 463225292
2022-07-25 | U32 SQRT evaluation stub for multiplication-/division-free algorithm | Marat Dukhan
PiperOrigin-RevId: 463085532
2022-07-24 | Additional scalar SQRT U32 evaluation stubs | Marat Dukhan
PiperOrigin-RevId: 462985602
2022-07-22 | S16 WINDOW NEON microkernels | Frank Barchard
- Multiplies input by weights, shifts and clamps
PiperOrigin-RevId: 462683877