path: root/src/f32-igemm
Age | Commit message | Author
2022-08-03 | Return xnn_status instead of hard coded integers in JIT generators | Zhi An Ng
2022-08-02 | Change JIT generator to return uint8_t instead of xnn_status to remove depend... | Zhi An Ng
2022-08-01 | Convert AArch32 and AArch64 F32 GEMM and IGEMM microkernels used for default ... | Zhi An Ng
2022-07-27 | Refactor declarations of microkernel parameters | Marat Dukhan
2022-04-28 | Remove JIT 6x8 AArch64 A75 microkernel, upto6x8 is the equivalent | Zhi An Ng
2022-04-28 | 6x8 JIT GEMM/IGEMM microkernels do 4 prfm for Cortex A75 | Frank Barchard
2022-04-27 | Fix JIT 4x8 AArch64 A75 microkernels without prefetch | Zhi An Ng
2022-04-26 | 6x8 GEMM JIT fix for reload of clamp values in epilogue | Frank Barchard
2022-04-25 | Adjust 6x8 IGEMM prfm offsets for JIT cortex_a75 microkernel | Frank Barchard
2022-04-25 | Adjust 6x8 IGEMM prfm offsets for cortex_a75 microkernel | Frank Barchard
2022-04-22 | Add JIT F32 GEMM and IGEMM microkernels for AArch64 A75 that support up to ma... | Zhi An Ng
2022-04-21 | Add lint markers to track assembly and the JIT microkernels that are generate... | Zhi An Ng
2022-04-18 | Pass max_mr to JIT generated microkernels | Zhi An Ng
2022-04-14 | Specialize on min/max in F32 GEMM and IGEMM AArch64 A75 microkernels | Zhi An Ng
2022-04-13 | Generate F32 IGEMM 4x8 microkernels for A75 from assembly | Zhi An Ng
2022-04-07 | Remove WAsm DWCONV/GEMM/IGEMM microkernels with LINEAR activations | Marat Dukhan
2022-04-05 | Generate a full set of WAsm SIMD IGEMM microkernels with LINEAR/RELU activations | Marat Dukhan
2022-04-01 | assert(nc_mod_nr < 8) for JIT microkernel generators. | Frank Barchard
2022-04-01 | assert(ks != 0) for all JIT IGEMM microkernel generators. | Frank Barchard
2022-03-31 | Relaxed SIMD microkernels with Relaxed FMA | Marat Dukhan
2022-03-31 | assert(kc % sizeof(float) == 0) for F32 JIT microkernel generators. | Frank Barchard
2022-03-31 | assert(params != nullptr) for all JIT microkernel generators. | Frank Barchard
2022-03-31 | Replace MINMAX parameter with ARCH in WAsm SIMD microkernel templates | Marat Dukhan
2022-03-31 | Relaxed SIMD versions of F32 GEMM/IGEMM/DWCONV microkernels | Marat Dukhan
2022-03-30 | Specify MIN/MAX instructions in WAsm SIMD microkernels | Marat Dukhan
2022-03-21 | FP32 tuned prfm offsets for 1x8 GEMM/IGEMM microkernels for Cortex A53 | Frank Barchard
2022-03-21 | FP32 prfm version of 1x8 GEMM/IGEMM microkernels for Cortex A53 | Frank Barchard
2022-03-16 | FP32 4x2 IGEMM assembly microkernel for Cortex A75 | Frank Barchard
2022-03-14 | 6x2 GEMM/IGEMM microkernels | Frank Barchard
2022-03-14 | Fix typo in A75 microkernel comments: 1nd -> 1st | Frank Barchard
2022-03-14 | FP32 4x2 LD64 IGEMM assembly microkernel | Frank Barchard
2022-03-10 | FP32 6x8 GEMM/IGEMM microkernel epilogue sub sooner | Frank Barchard
2022-03-09 | NEON Register use for FP32 and FP16 GEMM/IGEMM comments updated | Frank Barchard
2022-03-09 | NEON F16/F32 GEMM/IGEMM renumber output registers sequentially | Frank Barchard
2022-03-07 | Comment change - 1 float instead of 1 floats. | Frank Barchard
2022-02-16 | Apply formatting to assembly | Frank Barchard
2022-02-15 | Implement nop, hlt and code alignment, align at the end of each generated cod... | Zhi An Ng
2022-02-08 | FP32 GEMM/IGEMM AArch64 for Cortex A55r0 | Frank Barchard
2022-02-07 | FP32 GEMM/IGEMM AArch32 for Cortex A55r0 | Frank Barchard
2022-02-04 | Add missing <limits> header to JIT generator files | Zhi An Ng
2022-02-04 | Change JIT generators to take nc % nr instead of nc directly | Zhi An Ng
2022-02-03 | Port aarch64 F32 IGEMM 1x8 A75 microkernel to JIT, add tests, benchmarks, ena... | Zhi An Ng
2022-02-03 | Define constants for +/- infinity to check for clamping in JIT generators | Zhi An Ng
2022-02-03 | Specialize F32 IGEMM for a75 on mix/max | Zhi An Ng
2022-02-03 | Convert F32 IGEMM for A75 to JIT, add tests | Zhi An Ng
2022-02-02 | Make void* params argument of JIT generators const | Zhi An Ng
2022-01-31 | Pad K to a multiple of SR in GEMM/IGEMM microkernels | Marat Dukhan
2022-01-27 | Remove wb from JIT aarch32 instructions, use mem operand and ++ instead | Zhi An Ng
2022-01-25 | Remove 3 blank lines after last jit assembly instruction before end of function | Frank Barchard
2022-01-25 | Avoid importing the entire xnnpack namespace in aarch32 assembler | Zhi An Ng