path: root/src
Age         Commit message  Author
2020-12-05  Implement 6D parallelization with 1D and no tiling  Marat Dukhan
2020-12-05  Use __STDC_NO_ATOMICS__ to detect C11 compilers without stdatomic.h  Marat Dukhan
            Replace MSVC-specific check from #10
2020-12-05  Support pre-C11 GCC intrinsics for atomics  Marat Dukhan
2020-10-05  Fix MSVC build (#10)  peterjc123
            Fix MSVC build
2020-05-26  Use cpuinfo_get_current_uarch_index_with_default for parallelization with uarch index  Marat Dukhan
2020-05-26  3D/4D/5D parallelization functions with 1D or no tiling  Marat Dukhan
2020-05-16  Guard against generating ARM yield instruction for unsupporting processors  Marat Dukhan
2020-05-08  Reorder C11 atomics before MSVC x64 atomics  Marat Dukhan
            clang-cl, which supports both, should prefer C11 atomics
2020-05-08  Use platform-specific yield/pause instructions  Marat Dukhan
2020-05-07  MSVC-compatible FPU state functions  Marat Dukhan
2020-05-07  Thumb-1 compatible assembly for disable_fpu_denormals  Marat Dukhan
2020-05-04  Avoid including stdatomic.h in any WAsm builds  Marat Dukhan
2020-05-02  Fast path using atomic decrement instead of atomic compare-and-swap  Marat Dukhan
            50% higher throughput on x86 (disabled on other platforms)
2020-04-22  Reorder C11 atomics before MSVC atomics  Marat Dukhan
            clang-cl, which supports both, should prefer C11 atomics
2020-04-16  Recognize Cygwin as Windows  Marat Dukhan
2020-04-14  Use load-acquire + store-release on synchronization variables  Marat Dukhan
            Synchronization using relaxed atomics + fences instead of LA/SR violates C11/C++11 memory model and cause failures under thread sanitizer
2020-04-10  Support Windows on ARM/ARM64  Marat Dukhan
2020-04-10  Replace atomic fetch_sub with decrement_fetch primitive  Marat Dukhan
            Decrement-fetch is a closer match to the primitive used in implementation
2020-04-10  Add compiler barriers to MSVC atomics implementation  Marat Dukhan
2020-04-10  Fix race condition in Windows implementation  Marat Dukhan
            The command event for the next command must be reset before write-release of the new command, because as soon as the worker threads observe the new command, they may complete it and switch to waiting on the next command event
2020-04-10  Rewrite work spreading between threads  Marat Dukhan
            - Avoid word x word -> doubleword multiplication
            - Avoid doubleword / word -> word division
            - Replace remaining division with multiplication via FXdiv
            - Improve portability through removal of platform-dependent multiply_divide function
2020-04-10  Direct implementation pthreadpool_try_decrement_relaxed_size_t  Marat Dukhan
            Replace implementation of pthreadpool_try_decrement_relaxed_size_t on top of emulated pthreadpool_compare_exchange_weak_relaxed_size_t with a direct implementation using platform intrinsics
2020-04-10  Return static thread pool pointer in shim implementation  Marat Dukhan
            Makes pthreadpool tests pass in WebAssembly builds
2020-04-07  Minor fixes in Windows implementation  Marat Dukhan
2020-04-07  Windows implementation using Events  Marat Dukhan
2020-04-05  Fix erroneous narrowing in pthreadpool_fetch_sub_relaxed_size_t  Marat Dukhan
2020-04-05  Optimized pthreadpool_parallelize_* functions  Marat Dukhan
            Eliminate function call and division per each processed item in the multi-threaded case
2020-04-01  Implementation using Grand Central Dispatch  Marat Dukhan
2020-04-01  Refactor pthreadpool implementation  Marat Dukhan
            Split implementation into two types of components:
            - Components dependent on threading API
            - Portable components
2020-04-01  Remove unused per-thread wakeup_condvar  Marat Dukhan
2020-03-26  Microarchitecture-aware parallelization functions  Marat Dukhan
2020-03-26  Refactor multi-threaded case of parallelization functions  Marat Dukhan
            - Extract multi-threaded setup logic into a generalized pthreadpool_parallelize function
            - Call into pthreadpool_parallelize directly from tiled and 2+-dimensional functions
2020-03-23  Implement atomic_decrement with LL-SC on ARM/ARM64  Marat Dukhan
2020-03-23  Minor refactoring in pthreadpool_destroy  Marat Dukhan
2020-03-23  Fix race conditions in non-futex implementation  Marat Dukhan
2020-03-23  Futex-based WebAssembly+Threads implementation  Marat Dukhan
2020-03-23  Support WebAssembly+Threads build  Marat Dukhan
            - Abstract away atomic operations and data type from the source file
            - Polyfill atomic operations for Clang targeting WAsm+Threads
            - Set Emscripten link options for WebAssembly+Threads builds
2020-03-23  Remove redundant barriers  Marat Dukhan
2020-03-23  Simplify parallel task initialization  Marat Dukhan
2020-03-23  Avoid spinning thread-pool when task has the only item  Marat Dukhan
2020-03-05  Remove Native Client support  Marat Dukhan
2020-03-05  PTHREADPOOL_FLAG_YIELD_WORKERS flag to bypass spin-wait  Marat Dukhan
            Makes it possible to signal the last operation in a sequence of computations, so pthreadpool workers don't spin in vain.
2020-03-05  Minor cleanup  Marat Dukhan
2020-03-01  Build on Windows/mingw64 (#6)  mattn
            Support Windows/mingw64 build
2019-10-19  Switch to C11 atomics to synchronization  Marat Dukhan
2019-10-08  Make inline assembly compatible with old toolchain  Marat Dukhan
            Fix #4
2019-09-30  Fix typo in comment  Marat Dukhan
2019-09-30  Enable spin-wait in the main thread  Marat Dukhan
2019-09-30  New pthreadpool_parallelize_* API  Marat Dukhan
2019-09-30  Enable spin-wait in worker threads  Marat Dukhan
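
The 2020-04-10 and 2020-05-02 entries refer to a "try decrement" primitive built on compare-and-swap. For readers unfamiliar with the idea, the following is a minimal sketch in C11 atomics, assuming the primitive means "decrement a counter unless it is already zero". The name `sketch_try_decrement` is hypothetical, and the code is not taken from the pthreadpool sources.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical illustration, not pthreadpool's implementation:
 * decrement *value by one unless it is already zero.
 * Returns true if the decrement happened, false if the counter was zero.
 */
static bool sketch_try_decrement(atomic_size_t* value) {
	size_t current = atomic_load_explicit(value, memory_order_relaxed);
	while (current != 0) {
		/*
		 * The weak compare-exchange may fail spuriously or because another
		 * thread changed the counter. On failure it stores the value it
		 * observed into current, so the loop re-checks for zero and retries.
		 */
		if (atomic_compare_exchange_weak_explicit(
				value, &current, current - 1,
				memory_order_relaxed, memory_order_relaxed)) {
			return true;
		}
	}
	return false;
}
```

A caller would typically loop on `sketch_try_decrement(&remaining)` and process one work item per successful decrement. The 2020-05-02 entry reports replacing compare-and-swap with a plain atomic decrement on x86; this log does not show how that fast path works, so it is not reproduced here.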