C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
Updated May 28, 2024 - C++
Performance-portable, length-agnostic SIMD with runtime dispatch
(REOS) Radar and Electro-Optical Simulation Framework written in C++.
Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐
Collection of incredibly fast hashmaps
DSP library for signal processing
A few C++ classes for very fast parsing of objects from JSON and serializing them to JSON.
Fast decoder for VByte-compressed integers
C++ header-only template library designed to make it easier to write high-performance SIMD (SSE, AVX, NEON) and multi-threaded code.
Parallel Programming course projects demonstrating various parallelism techniques with SIMD SSE3, OMP, and POSIX threads, including Intel Parallel Studio for analysis and parallelization.
The Assembly and C Exercise aims to practice assembly language programming using SSE/AVX instructions and C SSE/AVX intrinsics.
Introduction to cache coherence: false sharing, MESI protocol and vectorization
The Vector Optimized Library of Kernels
C & Assembly optimized version of the Stochastic Gradient Descent x SoftSVM x Polynomial Kernel Method algorithm
High-speed parser with vector instructions
High performance algorithms in C#: SIMD/SSE, multi-core and faster
SIMD Vector Classes for C++
NEON/AVX SIMD library, vector-size agnostic