Learning how to use LLMs by exploring fine-tuning and inference. Focus on technical aspects and practical applications.
Updated May 17, 2024 - Jupyter Notebook
Experiments in accelerating PyTorch training on GPU devices
A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation for Efficient Hardware Acceleration on Edge Devices
Deep learning solution for Cassava Leaf Disease Classification, a Kaggle Research Code Competition, using TensorFlow.
You Only Look Once: Unified, Real-Time Object Detection
PyTorch RNet implementation with Distributed and Mixed-Precision training support.
Hybrid-Precision Analysis on CG Solver (H.A.C.S). Merging single and double precision to generate a fast yet accurate CG solver
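The idea behind such hybrid-precision solvers can be illustrated with mixed-precision iterative refinement, a simpler cousin of a mixed-precision CG solver: do the expensive solve in single precision and correct the result with double-precision residuals. This is a generic numpy sketch, not code from the repository; `mixed_precision_solve` is a hypothetical name.

```python
import numpy as np

def mixed_precision_solve(A, b, iters=5):
    """Solve Ax = b using single precision for the solves
    and double precision for the residual corrections."""
    A32 = A.astype(np.float32)               # the "fast" precision
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                         # residual in float64 (the "accurate" precision)
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d                                # refine the solution
    return x
```

With a well-conditioned system, a few refinement steps recover near double-precision accuracy while the heavy linear algebra stays in float32.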
Fast SGEMM emulation on Tensor Cores
This repository contains notebooks showing how to perform mixed precision training in tf.keras 2.0
This is the open source version of HPL-MXP. The code performance has been verified on Frontier
Let's train CIFAR-10 in PyTorch with half precision!
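Half-precision training typically relies on loss scaling, because gradients smaller than float16's subnormal range (roughly 6e-8) underflow to zero. A minimal numpy illustration of why scaling helps; this is a generic sketch, not the repository's code, and `LOSS_SCALE` and the function name are made up for the example.

```python
import numpy as np

LOSS_SCALE = 1024.0  # illustrative constant; real trainers often adjust it dynamically

def scaled_grad_fp16(grad_fp64):
    """Simulate a tiny gradient passing through float16 storage,
    with and without loss scaling."""
    naive = np.float16(grad_fp64)                  # underflows to zero in fp16
    scaled = np.float16(grad_fp64 * LOSS_SCALE)    # survives in fp16
    recovered = np.float64(scaled) / LOSS_SCALE    # unscale back in fp64
    return naive, recovered
```

A gradient of 1e-8 rounds to exactly zero in float16, but the scaled copy round-trips with only a small relative error.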
Extremely simple and understandable GPT2 implementation with minor tweaks
An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3
PyCon SG 2019 Tutorial: Optimizing TensorFlow Performance
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
Accumulated Gradients for TensorFlow 2
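Gradient accumulation simulates a large batch on limited memory by summing gradients over several micro-batches and applying one averaged optimizer step. A framework-agnostic numpy sketch of the loop; the function and parameter names are illustrative, not the repository's API.

```python
import numpy as np

def train_with_accumulation(grad_fn, params, batches, accum_steps, lr=0.1):
    """Apply one SGD step per `accum_steps` micro-batches,
    using the mean of the accumulated gradients."""
    accum = np.zeros_like(params)
    for step, batch in enumerate(batches, start=1):
        accum += grad_fn(params, batch)            # accumulate, don't apply yet
        if step % accum_steps == 0:
            params = params - lr * (accum / accum_steps)
            accum = np.zeros_like(params)          # reset for the next virtual batch
    return params
```

With `accum_steps` micro-batches per step, the update matches what a single batch of that combined size would produce (for gradients that average linearly).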
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.
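The space savings of ultra-low-precision storage come from packing several quantized values per byte; two unsigned 4-bit values fit in one uint8, for example. A minimal numpy sketch of that packing scheme — generic, not BitPack's actual on-disk format.

```python
import numpy as np

def pack_int4(values):
    """Pack an even-length array of ints in [0, 15] two per byte:
    even-indexed values go in the high nibble, odd-indexed in the low."""
    v = np.asarray(values, dtype=np.uint8)
    return (v[0::2] << 4) | v[1::2]

def unpack_int4(packed):
    """Inverse of pack_int4: expand each byte back into two 4-bit values."""
    p = np.asarray(packed, dtype=np.uint8)
    out = np.empty(p.size * 2, dtype=np.uint8)
    out[0::2] = p >> 4
    out[1::2] = p & 0x0F
    return out
```

The round trip is lossless for values already quantized to 4 bits, halving storage relative to uint8.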
Code repository for the book "Deep Learning with Python, Second Edition" (Korean edition)