OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
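A minimal sketch of loading and running a model through OpenVINO's Python API; the IR file name, target device, and input shape below are placeholders rather than anything stated in the description.

    import numpy as np
    import openvino as ov

    core = ov.Core()
    model = core.read_model("model.xml")           # placeholder path to an OpenVINO IR model
    compiled = core.compile_model(model, "CPU")    # target device, e.g. "CPU" or "GPU"

    dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)  # input shape is an assumption
    result = compiled([dummy])[compiled.output(0)]
    print(result.shape)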
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)
Utilities to use the Hugging Face Hub API
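This describes the huggingface_hub client library; a tiny sketch of its file-download helper, with the repo id and filename chosen purely for illustration.

    from huggingface_hub import hf_hub_download

    # Fetch a single file from a public model repository on the Hub.
    path = hf_hub_download(repo_id="gpt2", filename="config.json")
    print(path)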
🔮 SuperDuperDB: Bring AI to your database! Build, deploy, and manage any AI application directly on your existing data infrastructure, without moving your data, including streaming inference, scalable model training, and vector search.
A high-throughput and memory-efficient inference and serving engine for LLMs
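This tagline matches the vLLM project; a minimal offline-generation sketch of its Python API, where the model id and sampling settings are placeholder choices.

    from vllm import LLM, SamplingParams

    llm = LLM(model="facebook/opt-125m")                 # any Hub model id; shown as an example
    params = SamplingParams(temperature=0.8, max_tokens=64)

    outputs = llm.generate(["The capital of France is"], params)
    print(outputs[0].outputs[0].text)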
📚 Jupyter notebook tutorials for OpenVINO™
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Swift native on-device speech recognition with Whisper for Apple Silicon
Large Language Model Text Generation Inference
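This is the description of Hugging Face's text-generation-inference server; once a server is running it can be queried over plain HTTP. A hedged sketch assuming a local instance listening on port 8080.

    import requests

    # /generate takes a prompt plus generation parameters and returns the completion.
    resp = requests.post(
        "http://localhost:8080/generate",
        json={"inputs": "What is deep learning?",
              "parameters": {"max_new_tokens": 32}},
    )
    print(resp.json()["generated_text"])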
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need: run inference with any open-source language, speech recognition, or multimodal model, whether in the cloud, on-premises, or even on your laptop.
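The "single line of code" refers to pointing an OpenAI-compatible client at a locally served endpoint instead of api.openai.com. A sketch assuming an Xinference server exposing OpenAI-compatible routes on port 9997; the URL, model name, and key are placeholders.

    from openai import OpenAI

    # Only the base_url changes compared with calling OpenAI directly;
    # a local server ignores the api_key value.
    client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")
    reply = client.chat.completions.create(
        model="my-local-llm",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply.choices[0].message.content)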
A scalable inference server for models optimized with OpenVINO™
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
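A hedged sketch of querying a model already deployed on a Triton server through its HTTP client library; the model name, tensor names, and shapes are assumptions for illustration.

    import numpy as np
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")

    data = np.random.rand(1, 16).astype(np.float32)
    inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
    inp.set_data_from_numpy(data)

    # Run inference against a deployed model and read back the named output tensor.
    result = client.infer(model_name="my_model", inputs=[inp])
    print(result.as_numpy("OUTPUT0"))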
Seamlessly integrate with top LLM APIs for speedy, robust, and scalable querying. Ideal for developers needing quick, reliable AI-powered responses.
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
Making large AI models cheaper, faster and more accessible
An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference via simple Docker commands.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Faster Whisper transcription with CTranslate2
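A minimal sketch of the faster-whisper Python API this project provides; the model size, device, compute type, and audio path are placeholders.

    from faster_whisper import WhisperModel

    model = WhisperModel("base", device="cpu", compute_type="int8")
    segments, info = model.transcribe("audio.wav")   # path to a local audio file

    # transcribe() yields timestamped segments lazily.
    for seg in segments:
        print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")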
A universal, scalable machine learning model deployment solution