OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
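A minimal sketch of loading and running a model through OpenVINO's Python API; the IR file name, target device, and input shape below are placeholders rather than anything stated in the description.

    import numpy as np
    import openvino as ov

    core = ov.Core()
    model = core.read_model("model.xml")           # placeholder path to an OpenVINO IR model
    compiled = core.compile_model(model, "CPU")    # target device, e.g. "CPU" or "GPU"

    dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)  # input shape is an assumption
    result = compiled([dummy])[compiled.output(0)]
    print(result.shape)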
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)
Utilities to use the Hugging Face Hub API
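This describes the huggingface_hub client library; a tiny sketch of its file-download helper, with the repo id and filename chosen purely for illustration.

    from huggingface_hub import hf_hub_download

    # Fetch a single file from a public model repository on the Hub.
    path = hf_hub_download(repo_id="gpt2", filename="config.json")
    print(path)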
🔮 SuperDuperDB: Bring AI to your database! Build, deploy, and manage any AI application directly on your existing data infrastructure, without moving your data, including streaming inference, scalable model training, and vector search.
A high-throughput and memory-efficient inference and serving engine for LLMs
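This tagline matches the vLLM project; a minimal offline-generation sketch of its Python API, where the model id and sampling settings are placeholder choices.

    from vllm import LLM, SamplingParams

    llm = LLM(model="facebook/opt-125m")                 # any Hub model id; shown as an example
    params = SamplingParams(temperature=0.8, max_tokens=64)

    outputs = llm.generate(["The capital of France is"], params)
    print(outputs[0].outputs[0].text)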
📚 Jupyter notebook tutorials for OpenVINO™
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Swift native on-device speech recognition with Whisper for Apple Silicon
Large Language Model Text Generation Inference
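This is the description of Hugging Face's text-generation-inference server; once a server is running it can be queried over plain HTTP. A hedged sketch assuming a local instance listening on port 8080.

    import requests

    # /generate takes a prompt plus generation parameters and returns the completion.
    resp = requests.post(
        "http://localhost:8080/generate",
        json={"inputs": "What is deep learning?",
              "parameters": {"max_new_tokens": 32}},
    )
    print(resp.json()["generated_text"])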
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need: run inference with any open-source language, speech recognition, or multimodal model, whether in the cloud, on-premises, or even on your laptop.
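The "single line of code" refers to pointing an OpenAI-compatible client at a locally served endpoint instead of api.openai.com. A sketch assuming an Xinference server exposing OpenAI-compatible routes on port 9997; the URL, model name, and key are placeholders.

    from openai import OpenAI

    # Only the base_url changes compared with calling OpenAI directly;
    # a local server ignores the api_key value.
    client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")
    reply = client.chat.completions.create(
        model="my-local-llm",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply.choices[0].message.content)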
A scalable inference server for models optimized with OpenVINO™
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
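A hedged sketch of querying a model already deployed on a Triton server through its HTTP client library; the model name, tensor names, and shapes are assumptions for illustration.

    import numpy as np
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")

    data = np.random.rand(1, 16).astype(np.float32)
    inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
    inp.set_data_from_numpy(data)

    # Run inference against a deployed model and read back the named output tensor.
    result = client.infer(model_name="my_model", inputs=[inp])
    print(result.as_numpy("OUTPUT0"))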
Seamlessly integrate with top LLM APIs for speedy, robust, and scalable querying. Ideal for developers needing quick, reliable AI-powered responses.
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
Making large AI models cheaper, faster and more accessible
An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference via simple Docker commands.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Faster Whisper transcription with CTranslate2
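A minimal sketch of the faster-whisper Python API this project provides; the model size, device, compute type, and audio path are placeholders.

    from faster_whisper import WhisperModel

    model = WhisperModel("base", device="cpu", compute_type="int8")
    segments, info = model.transcribe("audio.wav")   # path to a local audio file

    # transcribe() yields timestamped segments lazily.
    for seg in segments:
        print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")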
A universal, scalable machine learning model deployment solution