Quantizing LLMs using GPTQ
Updated Dec 31, 2023 - Jupyter Notebook
Personal GitHub repository for stashing resources on Large Language Models (LLMs), including Jupyter Notebooks on open-source LLMs, use cases with LangChain, and R&D paper reviews.
This project will develop a NEPSE chatbot using an open-source LLM, incorporating sentence transformers, a vector database, and reranking.
Run GGUF LLM models in the latest version of TextGen-webui
A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence). A REST API for an AI companion, intended for building more complex systems
Conversational AI model for open-domain dialogue
This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for risk(s).
Private self-improvement coaching with open-source LLMs
ChatSakura: Open-source multilingual conversational model.
A guide to using GPTQ models with LangChain
SOTA Weight-only Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
Run any Large Language Model behind a unified API
🦖 X—LLM: Cutting Edge & Easy LLM Finetuning
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime