GGUF quantization meta-data format #797
Comments
Hi, you'd be better off having a look at llama.cpp:

@mobicham here is the GGUF spec for you to use: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
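For context, the spec linked above describes how quantized tensor data is laid out in blocks. A minimal sketch of one of the simplest types, Q4_0 (assuming the documented layout of 32 weights per block: one fp16 scale followed by 16 bytes of packed 4-bit values; the rounding here is a simplification of ggml's reference quantizer, not a byte-exact reimplementation):

```python
import struct

QK4_0 = 32  # weights per Q4_0 block, per the spec

def quantize_block_q4_0(xs):
    """Quantize 32 floats into a Q4_0-style block:
    2-byte fp16 scale + 16 bytes of packed 4-bit values (18 bytes total)."""
    assert len(xs) == QK4_0
    # the scale is derived from the value with the largest magnitude
    m = max(xs, key=abs)
    d = m / -8 if m != 0 else 0.0
    inv = 1.0 / d if d else 0.0
    # map each weight to an unsigned 4-bit value in [0, 15]
    qs = [min(15, max(0, int(x * inv + 8.5))) for x in xs]
    # pack two 4-bit values per byte; elements i and i+16 share a byte
    packed = bytes(qs[i] | (qs[i + 16] << 4) for i in range(16))
    return struct.pack('<e', d) + packed

def dequantize_block_q4_0(blk):
    """Recover 32 floats from an 18-byte Q4_0-style block."""
    d, = struct.unpack_from('<e', blk, 0)
    out = [0.0] * QK4_0
    for i in range(16):
        b = blk[2 + i]
        out[i] = ((b & 0x0F) - 8) * d       # low nibble
        out[i + 16] = ((b >> 4) - 8) * d    # high nibble
    return out
```

A quick round trip (`dequantize_block_q4_0(quantize_block_q4_0(xs))`) shows the lossy nature of the format: only values on the 16-level grid defined by the block's scale survive exactly.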
Hello!
Are there any resources that explain how the quantized parameters are structured in a GGUF file?
We are interested in porting HQQ-quantized models to the GGUF format, but to do that we basically need to know exactly how the quantized data is stored.
Thanks!
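As a starting point for anyone writing such an exporter, the spec linked in the comments defines a small fixed-size file header before the metadata key/value pairs and tensor infos. A sketch of parsing just that header (assuming the v3 layout: 4-byte magic `GGUF`, uint32 version, uint64 tensor count, uint64 metadata key/value count, all little-endian):

```python
import struct

def read_gguf_header(buf):
    """Parse the fixed-size GGUF header at the start of `buf`:
    magic 'GGUF', uint32 version, uint64 tensor count,
    uint64 metadata key/value count (all little-endian)."""
    magic, version, n_tensors, n_kv = struct.unpack_from('<4sIQQ', buf, 0)
    if magic != b'GGUF':
        raise ValueError('not a GGUF file')
    return {'version': version, 'tensor_count': n_tensors, 'kv_count': n_kv}

# Usage: build a dummy header and parse it back
hdr = struct.pack('<4sIQQ', b'GGUF', 3, 2, 5)
print(read_gguf_header(hdr))
```

The metadata key/value pairs and per-tensor info records (name, dimensions, quantization type, data offset) follow this header; their variable-length encodings are detailed in the spec.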