gguf : use Qn_K for k-quants instead of KQn #837

Merged: 1 commit, May 24, 2024

compilade
Contributor

#822 (by @mofosyne) introduced a naming convention for GGUF model files, but the way it names k-quants doesn't follow established practice: everywhere else k-quants are named, they use Qn_K, where n is the number of bits per weight excluding the scales.

`rg -i 'KQ\d'` doesn't return anything related to quants except for this recently-added section, while `rg -i 'Q\d_K'` returns a lot of things related to k-quants when run in the `ggml` and `llama.cpp` repos.

So this renames KQ2 to Q2_K, for consistency. This should avoid unnecessary confusion.
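
For anyone who already has files or scripts using the short-lived spelling, the rename can be sketched as a small text substitution. This is only an illustration: `normalize_k_quant` and the file name in the usage note are hypothetical and are not part of ggml or llama.cpp, and the sketch assumes the legacy label is exactly `KQ` followed by a single digit.

```python
import re

# Hypothetical helper, not part of ggml or llama.cpp.
# Matches the legacy "KQ<n>" label only when it is not embedded in a
# longer alphanumeric token, so unrelated identifiers are left alone.
_LEGACY_K_QUANT = re.compile(r"(?<![A-Za-z0-9])KQ(\d)(?![A-Za-z0-9])")


def normalize_k_quant(name: str) -> str:
    """Rewrite legacy KQ<n> labels to the established Q<n>_K spelling."""
    return _LEGACY_K_QUANT.sub(r"Q\1_K", name)
```

With this sketch, `normalize_k_quant("example-7B-KQ2.gguf")` gives `example-7B-Q2_K.gguf`, and a name that already uses `Q4_K` is returned unchanged.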

(note that the recently-added wiki page about "tensor encoding schemes" will need to be updated too, since it is the only other place I found to also use this KQ<X> naming scheme)

@ggerganov merged commit 8d6b703 into ggerganov:master on May 24, 2024
4 checks passed