Slower image encode the lower the quantization #85
I'm having the same issue; it takes a very long time to encode images. I'm getting an average of 830 ms for my q5_0 model on a rather old (2019, i7) Mac. Any information regarding this would be much appreciated :)
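When comparing encode times across quantized model files, it helps to average over several runs and discard the first call, which often pays one-time initialization costs. Here is a minimal timing sketch; `fake_encode` is a stand-in placeholder, not part of any real bindings' API:

```python
import time

def benchmark(encode, image, warmup=1, runs=5):
    """Return mean time per call in milliseconds over `runs` timed calls."""
    for _ in range(warmup):
        encode(image)                      # untimed warm-up iteration
    start = time.perf_counter()
    for _ in range(runs):
        encode(image)
    return (time.perf_counter() - start) * 1000.0 / runs

# Stand-in "encoder" so the harness itself is runnable; swap in a real
# image-encode call from whatever bindings you use.
fake_encode = lambda img: sum(img)
ms = benchmark(fake_encode, list(range(1000)))
print(f"mean encode time: {ms:.3f} ms")
```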
I'm running the clip-vit-base-patch32_ggml model on my Intel Mac, and it looks like the lower the quantization, the slower image encoding is. I tried the main `clip-vit-base-patch32_ggml-model-f32.gguf` model and the `q8_0` and `q4_0` variants. These are the encode times I get for a batch of 4 images:

- `f32`: 272.21 ms
- `q8_0`: 333.96 ms
- `q5_0`: 354.86 ms
- `q4_0`: 539.32 ms

`f16` looks like an outlier, taking the most time. But going from `f32` to `q8_0` to `q5_0` to `q4_0`, the times get steadily worse. It's better with the `_1` variants, though. Does anyone know whether this is expected, or is there something wrong?
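One plausible reason lower-bit formats can be slower on some CPUs is that each multiply-add in a quantized kernel also pays to unpack and rescale the weights, and without hand-tuned SIMD kernels for the target architecture that per-element overhead dominates the memory-bandwidth savings. The scalar sketch below illustrates the extra unpack work with a q4_0-style block; the exact layout and scale choice here are assumptions for illustration, not a copy of ggml's implementation:

```python
QK = 32  # weights per block in this q4_0-style sketch

def quantize_q4_0(weights):
    """Pack 32 floats into (scale, 16 bytes of packed 4-bit quants)."""
    assert len(weights) == QK
    amax = max(abs(w) for w in weights) or 1.0
    d = amax / 7.0                                   # scale into [-7, 7]
    quants = [max(0, min(15, round(w / d) + 8)) for w in weights]
    packed = bytes(quants[i] | (quants[i + QK // 2] << 4)
                   for i in range(QK // 2))
    return d, packed

def dequantize_q4_0(d, packed):
    """Unpack nibbles and rescale -- extra work a plain f32 kernel skips."""
    lo = [(b & 0x0F) - 8 for b in packed]            # first 16 weights
    hi = [(b >> 4) - 8 for b in packed]              # last 16 weights
    return [q * d for q in lo + hi]

weights = [(-1) ** i * (i / 31.0) for i in range(QK)]
d, packed = quantize_q4_0(weights)
restored = dequantize_q4_0(d, packed)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max round-trip error: {max_err:.4f}")
```

Every dequantized weight costs a mask/shift, a subtraction, and a multiply before it can even enter the dot product, which is why optimized per-format kernels matter so much for quantized inference speed.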