Testing build b2710 with a Llama 3 8B Instruct conversion on Linux Mint went very promisingly with these settings:
#----------------- Test b2710 ------------------------------
./main -t 4 -m ./models/7B/Meta_3_8B_chat_f16_V4_q4_0.gguf --log-enable --color -c 8192 --temp 0.7 --mirostat 2 --repeat-penalty 1.1 -n -1 -i --in-prefix "" --in-suffix "" -r "<|eot_id|>" -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, respectful and honest assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWho are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
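For reference, the string passed via -p above follows the Llama 3 instruct prompt template. A small helper to build such a prompt (a sketch; the function name and arguments are mine, not part of llama.cpp):

```shell
#!/bin/sh
# Build a Llama 3 instruct prompt from a system message and a user message,
# using the same special tokens as the -p string above. Emits real newlines
# where the command line uses literal "\n".
llama3_prompt() {
  sys="$1"
  user="$2"
  printf '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n%s<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n%s<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n' "$sys" "$user"
}

# Example: reproduce the prompt used in the command above.
llama3_prompt "You are a helpful, respectful and honest assistant." "Who are you?"
```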
Based on the original Meta Llama 3 weights.
#-------------------- Conversion to f16 with ----------------
python3 convert.py ./models/Meta-Llama-3-8B-Instruct --outtype f16 --vocab-type bpe
#------------------- Quantize with --------------------------
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q4_0.gguf Q4_0
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q5_0.gguf Q5_0
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q8_0.gguf Q8_0
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q4_k_m.gguf Q4_K_M
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q4_k_s.gguf Q4_K_S
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q5_k_m.gguf Q5_K_M
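The six quantize calls above differ only in the quantization type, so they can be condensed into a loop (a sketch assuming the same source file and naming pattern; shown with echo so the commands are printed rather than run):

```shell
#!/bin/sh
# Generate all the quantize commands above from one list of types.
SRC=./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf
DST_DIR=./models/Meta-Llama-3-8B-Instruct

for q in Q4_0 Q5_0 Q8_0 Q4_K_M Q4_K_S Q5_K_M; do
  # Lowercase the type for the output filename, matching the names above.
  lc=$(printf '%s' "$q" | tr '[:upper:]' '[:lower:]')
  echo "./quantize $SRC $DST_DIR/Meta_3_8B_chat_f16_V4_${lc}.gguf $q"
done
```

Drop the echo to actually run the conversions.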
Thank you all for the remarkable work.