Testing build b2710 with a Llama 3 8B Instruct conversion on Linux Mint went very promisingly with these settings:
#----------------- Test b2710 ------------------------------
./main -t 4 -m ./models/7B/Meta_3_8B_chat_f16_V4_q4_0.gguf --log-enable --color -c 8192 --temp 0.7 --mirostat 2 --repeat-penalty 1.1 -n -1 -i --in-prefix "" --in-suffix "" -r "<|eot_id|>" -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, respectful and honest assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWho are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
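For reference, the string passed via -p above follows the Llama 3 instruct prompt template. A small helper to build such a prompt (a sketch; the function name and arguments are mine, not part of llama.cpp):

```shell
#!/bin/sh
# Build a Llama 3 instruct prompt from a system message and a user message,
# using the same special tokens as the -p string above. Emits real newlines
# where the command line uses literal "\n".
llama3_prompt() {
  sys="$1"
  user="$2"
  printf '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n%s<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n%s<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n' "$sys" "$user"
}

# Example: reproduce the prompt used in the command above.
llama3_prompt "You are a helpful, respectful and honest assistant." "Who are you?"
```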
Based on the original Meta Llama 3 weights.
#-------------------- Conversion to f16 with ----------------
python3 convert.py ./models/Meta-Llama-3-8B-Instruct --outtype f16 --vocab-type bpe
#------------------- Quantize with --------------------------
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q4_0.gguf Q4_0
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q5_0.gguf Q5_0
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q8_0.gguf Q8_0
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q4_k_m.gguf Q4_K_M
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q4_k_s.gguf Q4_K_S
./quantize ./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf ./models/Meta-Llama-3-8B-Instruct/Meta_3_8B_chat_f16_V4_q5_k_m.gguf Q5_K_M
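The six quantize calls above differ only in the quantization type, so they can be condensed into a loop (a sketch assuming the same source file and naming pattern; shown with echo so the commands are printed rather than run):

```shell
#!/bin/sh
# Generate all the quantize commands above from one list of types.
SRC=./models/Meta-Llama-3-8B-Instruct/ggml-model-f16.gguf
DST_DIR=./models/Meta-Llama-3-8B-Instruct

for q in Q4_0 Q5_0 Q8_0 Q4_K_M Q4_K_S Q5_K_M; do
  # Lowercase the type for the output filename, matching the names above.
  lc=$(printf '%s' "$q" | tr '[:upper:]' '[:lower:]')
  echo "./quantize $SRC $DST_DIR/Meta_3_8B_chat_f16_V4_${lc}.gguf $q"
done
```

Drop the echo to actually run the conversions.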
Thank you all for the remarkable work.