Where can I learn about the different models, types of models and fine-tuning? #423
dportabella started this conversation in General
Replies: 2 comments · 1 reply
-
I have an NVIDIA GeForce RTX 4070 with 12 GB. The best models I can use are the 7B GPTQ 8-bit models, right?
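As a rough sanity check on what fits in 12 GB, you can estimate the weight footprint from the parameter count and bit width. The 2 GB flat overhead below is my assumption (KV cache, activations, CUDA context), not a measured number:

```python
def vram_estimate_gb(n_params_b: float, bits: int, overhead_gb: float = 2.0) -> float:
    """Rough VRAM needed: weights at the given bit width plus a flat
    overhead for KV cache, activations, and CUDA context.
    The 2 GB overhead is an assumption, not a measurement."""
    weights_gb = n_params_b * bits / 8  # billions of params * bytes per param
    return weights_gb + overhead_gb

# A 7B model at 8-bit needs roughly 7 + 2 = 9 GB -> fits in 12 GB.
print(vram_estimate_gb(7, 8))   # 9.0
# A 13B model at 8-bit (~15 GB) would not fit in 12 GB,
# but a 13B 4-bit quantization (~8.5 GB) would.
print(vram_estimate_gb(13, 4))  # 8.5
```

By this estimate, 12 GB comfortably fits a 7B model at 8-bit, and even a 13B model at 4-bit, so 8-bit 7B is not necessarily the ceiling.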
-
@dportabella Hi, what is the response time for the best model you have tried so far on your GeForce 12 GB?
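Response time is easiest to compare as tokens per second. A minimal timing sketch, where `generate_fn` is a stand-in for whatever your pipeline exposes (a hypothetical placeholder, not a localGPT API):

```python
import time

def measure_tokens_per_second(generate_fn, prompt: str) -> float:
    """Time one generation call and report throughput.
    `generate_fn` must return the list of generated tokens;
    swap in your real model's generate call here."""
    start = time.perf_counter()
    tokens = generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed

# Dummy generator standing in for a real model (pretend 50 tokens came back):
fake_generate = lambda prompt: list(range(50))
tps = measure_tokens_per_second(fake_generate, "What is in chapter 3?")
```

Measuring throughput rather than wall-clock time per answer makes numbers comparable across prompts of different lengths.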
-
I have some books in PDF and I want to query them using localGPT.
My machine runs Proxmox (server virtualization). Inside Proxmox, I have a VM with 24 GB of RAM, 20 processors (1 socket, 20 cores), and an NVIDIA GeForce RTX 4070 with 16 GB of VRAM (passed through directly from the host to the VM). Is this a good setup for fast, high-quality replies?
Which is the most powerful model available for this task?
Where can I learn about the different models and about fine-tuning?
"TheBloke/Llama-2-7B-Chat-GGML"
"TheBloke/vicuna-7B-1.1-HF"
"TheBloke/Wizard-Vicuna-7B-Uncensored-HF"
"TheBloke/guanaco-7B-HF"
"NousResearch/Nous-Hermes-13b"
I see that GPTQ models are meant for small devices such as phones, so I don't need those, right?
Should I use HF models or GGML (quantized CPU+GPU+MPS) models?
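One way to frame the HF-vs-GGML choice is as a function of your hardware. The heuristic below is my own sketch, not localGPT's actual logic; the return shape mirrors the `MODEL_ID` / `MODEL_BASENAME` pair that localGPT's `constants.py` uses, where a basename of `None` means "load the full HF weights":

```python
def pick_model(vram_gb: float, has_cuda: bool) -> tuple:
    """Toy heuristic (an assumption, not localGPT's logic):
    prefer a quantized GGML file when VRAM is tight or CUDA is
    absent; otherwise use the full-precision HF repo.
    Returns (model_id, model_basename); basename None = HF weights."""
    if not has_cuda or vram_gb < 10:
        # Filename taken from TheBloke's GGML repo; verify it exists
        # for the version you download.
        return ("TheBloke/Llama-2-7B-Chat-GGML",
                "llama-2-7b-chat.ggmlv3.q4_0.bin")
    return ("TheBloke/vicuna-7B-1.1-HF", None)

model_id, basename = pick_model(16, True)
```

With 16 GB of VRAM and CUDA, full HF 7B weights fit; GGML's main draw is CPU/MPS and low-VRAM operation, so on this setup it is optional rather than required.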