
llamafile as LLM server. #277

Open
amonpaike opened this issue May 12, 2024 · 0 comments

Comments


amonpaike commented May 12, 2024

Unfortunately, koboldcpp with CUDA crashes on my PC because my processor doesn't support AVX2, and the other BLAS backends are too slow. So as an alternative I use llamafile, which works nicely: it is very light and performs very well on my 3060 with 12 GB.

The only problem is that every time I start a conversation, in order for the LLM to generate the response I have to briefly alt+tab out of and back into the game; only then does llamafile generate the response and trigger the speech loop. This also works across multiple exchanges, but once it asks a new question I have to alt+tab again to trigger the LLM. I was wondering what causes this and whether there is a way to overcome the problem.
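
For context, here is a minimal sketch of how llamafile can be queried as an LLM server over its OpenAI-compatible HTTP API. It assumes llamafile is running in server mode on its default address (http://localhost:8080); the model name is just a placeholder, and the `requests` library must be installed.

```python
# Minimal sketch: query a running llamafile server through its
# OpenAI-compatible chat completions endpoint.
# Assumptions: llamafile is in server mode on the default
# http://localhost:8080, and "LLaMA_CPP" is a placeholder model name.
import requests

response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "LLaMA_CPP",
        "messages": [
            {"role": "user", "content": "Hello, can you hear me?"},
        ],
    },
    timeout=120,
)
response.raise_for_status()

# Print the assistant's reply from the first completion choice.
print(response.json()["choices"][0]["message"]["content"])
```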
