
Run with Local LLM Models #25

Open
IntelligenzaArtificiale opened this issue Apr 29, 2023 · 14 comments
Labels
help wanted (Extra attention is needed) · question (Further information is requested)

Comments

@IntelligenzaArtificiale
Owner

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.

We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.

Any advice on LLMs fine-tuned for strong instruction following?

IntelligenzaArtificiale added the help wanted and question labels on Apr 29, 2023
@wingeva1986

Does this project support third-party OpenAI interfaces (such as poe.com)? If it does, are there any other requirements for these interfaces, such as message format, context memory, and number of conversations?
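
(For context, a "third-party OpenAI interface" usually means an OpenAI-compatible endpoint that the client is pointed at instead of api.openai.com. A minimal sketch with the pre-1.0 openai Python package is below; the base URL and key are placeholders, and whether this project exposes such a setting is exactly the open question here.)

```python
# Sketch only: pointing the (pre-1.0) openai client at an OpenAI-compatible
# third-party endpoint. The URL and key below are placeholders, not real services.
import openai

openai.api_key = "sk-placeholder"                     # key issued by the third-party provider
openai.api_base = "https://example-provider.com/v1"   # OpenAI-compatible endpoint (placeholder)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response["choices"][0]["message"]["content"])
```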

@IntelligenzaArtificiale
Owner Author

@wingeva1986 Previously this repository was based on the API provided by xtekky/gpt4free. The problem was that (understandably) some of those APIs went down every day, and our repository was flooded with issues related not to the project but to the cracked xtekky APIs. At the moment, the solution based on free and legal calls to chat.openai.com is the most stable one.

You could try reverse engineering sites or portals in a legal way. For example, HuggingChat is a free service open to everyone. It would be interesting to find the HuggingChat endpoint and integrate it into the project.

@HirCoir

HirCoir commented May 3, 2023

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.

We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.

Any advice on LLMs fine-tuned for strong instruction following?

https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/blob/main/ggml-vicuna-7b-1.1-q4_0.bin

@HirCoir

HirCoir commented May 3, 2023

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.

We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.

Any advice on LLMs fine-tuned for strong instruction following?

We can't expect LLaMA-based models to be as competitive as GPT; keep in mind that response quality depends on the number of parameters of the trained model. I've tried many models in my language, and they all generate poor responses, like the GPT4All models based on parrot and alpaca. I have tested the quantized Vicuna 13B model, and despite weighing only 4 GB, it is capable of maintaining a fluent conversation while consuming fewer resources. I am running it on a 4-core ARM Ampere server with 32 GB of RAM; it uses more CPU than RAM and is able to respond correctly. I also managed to connect it to a WhatsApp chat using the Baileys library.
If you are interested in testing the model, I could give you access to my server so you can try it. This isn't spam, but if you search for my name on YouTube you will find a tutorial where I put Llama.cpp and Alpaca.cpp to the test on two servers with the same hardware.

I wrote this answer using a translator; my native language is Spanish.

@Therealkorris

Have you tested mosaicml/mpt-7b-chat or mosaicml/mpt-7b-instruct? They seem promising.

@IntelligenzaArtificiale
Owner Author

@Therealkorris We haven't tried them yet, but we believe that mpt-7b-instruct and LaMini-GPT can give better results than other open-source models.

Have you already managed to implement a pipeline to generate text with mpt-7b-instruct? If so, what hardware are you running it on? Would you like to share your pipeline?
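
For reference, a minimal text-generation pipeline for mpt-7b-instruct with Hugging Face transformers could look like the sketch below. The generation parameters and prompt format are illustrative assumptions, not a tested configuration for this project.

```python
# Minimal sketch, not a tested configuration: load mosaicml/mpt-7b-instruct
# and generate text with the transformers pipeline API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "mosaicml/mpt-7b-instruct"

# MPT ships custom modeling code on the Hub, hence trust_remote_code=True;
# it reuses the EleutherAI GPT-NeoX tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # roughly halves memory use; fall back to float32 if unsupported
    trust_remote_code=True,
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = (
    "Below is an instruction that describes a task. Write a response that completes the request.\n\n"
    "### Instruction:\nList three open-source LLMs that run in 16 GB of RAM.\n\n### Response:\n"
)
print(generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)[0]["generated_text"])
```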

@IntelligenzaArtificiale
Owner Author

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.
We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.
Any advice on LLMs fine-tuned for strong instruction following?

https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/blob/main/ggml-vicuna-7b-1.1-q4_0.bin

@HirCoir Have you already implemented a pipeline to generate text with it? What hardware does it run on?
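
For anyone who wants to try the quantized Vicuna file linked above, a minimal sketch with llama-cpp-python is below. The path, thread count, and prompt format are assumptions; recent llama.cpp builds expect GGUF files, so this older ggml checkpoint may need an older release or a conversion step.

```python
# Minimal sketch, assuming llama-cpp-python is installed and the
# ggml-vicuna-7b-1.1-q4_0.bin file has been downloaded to the working directory.
from llama_cpp import Llama

llm = Llama(model_path="./ggml-vicuna-7b-1.1-q4_0.bin", n_ctx=2048, n_threads=4)

output = llm(
    "### Human: Suggest three lightweight local LLMs.\n### Assistant:",
    max_tokens=256,
    temperature=0.7,
    stop=["### Human:"],  # stop before the model starts a new turn
)
print(output["choices"][0]["text"])
```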

@sambickeita

What do you think about Cerebras?

https://huggingface.co/cerebras

@GoZippy

GoZippy commented May 17, 2023

Is any other LLM model supported? I'm trying to use the new mega13b.

@IntelligenzaArtificiale
Owner Author

@GoZippy @wingeva1986 @Therealkorris @HirCoir We are all more or less familiar with the open-source models. The problem is that a new one comes out every day, and most lack the performance of GPT-3.

If you want to help us, share here the code you use to run inference with the models you recommend, so that we can test them easily.

For example, @GoZippy, share the code you use to run inference on the mega13b model.

Then we can create a custom LLM wrapper with LangChain and run AutoGPT; if it gives good results, we will upload everything to the repository ❤ (a rough sketch of such a wrapper is below).

Thanks for the help.
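
To make that concrete, here is what a custom LangChain LLM wrapper around a local model could look like (LangChain 0.0.x API). The LocalLLM class and the pipeline call inside it are placeholders for whatever inference code contributors share; this is not code from the repository.

```python
# Rough sketch of a custom LangChain LLM wrapper around a local model.
# The `pipeline` field is a placeholder for any local inference backend
# (e.g. a transformers text-generation pipeline); nothing here is tested project code.
from typing import Any, List, Optional

from langchain.llms.base import LLM


class LocalLLM(LLM):
    """Expose a local text-generation callable to LangChain agents."""

    pipeline: Any  # e.g. a transformers pipeline or a llama.cpp wrapper

    @property
    def _llm_type(self) -> str:
        return "local-llm"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Delegate generation to the configured local backend.
        text = self.pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        # Strip the echoed prompt and honour optional stop sequences.
        text = text[len(prompt):]
        if stop:
            for token in stop:
                text = text.split(token)[0]
        return text
```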

@prehcp

prehcp commented May 19, 2023

@Tempaccnt

Currently, Starling is the best 7B model to date:
https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF
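
A GGUF build like that one can be loaded directly with recent llama-cpp-python; a minimal sketch is below. The file name, context size, and chat_format value are assumptions to adapt to your download and library version.

```python
# Minimal sketch, assuming a recent llama-cpp-python (with GGUF support) and that a
# Starling-LM-7B-beta GGUF file has been downloaded locally. The chat_format value
# follows the OpenChat prompt style Starling uses; adjust it if your version differs.
from llama_cpp import Llama

llm = Llama(
    model_path="./Starling-LM-7B-beta-Q4_K_M.gguf",  # placeholder file name
    n_ctx=4096,
    chat_format="openchat",
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise what an autonomous GPT agent does."}],
    max_tokens=200,
)
print(reply["choices"][0]["message"]["content"])
```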

@GoZippy

GoZippy commented Apr 3, 2024

Any progress on this? I'll be home shortly and will look into this again, but I have been using other tools as of late... I lost track of where AutoGPT was going with all the Forge stuff... That was a year ago...

@Tempaccnt

I'm in the same boat; I have been too busy, so I stopped keeping up. But recently I found an AI agent called evo.ninja: it has a workspace and a great interface, and it is currently ranked as the top AutoGPT agent. Unfortunately, it requires the OpenAI API.

So I looked into alternatives, and that is how I ended up here.

8 participants