
Run with Local LLM Models #25

Open
IntelligenzaArtificiale opened this issue Apr 29, 2023 · 14 comments
Labels
help wanted (Extra attention is needed) · question (Further information is requested)

Comments

@IntelligenzaArtificiale
Owner

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.

We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.

Any advice on LLMs fine-tuned for strong instruction following?

IntelligenzaArtificiale added the help wanted and question labels on Apr 29, 2023
@wingeva1986

Does this project support third-party OpenAI interfaces (such as poe.com)? If it does, are there any other requirements for these interfaces, such as message format, context memory, and number of conversations?
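
(For context, a "third-party OpenAI interface" usually means an OpenAI-compatible endpoint that the client is pointed at instead of api.openai.com. A minimal sketch with the pre-1.0 openai Python package is below; the base URL and key are placeholders, and whether this project exposes such a setting is exactly the open question here.)

```python
# Sketch only: pointing the (pre-1.0) openai client at an OpenAI-compatible
# third-party endpoint. The URL and key below are placeholders, not real services.
import openai

openai.api_key = "sk-placeholder"                     # key issued by the third-party provider
openai.api_base = "https://example-provider.com/v1"   # OpenAI-compatible endpoint (placeholder)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response["choices"][0]["message"]["content"])
```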

@IntelligenzaArtificiale
Owner Author

@wingeva1986 Previously this repository was based on the API provided by xtekky/gpt4free. The problem was that (understandably) some of those APIs went down every day, and our repository was flooded with issues related not to the project but to the cracked xtekky APIs. At the moment, the solution based on free and legal calls to chat.openai.com is the most stable one.

You could try reverse engineering sites or portals in a legal way. For example, HuggingChat is a free service open to everyone. It would be interesting to find the HuggingChat endpoint and integrate it into the project.

@HirCoir

HirCoir commented May 3, 2023

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.

We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.

Any advice on LLMs fine-tuned for strong instruction following?

https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/blob/main/ggml-vicuna-7b-1.1-q4_0.bin

@HirCoir

HirCoir commented May 3, 2023

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.

We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.

Any advice on LLMs fine-tuned for strong instruction following?

We can't expect LLaMA-based models to be as competitive as GPT; keep in mind that response quality depends on the number of parameters of the trained model. I've tried many models in my language, and they all generate poor responses, like the GPT4All models based on parrot and alpaca. I have tested the quantized Vicuna 13B model, and despite weighing only 4 GB, it is capable of maintaining a fluent conversation while consuming fewer resources. I am running it on a 4-core ARM Ampere server with 32 GB of RAM; it uses more CPU than RAM and is able to respond correctly. I also managed to connect it to a WhatsApp chat using the Baileys library.
If you are interested in testing the model, I could give you access to my server so you can try it. This isn't spam, but if you search for my name on YouTube you will find a tutorial where I put Llama.cpp and Alpaca.cpp to the test on two servers with the same hardware.

I wrote this answer using a translator; my native language is Spanish.

@Therealkorris

Have you tested mosaicml/mpt-7b-chat or mosaicml/mpt-7b-instruct? They seem promising.

@IntelligenzaArtificiale
Owner Author

@Therealkorris We haven't tried them yet, but we believe that mpt-7b-instruct and LaMini-GPT can give better results than other open-source models.

Have you already managed to implement a pipeline to generate text with mpt-7b-instruct? If so, what hardware are you running it on? Would you like to share your pipeline?
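
For reference, a minimal text-generation pipeline for mpt-7b-instruct with Hugging Face transformers could look like the sketch below. The generation parameters and prompt format are illustrative assumptions, not a tested configuration for this project.

```python
# Minimal sketch, not a tested configuration: load mosaicml/mpt-7b-instruct
# and generate text with the transformers pipeline API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "mosaicml/mpt-7b-instruct"

# MPT ships custom modeling code on the Hub, hence trust_remote_code=True;
# it reuses the EleutherAI GPT-NeoX tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # roughly halves memory use; fall back to float32 if unsupported
    trust_remote_code=True,
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = (
    "Below is an instruction that describes a task. Write a response that completes the request.\n\n"
    "### Instruction:\nList three open-source LLMs that run in 16 GB of RAM.\n\n### Response:\n"
)
print(generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)[0]["generated_text"])
```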

@IntelligenzaArtificiale
Owner Author

We tried many local models, such as LLaMA, Vicuna, OpenAssistant, and GPT4All, in their 7B versions. None seems to give results comparable to the ChatGPT API.
We would like to test new models that can be loaded in at most 16 GB of RAM, to keep the tool accessible to anyone without discrimination.
Any advice on LLMs fine-tuned for strong instruction following?

https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/blob/main/ggml-vicuna-7b-1.1-q4_0.bin

@HirCoir Have you already implemented a pipeline to generate text with it? What hardware does it run on?
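
For anyone who wants to try the quantized Vicuna file linked above, a minimal sketch with llama-cpp-python is below. The path, thread count, and prompt format are assumptions; recent llama.cpp builds expect GGUF files, so this older ggml checkpoint may need an older release or a conversion step.

```python
# Minimal sketch, assuming llama-cpp-python is installed and the
# ggml-vicuna-7b-1.1-q4_0.bin file has been downloaded to the working directory.
from llama_cpp import Llama

llm = Llama(model_path="./ggml-vicuna-7b-1.1-q4_0.bin", n_ctx=2048, n_threads=4)

output = llm(
    "### Human: Suggest three lightweight local LLMs.\n### Assistant:",
    max_tokens=256,
    temperature=0.7,
    stop=["### Human:"],  # stop before the model starts a new turn
)
print(output["choices"][0]["text"])
```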

@sambickeita

What do you think about Cerebras?

https://huggingface.co/cerebras

@GoZippy

GoZippy commented May 17, 2023

Is any other LLM model supported? I'm trying to use the new mega13b.

@IntelligenzaArtificiale
Owner Author

@GoZippy @wingeva1986 @Therealkorris @HirCoir We are all more or less familiar with the open-source models. The problem is that a new one comes out every day, and most lack the performance of GPT-3.

If you want to help us, share here the code you use to run inference with the models you recommend, so that we can test them easily.

For example, @GoZippy, share the code you use to run inference on the mega13b model.

Then we can create a custom LLM wrapper with LangChain and run AutoGPT; if it gives good results, we will upload everything to the repository ❤ (a rough sketch of such a wrapper is below).

Thanks for the help.
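
To make that concrete, here is what a custom LangChain LLM wrapper around a local model could look like (LangChain 0.0.x API). The LocalLLM class and the pipeline call inside it are placeholders for whatever inference code contributors share; this is not code from the repository.

```python
# Rough sketch of a custom LangChain LLM wrapper around a local model.
# The `pipeline` field is a placeholder for any local inference backend
# (e.g. a transformers text-generation pipeline); nothing here is tested project code.
from typing import Any, List, Optional

from langchain.llms.base import LLM


class LocalLLM(LLM):
    """Expose a local text-generation callable to LangChain agents."""

    pipeline: Any  # e.g. a transformers pipeline or a llama.cpp wrapper

    @property
    def _llm_type(self) -> str:
        return "local-llm"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Delegate generation to the configured local backend.
        text = self.pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        # Strip the echoed prompt and honour optional stop sequences.
        text = text[len(prompt):]
        if stop:
            for token in stop:
                text = text.split(token)[0]
        return text
```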

@prehcp

prehcp commented May 19, 2023

@Tempaccnt

Currently, Starling is the best 7B model to date:
https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF
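
A GGUF build like that one can be loaded directly with recent llama-cpp-python; a minimal sketch is below. The file name, context size, and chat_format value are assumptions to adapt to your download and library version.

```python
# Minimal sketch, assuming a recent llama-cpp-python (with GGUF support) and that a
# Starling-LM-7B-beta GGUF file has been downloaded locally. The chat_format value
# follows the OpenChat prompt style Starling uses; adjust it if your version differs.
from llama_cpp import Llama

llm = Llama(
    model_path="./Starling-LM-7B-beta-Q4_K_M.gguf",  # placeholder file name
    n_ctx=4096,
    chat_format="openchat",
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise what an autonomous GPT agent does."}],
    max_tokens=200,
)
print(reply["choices"][0]["message"]["content"])
```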

@GoZippy

GoZippy commented Apr 3, 2024

Any progress on this? I'll be home shortly and will look into this again, but I have been using other tools as of late... I lost track of where AutoGPT was going with all the Forge stuff... That was a year ago...

@Tempaccnt

I'm in the same boat; I have been too busy, so I stopped keeping up. But recently I found an AI agent called evo.ninja: it has a workspace and a great interface, and it is currently ranked as the top AutoGPT agent. Unfortunately, it requires the OpenAI API.

So I looked into alternatives, and that is how I ended up here.

8 participants