run_lora_clm.py support for other datasets #629

Closed
tmabraham opened this issue Jan 9, 2024 · 2 comments · Fixed by #955
Comments

@tmabraham

tmabraham commented Jan 9, 2024

Feature request

Right now the script is hardcoded for either "tatsu-lab/alpaca" or "timdettmers/openassistant-guanaco", and using any other dataset throws an error.

Motivation

It would be nice to be able to finetune on our own datasets.

Your contribution

Happy to test any code out for this...

@regisss
Collaborator

regisss commented Jan 9, 2024

Yes, I agree that would be much better.
Can you share a command line that fails, along with the error message you get?

@dmsuehir
Contributor

dmsuehir commented May 6, 2024

I posted PR #955 that fixes this issue and will allow other datasets to be used with run_lora_clm.py. I tried it with several different datasets to check the functionality, but please let me know if there's a specific one that you'd like me to try.
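
For readers wondering what dataset-agnostic prompt building could look like, here is a minimal sketch. It is not the code from PR #955 and makes no claim about how run_lora_clm.py works internally. The function name `build_example`, the column names, and the Alpaca-style prompt template are all assumptions chosen for illustration.

```python
from typing import Mapping, Optional


def build_example(
    row: Mapping[str, str],
    instruction_col: str = "instruction",  # hypothetical default column name
    output_col: str = "output",            # hypothetical default column name
    input_col: Optional[str] = None,       # optional extra-context column
) -> dict:
    """Turn one dataset row into a prompt/completion pair.

    Column names are parameters rather than constants, so the same
    function can serve datasets with different schemas.
    """
    # Fail early with a clear message if a required column is absent.
    missing = [c for c in (instruction_col, output_col) if c not in row]
    if missing:
        raise KeyError(f"dataset row is missing required column(s): {missing}")

    # Assumed Alpaca-style template; any other template could be swapped in.
    prompt = f"### Instruction:\n{row[instruction_col]}\n\n"
    context = row.get(input_col, "") if input_col else ""
    if context:
        prompt += f"### Input:\n{context}\n\n"
    prompt += "### Response:\n"
    return {"prompt": prompt, "completion": row[output_col]}
```

With a question/answer dataset, for example, the call would be `build_example(row, instruction_col="question", output_col="answer")`. Passing the column names in from the command line is one way a script could avoid being tied to specific datasets.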
