
Does it support Arabic? #10

Open
Qt4arab opened this issue Feb 7, 2024 · 3 comments

Comments

@Qt4arab commented Feb 7, 2024

I have a 50k high-quality Arabic dataset. Is it possible to train the model on Arabic?

@sidroopdaska (Contributor)

See the comment here: #6

@vatsalaggarwal (Contributor)

I've added some initial pointers to this here: #70 (comment)

@lucapericlp (Contributor) commented Mar 14, 2024

Hey @Qt4arab, we've just published an initial approach for finetuning the last N transformer blocks of the first-stage LLM. It's best to play around with the hyperparameters in finetune_params.py, as we didn't determine the optimal set. Let us know if you run into any issues, or if you're up for contributing improvements (via a param sweep or otherwise)! A rough sketch of the idea is below.
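
To make the approach concrete: freeze every parameter, then re-enable gradients for only the last N transformer blocks and optimize those. This is a minimal PyTorch sketch of that idea; the `TinyLM` class, its attribute names, and the learning rate are illustrative placeholders, not the repo's actual code (the real knobs live in finetune_params.py).

```python
import torch
import torch.nn as nn

def freeze_all_but_last_n(model: nn.Module, blocks: nn.ModuleList, n: int) -> None:
    """Freeze every parameter, then re-enable grads for the last n blocks."""
    for p in model.parameters():
        p.requires_grad = False
    for block in blocks[-n:]:
        for p in block.parameters():
            p.requires_grad = True

# Toy stand-in for the first-stage LLM (names are hypothetical).
class TinyLM(nn.Module):
    def __init__(self, dim: int = 64, depth: int = 8):
        super().__init__()
        self.embed = nn.Embedding(256, dim)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(depth)
        )
        self.head = nn.Linear(dim, 256)

model = TinyLM()
freeze_all_but_last_n(model, model.blocks, n=2)

# Hand only the unfrozen parameters to the optimizer; the learning rate
# here is a placeholder, not a recommended value.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=3e-5
)
```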

The next step to improve finetuning effectiveness is to add LoRA adapters for the first-stage LLM, which is being worked on here.
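
For background, a LoRA adapter keeps a pretrained weight matrix frozen and learns a small low-rank additive update on top of it, which cuts the trainable parameter count dramatically. The sketch below shows only the core idea; `LoRALinear` and its rank/alpha defaults are illustrative assumptions, not the implementation from the linked work.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen nn.Linear plus a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        # A starts small and B at zero, so training begins at the base weights.
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

# Example: wrap a projection layer (shapes are hypothetical).
proj = nn.Linear(512, 512)
adapted = LoRALinear(proj, rank=8)
out = adapted(torch.randn(2, 16, 512))  # (batch, seq, dim) -> same shape
```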
