
What would it take to train it for other languages? #16

Open
enboig opened this issue Apr 19, 2023 · 1 comment

Comments


enboig commented Apr 19, 2023

I use PHP. Could support for it be added?

ravenscroftj (Owner) commented Apr 22, 2023

Hey! The main things that would be needed are:

a) A giant dataset of PHP code. The original CodeGen models were trained on the GitHub BigQuery dataset, and there is a PHP subset, so that could potentially be a good source (a query sketch follows after this list).

b.i) A large enough machine to fine-tune the model. I'm a hobbyist, and to fine-tune a model with billions of parameters you need commercial-grade GPUs (even on an RTX 4090 you would struggle to fine-tune the 2B or larger models; a rough memory estimate follows after this list).

OR

b.ii) Use a low-resource fine-tuning method like LoRA, which adapts the model by adding small trainable low-rank matrices alongside the frozen weights, specialised for the new language you want to target. This changes the shape/architecture of the model, making it incompatible with the current implementation of CodeGen in ggml, so I would need to make some changes to the C++ code. llama.cpp recently added LoRA support, so it is feasible that LoRA support could be added here too (a fine-tuning sketch follows below).
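
For (a), here's a rough sketch of pulling PHP files from the public GitHub dataset on BigQuery with the `google-cloud-bigquery` Python client. The table and column names follow the public `bigquery-public-data.github_repos` dataset; the extension filter, `LIMIT`, and output file are just illustrative assumptions, and note that scanning the `contents` table at scale can be expensive.

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

# Assumes GCP credentials are already configured for this project.
client = bigquery.Client()

query = """
SELECT f.repo_name, f.path, c.content
FROM `bigquery-public-data.github_repos.files` AS f
JOIN `bigquery-public-data.github_repos.contents` AS c
  ON f.id = c.id
WHERE f.path LIKE '%.php'   -- crude extension filter for PHP sources
  AND NOT c.binary          -- skip binary blobs
LIMIT 1000                  -- scale this up for a real training corpus
"""

# Iterating the query job waits for and streams back the result rows.
with open("php_corpus.txt", "w") as out:
    for row in client.query(query):
        out.write(row.content + "\n")
```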
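
To make (b.i) concrete, here's a back-of-envelope memory estimate for full fine-tuning with Adam in fp16. The parameter count matches CodeGen-2B; the byte counts are the usual weights + gradients + two fp32 optimizer moments, ignoring activations and any fp32 master copy of the weights.

```python
params = 2e9  # CodeGen-2B

weights_gb = params * 2 / 1e9      # fp16 weights
grads_gb   = params * 2 / 1e9      # fp16 gradients
adam_gb    = params * 4 * 2 / 1e9  # fp32 first + second Adam moments

total_gb = weights_gb + grads_gb + adam_gb
# ~24 GB before activations: already the entire VRAM of an RTX 4090.
print(f"~{total_gb:.0f} GB before activations")
```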
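
And for (b.ii), a minimal sketch of a LoRA setup using the Hugging Face `peft` library (not part of this repo). The `target_modules` value is an assumption based on the fused attention projection name in the HF CodeGen implementation; the rank and alpha values are arbitrary illustrative choices.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model  # pip install peft

model_name = "Salesforce/codegen-350M-multi"  # smallest CodeGen, for illustration
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# LoRA freezes the base weights and injects small trainable low-rank
# matrices into the chosen projections. "qkv_proj" matches CodeGen's
# fused attention projection in transformers (an assumption here).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # well under 1% of the weights train
```

The resulting adapter weights would still need to be merged back into the base model, or supported natively at inference time (as llama.cpp now does for LoRA), before the ggml runtime here could use them.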
