How to use your finetuned model? #110

Hey there! I originally posted this question inside a longer thread, but I thought it might be best to post it as a new issue because I'm unlikely to be the only one to run into this.

So I've gone ahead and finetuned my first model, but I don't understand how to use it now. I looked into fast_inference.py and saw a checkpoint path at line 88, but I got an OOM when I pointed it to my finetuned ckpt. So I'm not sure I'm doing it right (not surprising, since there is an entire rig to load the model from Hugging Face, so that was a somewhat desperate attempt on my part!).

I've also tried changing the checkpoint path in the inference.py script, but I get another OOM error there too.

Thanks!!
Comments
thanks for raising the issue, and apologies about the difficulty you're facing here... that's our bad :) out of curiosity, how much VRAM do you have? @lucapericlp -- just fyi -- https://github.com/metavoiceio/metavoice-src/blob/main/fam/llm/finetune.py#L299 ends up storing all the tensors, including optimiser states, in pickled format (and they're on the CUDA device), so when you load them, everything gets put straight back onto the GPU. one of us will send out a fix later tn :)
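For anyone following along, here is a minimal sketch of the failure mode being described (the model and checkpoint key names are toy stand-ins, not the actual finetune.py code):

```python
import torch
import torch.nn as nn

# Toy stand-in, not the actual metavoice model or finetune.py code.
model = nn.Linear(4, 4).cuda()
optimizer = torch.optim.AdamW(model.parameters())

# One training step so the optimiser state (Adam moments) exists on CUDA.
model(torch.randn(2, 4, device="cuda")).sum().backward()
optimizer.step()

# torch.save pickles each tensor together with its device, so this
# checkpoint records the weights *and* the optimiser moments as CUDA tensors.
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict()}, "ckpt.pt")

# A plain torch.load then tries to materialise everything back on the
# GPU at once, which is where the OOM at inference time comes from.
ckpt = torch.load("ckpt.pt")
```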
Does the finetuning run on that same GPU, though? That's a bit strange, because it looks like you do have enough VRAM for the weights + optimiser states?
Hey!! Thanks for your answer! Yes, I think the finetuning ran on that same GPU: I saw in the console that it ran on GPU = 0 (which I took, at the time, to be the index of the device used) and CUDA was being called. And the training was relatively fast, which would definitely not have been the case on CPU, I think. I have an RTX 3060 12 GB; here is the exact message I get:
I tried a torch.cuda.empty_cache() but to no avail :'(
yeah, torch.cuda.empty_cache is unlikely to help here... if you've got 15-16 GB of GPU RAM, you can try adding ... otherwise, the best thing to do might be to remove saving optimiser states during checkpointing... you basically need to remove this line: https://github.com/metavoiceio/metavoice-src/blob/main/fam/llm/finetune.py#L288 ... but you might have difficulty resuming your finetuning jobs if you ever want to do that...
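For reference, the save-side change amounts to something like this (a sketch with illustrative key names; see finetune.py for the real checkpoint dict):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # toy stand-in for the finetuned model

# Save only the model weights and drop the optimiser state. Adam keeps
# two extra moment tensors per parameter, so this also shrinks the file,
# but a later resume would have to restart the optimiser from scratch.
torch.save({"model": model.state_dict()}, "ckpt_weights_only.pt")
```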
Oh, so there's really no way to resume the finetuning on 12 GB of VRAM (i.e. to save the optimiser states, if I understand correctly)?
I suppose I have to first delete the line in finetune.py, THEN retrain my model, and then try to call it through inference.py? Is that right? |
Yeah, that's the "easiest"... the way to use your existing training is likely to add `map_location="cpu"` when loading the checkpoint.
Ok, so this would then make the inference tap into the CPU to compensate for the lack of VRAM, is that it? Thank you so much for your patience!
it would just mean that the optimiser states remain in CPU RAM instead of living on the GPU. The model should still be on the GPU.
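Concretely, the load-side workaround would look something like this (a sketch; the "model" key name is an assumption about the checkpoint layout):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for the real model definition

# map_location="cpu" remaps every pickled tensor to CPU regardless of
# the device it was saved from, so the optimiser state never has to
# fit on the 12 GB card.
ckpt = torch.load("ckpt.pt", map_location="cpu")
model.load_state_dict(ckpt["model"])  # "model" key is an assumption
model.to("cuda")  # only the weights end up on the GPU
```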
OK! let me try both solutions and get back to you later today :) |
Hey again!
thanks!
sorry about the issues here :( we'll have a look today/tomorrow!
No worries, thank you for your involvement :)))
hey, sorry about the delay here... i'm looking into this now... could you share your finetuned checkpoint with me please (gdrive or anything is good), and also which GPU you were using? |
Hey! Thank you! Of course, I've sent it to you by email through an external file host because it's 5 GB ^^ Tell me when you've received it!
Hey @vatsalaggarwal, any news on this question? Have you received my checkpoint? Thank you! I'm in such a hurry to start finetuning and try my hand at making a French model!
@vatsalaggarwal anything worth adding here? |