[PEFT on Gaudi2C] speed of Full-parameter Finetuning is almost equal to that of LoRA #952
Feature request

tmp_finetune.zip (attached)

Motivation

The customer found that full-parameter finetuning runs at 14 train samples per second, which is almost equal to LoRA finetuning at 16 train samples per second. Please see the details in this feature request and check whether there is any possible way to optimize LoRA for better performance.
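For reference, below is a minimal sketch (not the customer's script, which is only in the attached zip) of how such a throughput comparison could be reproduced with the `transformers` Trainer, which reports `train_samples_per_second` in the metrics returned by `train()`. The model name, rank, batch size, and toy dataset are illustrative assumptions, not values from the attached files.

```python
# Minimal sketch for comparing train-samples-per-second between full-parameter
# and LoRA finetuning. All names and hyperparameters here are assumptions for
# illustration; they are not taken from the attached files.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

MODEL_NAME = "gpt2"  # placeholder; the customer's model is not named in this thread

def measure_throughput(use_lora: bool) -> float:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    if use_lora:
        # r=8 is an assumed rank; the customer's LoRA config may differ.
        model = get_peft_model(model, LoraConfig(r=8, task_type="CAUSAL_LM"))

    # Toy dataset, just enough to exercise the training loop.
    enc = tokenizer(["hello world"] * 64, padding="max_length", max_length=32)
    train_ds = Dataset.from_dict(dict(enc)).map(lambda ex: {"labels": ex["input_ids"]})

    args = TrainingArguments(output_dir="/tmp/throughput_test",
                             per_device_train_batch_size=8,
                             num_train_epochs=1,
                             report_to=[])
    result = Trainer(model=model, args=args, train_dataset=train_ds).train()
    # Trainer includes throughput in the metrics it returns from train().
    return result.metrics["train_samples_per_second"]

print("full-parameter:", measure_throughput(use_lora=False))
print("LoRA          :", measure_throughput(use_lora=True))
```

Note that similar samples-per-second for the two runs is not by itself anomalous: one plausible reason is that both runs execute the same forward pass and backpropagate through the same base model, so LoRA's savings show up mostly in optimizer state and memory rather than in raw step time.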
Comments

Can you attach training logs to ease analysis?
Sure, I am asking for the customer's feedback.
@intelyoungway, the attached script is also doing LoRA finetuning. Could you clarify what the exact issue/request is?
The customer said they modified the original LoRA script to do full-parameter finetuning (see the attached files).
@intelyoungway, thanks for the comment. From what you said, the goal is to compare full-parameter fine-tuning with LoRA fine-tuning. As per the original LoRA paper from Microsoft (https://arxiv.org/abs/2106.09685), it is theoretically understood that full-parameter and LoRA fine-tuning should not yield the same performance, mainly when low ranks are used in LoRA. The disparity in the number of trainable parameters is a key factor here. From the attached script, I see you are using the same

For me or anyone else to be able to help, I need more details, especially log files, the number of trainable parameters, etc.
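To make the parameter disparity concrete, here is a minimal sketch that prints the trainable-parameter counts for the two setups using PEFT's built-in helper; "gpt2" and r=8 are illustrative placeholders, not the customer's setup.

```python
# Minimal sketch of the trainable-parameter disparity between full-parameter
# finetuning and LoRA. "gpt2" and r=8 are placeholders, not the customer's setup.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")
full_params = sum(p.numel() for p in model.parameters())
print(f"full finetuning updates all {full_params:,} parameters")

lora_model = get_peft_model(model, LoraConfig(r=8, task_type="CAUSAL_LM"))
# PEFT prints trainable vs. total counts, e.g. "trainable params: ... || all params: ..."
lora_model.print_trainable_parameters()
```

With a low rank, the LoRA run typically trains well under 1% of the model's parameters, which is why the paper predicts a quality gap even when throughput looks similar.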
Thanks for the explanation. I think this fulfills the need. This issue can be closed now.