GPU Allocation Issue (QLoRA + Llama3-8B-IT) #1716
Comments
Hmm, hard to say and I can't easily try to reproduce this. Do you already see strange behavior after loading the model, before starting training? If you try without PEFT, do you see the same issue (in case of not having enough memory without PEFT, you could e.g. turn off autograd on most of the layers to "simulate" parameter efficient fine-tuning)? If yes, this could be an accelerate issue.
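A minimal sketch of the "turn off autograd" check the comment suggests, for illustration only — the model id and the choice of which layers to unfreeze are assumptions, not anything the maintainer specified:

```python
# Hypothetical sketch: load the base model without PEFT and freeze most
# parameters, so memory behavior roughly mimics parameter-efficient tuning.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model id
    device_map="auto",
)

# Freeze everything, then re-enable gradients on the last two decoder
# layers only (an arbitrary choice for the experiment).
for param in model.parameters():
    param.requires_grad_(False)
for param in model.model.layers[-2:].parameters():
    param.requires_grad_(True)

# If GPU memory is still unevenly split at this point, the imbalance comes
# from the device placement step rather than from PEFT.
```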
System Info
peft: 0.10.1.dev0
accelerate: 0.30.0
bitsandbytes: 0.43.1
transformers: 4.39.3
GPU: A6000 * 2 (96 GB total)
nvidia-driver version: 535.171.04
cuda: 11.8
Who can help?
No response
Information
Tasks
examples folder
Reproduction
I was training a Llama3-8B-IT model with QLoRA. The training succeeded, but GPU memory wasn't evenly allocated across the two cards. Is this a version issue with peft or transformers, or with the graphics driver? On a previous A100*8 server the memory was allocated evenly across GPUs, but I don't know what is causing the issue in this case.
This is my script.
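The script itself did not survive the page capture. Below is a minimal sketch of a comparable QLoRA setup on two GPUs; the model id, LoRA hyperparameters, and target modules are assumptions, not the reporter's actual values:

```python
# Hypothetical reproduction sketch -- the original script was not included.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# device_map="auto" shards the base model across both A6000s; an uneven
# split usually originates at this placement step, before PEFT is involved.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model id
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                      # assumed rank
    lora_alpha=32,             # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```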
Expected behavior
I want the GPUs to be evenly allocated.