Second stage training with smaller window size #228

Open
meng2468 opened this issue Apr 11, 2024 · 1 comment

@meng2468

I've trained a model from scratch with batch size 8 and window size 500 on 4x A10 GPUs. On entering the second phase of training, I get the following error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB. GPU

Is there any way I can salvage the model trained in the first stage?

@78Alpha

78Alpha commented Apr 21, 2024

You could try CPU memory fallback. On Windows it is a setting in the control panel, and it allows for a total of 24 GB VRAM + 24 GB shared memory. If you are on Linux, you will have to find your environment's way of turning it on.
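
Another thing that might be worth trying, going by the issue title, is to look for a smaller batch size or window size that fits in memory for the second stage. Below is a rough sketch of that search in Python. It assumes the second stage can load the first-stage checkpoint with different settings, which nothing in this thread confirms. `find_fitting_settings`, `try_step`, `fake_step`, and `min_window` are hypothetical names made up for illustration and are not part of any project's API. In a real run, `try_step` would wrap a single second-stage training step; `torch.cuda.OutOfMemoryError` is a subclass of `RuntimeError`, so the default `oom_error` would catch it.

```python
from typing import Callable, Optional, Tuple, Type


def find_fitting_settings(
    try_step: Callable[[int, int], None],
    batch_size: int,
    window_size: int,
    oom_error: Type[BaseException] = RuntimeError,
    min_window: int = 50,  # hypothetical lower bound, pick what suits your data
) -> Optional[Tuple[int, int]]:
    """Halve the batch size, then the window size, until one trial step fits."""
    while True:
        try:
            try_step(batch_size, window_size)
            return batch_size, window_size
        except oom_error as err:
            # Only swallow out-of-memory failures; re-raise anything else.
            if "out of memory" not in str(err).lower():
                raise
        if batch_size > 1:
            batch_size //= 2
        elif window_size // 2 >= min_window:
            window_size //= 2
        else:
            return None  # nothing at or above the minimum fits


def fake_step(batch_size: int, window_size: int) -> None:
    # Stand-in for one real training step, with a made-up memory cost model.
    if batch_size * window_size > 1000:
        raise RuntimeError("CUDA out of memory (simulated)")


if __name__ == "__main__":
    # Start from the settings reported in the issue: batch size 8, window size 500.
    print(find_fitting_settings(fake_step, 8, 500))
```

The sketch reduces batch size first because that leaves the window size from the first stage unchanged for as long as possible. Whether a smaller window hurts quality in the second stage is something you would have to check yourself.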
