Second stage training with smaller window size #228

meng2468 · 2024-04-11T05:57:38Z

I've trained a model from scratch with batch size 8 and window size 500 on 4x A10 GPUs. On entering the second stage of training, I get the following error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB. GPU
Is there any way I can salvage the model trained in the first stage?


78Alpha · 2024-04-21T02:11:11Z

You could try CPU memory fallback. On Windows it is a setting in the control panel, and it allows for a total of 24 GB VRAM + 24 GB shared memory. If you are on Linux, you will have to find your environment's way of turning it on.
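
Another option, in line with the issue title, is to keep the first-stage checkpoint and start the second stage with a smaller window size and batch size, since an out-of-memory error in the second stage does not affect a checkpoint already saved to disk. The following is only a minimal sketch, assuming the training settings live in a plain dictionary. The key names (`window_size`, `batch_size`, `pretrained_checkpoint`), the file name `first_stage.pth`, and the reduced values 300 and 4 are all hypothetical and not taken from this project's actual config. Whether the model accepts a different window size between stages is also an assumption to verify.

```python
from copy import deepcopy


def second_stage_config(first_stage_cfg, checkpoint_path, window_size, batch_size):
    """Return a copy of the first-stage config adjusted for a lower-memory second stage.

    All keys used here are hypothetical placeholders for illustration.
    """
    if window_size <= 0 or batch_size <= 0:
        raise ValueError("window_size and batch_size must be positive")

    cfg = deepcopy(first_stage_cfg)  # leave the first-stage config untouched

    # Never exceed the first-stage values: the goal is to reduce memory use.
    cfg["window_size"] = min(window_size, first_stage_cfg["window_size"])
    cfg["batch_size"] = min(batch_size, first_stage_cfg["batch_size"])

    # Hypothetical key pointing the second stage at the saved first-stage weights.
    cfg["pretrained_checkpoint"] = checkpoint_path
    return cfg


# First-stage settings as reported in the issue.
first_stage = {"batch_size": 8, "window_size": 500}

# Illustrative reduced values; tune them until the second stage fits in memory.
second_stage = second_stage_config(
    first_stage, "first_stage.pth", window_size=300, batch_size=4
)
print(second_stage)
```

If the second stage still runs out of memory, lowering the two values further before turning to memory fallback is a reasonable next step, since fallback to shared memory usually slows training.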
