[BUG] Out-of-memory (OOM) error occurs during training #661

Open
shanyun123456 opened this issue Oct 14, 2021 · 7 comments
Labels
bug Something isn't working

Comments

@shanyun123456

Description

Hi, @soumyadeepdey
When I use your code to train a model on a GPU, an out-of-memory (OOM) error always seems to occur.
The batch size is 1; the other parameters are unchanged.
Please check it, thanks.

In which platform does it happen?

Linux, GPU

How do we replicate the issue?

Running the command python3 sample_train.py replicates the issue.

Expected behavior (i.e. solution)

Other Comment

[Screenshot, 2021-10-14, 2:18:37 PM]

[Screenshot, 2021-10-14, 2:19:49 PM]

@shanyun123456 shanyun123456 added the bug Something isn't working label Oct 14, 2021
@soumyadeepdey
Contributor

soumyadeepdey commented Oct 16, 2021 via email

@shanyun123456
Author

Hi,
The memory size of my GPU is about 29 GB.
It's a V100 GPU.
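
For anyone adding details to this report, exact per-GPU memory figures are more useful than an approximate size. The following is only a minimal sketch, assuming nvidia-smi is installed and on the PATH; the helper names parse_gpu_memory and query_gpu_memory are hypothetical and are not part of this repository's code.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'total, used' rows (in MiB) from nvidia-smi CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Each row looks like "32510, 1200" when noheader and nounits are set.
        # Non-numeric fields (for example "[N/A]") are not handled in this sketch.
        total, used = (int(field.strip()) for field in line.split(","))
        rows.append({"total_mib": total, "used_mib": used, "free_mib": total - used})
    return rows


def query_gpu_memory():
    """Return per-GPU memory figures, or an empty list if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.total,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {gpu['used_mib']} / {gpu['total_mib']} MiB used")
```

Running this just before python3 sample_train.py would show whether another process is already holding memory on the card.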

@soumyadeepdey
Contributor

soumyadeepdey commented Oct 19, 2021 via email

@dongcin

dongcin commented Mar 28, 2022

I'm also hitting this issue.

@soumyadeepdey
Contributor

soumyadeepdey commented Mar 28, 2022 via email

@Kdjhsa

Kdjhsa commented May 5, 2024

I'm running into this issue too.

@soumyadeepdey
Contributor

soumyadeepdey commented May 5, 2024 via email


4 participants