One error should be solved and one improvement for reducing the CUDA memory #22
Labels: good first issue (Good for newcomers)
Just a reminder: my environment is V100, gcc-7, CUDA 11.7, Python 3.10.
Nice work!
Without changing the code, a 4090 will run into a CUDA out-of-memory error. So, thank you!
wttc-nitr added a commit to wttc-nitr/mPLUG-Owl that referenced this issue on May 7, 2023.
1. One error should be solved
When installing apex, there will be 4 errors about "convert unsigned long to long"; you need to edit:
(1) line 65 in apex_22.01_pp/csrc/mlp.cpp
auto reserved_space = at::empty({reserved_size}, inputs[0].type());
change to:
auto reserved_space = at::empty({static_cast<long>(reserved_size)}, inputs[0].type());
(2) line 138 in apex_22.01_pp/csrc/mlp.cpp
auto work_space = at::empty({work_size / sizeof(scalar_t)}, inputs[0].type());
change to:
auto work_space = at::empty({static_cast<long>(work_size / sizeof(scalar_t))}, inputs[0].type());
Alternatively, you can change the compile options so these conversion warnings are not treated as errors.
2. One improvement for reducing CUDA memory
When launching owl_demo.py on a GPU with 16 GB, I ran into a CUDA out-of-memory error. Then I edited here:
lines 33 and 34 in interface.py:
change to:
Then, after the demo is started, memory usage is about 14 GB, and it runs well on a 16 GB GPU.