I have missing CUDA library files that are causing a crash when I start torchrun #231

Open
ichibrosan opened this issue May 2, 2024 · 0 comments


My operating system is Ubuntu Linux 22.04:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
To get CUDA-enabled PyTorch and CUDA under conda, I am using
ActiveState Python with the following configuration:
[image: ActiveState Python configuration]
I am starting up with:
#!/bin/sh
torchrun --nproc_per_node 1 example_instructions.py \
    --ckpt_dir CodeLlama-7b-Instruct/ \
    --tokenizer_path CodeLlama-7b-Instruct/tokenizer_model \
    --max_seq_len 512 --max_batch_size 4
and torchrun crashes at import time because a shared library cannot be found:

Traceback (most recent call last):
  File "/home/doug/.cache/activestate/cb772d80/usr/bin/torchrun", line 5, in <module>
    import torch.distributed.run
  File "/home/doug/.cache/activestate/cb772d80/usr/lib/python3.10/site-packages/torch/__init__.py", line 191, in <module>
    _load_global_deps()
  File "/home/doug/.cache/activestate/cb772d80/usr/lib/python3.10/site-packages/torch/__init__.py", line 153, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/doug/.cache/activestate/cb772d80/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcufft.so.10: cannot open shared object file: No such file or directory
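
A minimal diagnostic sketch that may help narrow this down. It repeats the same kind of `ctypes.CDLL` call that fails in the traceback above, without importing torch, and then lists any `libcufft.so*` files in a few directories. The helper names `try_load` and `find_candidates` are hypothetical, and the directories searched are assumptions about where CUDA libraries might live on this machine, not paths taken from the report.

```python
import ctypes
import os


def try_load(lib_name):
    """Return None if the dynamic loader can open lib_name, else the error text."""
    try:
        ctypes.CDLL(lib_name, mode=ctypes.RTLD_GLOBAL)
        return None
    except OSError as exc:
        return str(exc)


def find_candidates(stem, search_dirs):
    """List files whose names start with stem (e.g. 'libcufft.so') in search_dirs."""
    found = []
    for directory in search_dirs:
        if not os.path.isdir(directory):
            continue  # skip directories that do not exist on this machine
        for name in sorted(os.listdir(directory)):
            if name.startswith(stem):
                found.append(os.path.join(directory, name))
    return found


if __name__ == "__main__":
    # Library name taken from the OSError in the traceback.
    wanted = "libcufft.so.10"
    error = try_load(wanted)
    print(wanted, "loads fine" if error is None else "failed: " + error)

    # Assumed search locations: LD_LIBRARY_PATH entries plus two common paths.
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    dirs += ["/usr/local/cuda/lib64", "/usr/lib/x86_64-linux-gnu"]
    for path in find_candidates("libcufft.so", dirs):
        print("found:", path)
```

If the load fails and no candidates are listed, the cuFFT library is probably not installed anywhere the loader looks. If a file with a different version suffix turns up, that would point to a mismatch between the CUDA version this torch build expects and the one installed, rather than a missing install.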
