DeepRec support cuda 10 #857

Open
welsonzhang opened this issue May 16, 2023 · 1 comment

@welsonzhang

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 20.04): Ubuntu 16.04
  • DeepRec version or commit id: main
  • Python version: python3.7
  • Bazel version (if compiling from source): 0.26.1
  • GCC/Compiler version (if compiling from source): gcc version 7.5.0
  • CUDA/cuDNN version: cuda 10.2 cudnn 7.6

Describe the problem
When compiling DeepRec with CUDA 10.2 and cuDNN 7.6, the build fails with a "No such file or directory" error. The details are as follows:

/root/.cache/bazel/_bazel_root/e5dd34e735e9b22c055e30807c86bf9e/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/core/_objs/embedding_gpu/gpu_hash_table.cu.pic.d (No such file or directory)
In file included from tensorflow/core/framework/embedding/gpu_hash_table.cu.cc:25:0:
external/cuCollections/include/cuco/dynamic_map.cuh:21:23: fatal error: cub/cub.cuh: No such file or directory 
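
The missing header can be checked for directly. The sketch below assumes the toolkit is installed under `/usr/local/cuda` (the prefix used in the steps in this report) and defines a hypothetical helper, `has_cub`, that tests whether `cub/cub.cuh` exists in a CUDA include directory. As far as I know, the CUDA Toolkit only began shipping CUB as a top-level `cub/` include directory with CUDA 11, so a 10.2 install would be expected to report it missing; treat that as a likely cause rather than a confirmed diagnosis.

```shell
# Assumed default install prefix; override CUDA_HOME if CUDA lives elsewhere.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# has_cub DIR: succeed if DIR/include/cub/cub.cuh is a regular file.
# (Hypothetical helper, not part of DeepRec or CUDA.)
has_cub() {
  [ -f "$1/include/cub/cub.cuh" ]
}

if has_cub "$CUDA_HOME"; then
  echo "cub/cub.cuh found under $CUDA_HOME/include"
else
  echo "cub/cub.cuh missing under $CUDA_HOME/include"
fi
```

If the header is reported missing, the `#include` in `cuco/dynamic_map.cuh` cannot be resolved from the toolkit's include path alone.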

Provide the exact sequence of commands / steps that you executed before running into the problem

  1. Install CUDA and cuDNN
wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run

https://developer.nvidia.com/rdp/cudnn-archive
tar -xzvf cudnn-xxx.tar.gz
sudo cp cuda/include/* /usr/local/cuda/include
sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
  2. Build DeepRec with Bazel
bazel build -c opt --config=opt //tensorflow:libtensorflow_cc.so


@liutongxuan
Member

Good catch. DeepRec's CI/CD currently builds against CUDA 11, so we haven't caught up on the compatibility issue with CUDA 10.

In any case, we suggest upgrading to CUDA 11 or CUDA 12 (which is also supported in DeepRec); both also perform better.
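
To see which side of that recommendation a machine is on, the installed toolkit's major version can be read from `nvcc --version`. This is a minimal sketch: `cuda_major` is a hypothetical helper, and it assumes the output contains a line of the form `... release 10.2, V10.2.89`, which is what recent toolkits print.

```shell
# cuda_major: read `nvcc --version` output on stdin and print the
# major release number (e.g. 10 for "release 10.2").
# (Hypothetical helper; relies on the "release X.Y" wording in nvcc's output.)
cuda_major() {
  sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
  major="$(nvcc --version | cuda_major)"
  if [ "${major:-0}" -ge 11 ]; then
    echo "CUDA major version ${major}: within the range suggested above"
  else
    echo "CUDA major version ${major:-unknown}: older than CUDA 11"
  fi
else
  echo "nvcc not found on PATH"
fi
```

After upgrading, rerunning the same check is a quick way to confirm that the build will pick up the new toolkit rather than a stale 10.x install earlier on `PATH`.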
