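The system CUDA version is 11.1, and torch 2.0.0 is installed in the virtual environment. The first problem appeared when importing torch: ImportError: libcupti.so.11.7: cannot open shared object file: No such file or directory
The fix for this problem is as follows:
Check whether the file exists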
locate libcupti.so.11.7
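Reinstalling torch in that virtual environment still did not create the nvidia directory.
Copy the file from another location that has it into the corresponding package directory of the virtual environment.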
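The search-and-copy step can also be scripted. The following is a minimal sketch, not the exact procedure used here: `find_library` and `copy_library` are hypothetical helper names, and the directories in the usage note are placeholders to replace with your own.

```python
from pathlib import Path
import shutil


def find_library(root, filename):
    """Return every regular file under `root` whose name equals `filename`."""
    return sorted(p for p in Path(root).rglob(filename) if p.is_file())


def copy_library(source, target_dir):
    """Copy `source` into `target_dir`, creating the directory if needed."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 keeps file metadata and returns the destination path.
    return Path(shutil.copy2(source, target_dir))
```

For example, `find_library("/path/to/another/env", "libcupti.so.11.7")` lists candidate copies, and `copy_library(candidates[0], "/path/to/this/env/.../nvidia/...")` places one in the current environment. The target directory is an assumption: it has to match the location where your torch build looks for the library.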
After fixing the problem above, another one appeared: RuntimeError: GET was unable to find an engine to execute this computation
The fix for this problem:
Check whether torch and CUDA are usable
import torch
print(torch.__version__)
print(torch.cuda.is_available())
Test
import torch
print(torch.cuda.is_available())
num_gpu = 1
# Decide which device to run on
device = torch.device("cuda:0" if (torch.cuda.is_available() and num_gpu > 0) else "cpu")
print(device)
print(torch.cuda.get_device_name(0))
print(torch.rand(3,3).cuda())
Check whether cuDNN is available
print(torch.backends.cudnn.version())
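The individual checks above can be gathered into one report, which makes it easier to spot a mismatch between the CUDA version torch was built for and what is installed on the machine. This is a minimal sketch; `cuda_report` is a hypothetical helper name, and it only uses `torch.version.cuda`, `torch.cuda.is_available()`, `torch.backends.cudnn.is_available()` and `torch.backends.cudnn.version()`.

```python
import importlib


def cuda_report():
    """Collect torch, CUDA and cuDNN details into a dict without raising."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError) as exc:
        # A missing shared library such as libcupti.so.11.7 ends up here.
        return {"import_error": str(exc)}
    return {
        "torch": torch.__version__,
        # CUDA version torch was built against (None for CPU-only builds).
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "cudnn_available": torch.backends.cudnn.is_available(),
        # None when cuDNN cannot be loaded.
        "cudnn_version": torch.backends.cudnn.version(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` reports 11.7 while the machine only has CUDA 11.1 libraries, as in this setup, that mismatch is a likely candidate for the cause of the errors.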
Download CUDA from the official website
CUDA official download page
Install CUDA
Download and extract cuDNN
Local Installers for Windows and Linux, Ubuntu(x86_64, armsbsa)
Install cuDNN
Set the environment variables
export PATH=/opt/xxx/soft/cuda-11.7/bin:$PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/xxx/soft/cuda-11.7/lib64
export CUDA_HOME=/opt/xxx/soft/cuda-11.7
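Before rerunning the code, it can help to confirm that the three variables really took effect in the current shell. The sketch below is one way to do that; `check_cuda_env` is a hypothetical helper, and the CUDA root passed to it should be the same placeholder path used in the exports.

```python
import os


def check_cuda_env(env, cuda_home):
    """Return a list of problems found in `env` for the given CUDA root."""
    problems = []
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME is not set to " + cuda_home)
    bin_dir = os.path.join(cuda_home, "bin")
    if bin_dir not in env.get("PATH", "").split(os.pathsep):
        problems.append(bin_dir + " is missing from PATH")
    lib_dir = os.path.join(cuda_home, "lib64")
    if lib_dir not in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        problems.append(lib_dir + " is missing from LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    # Placeholder path copied from the exports; replace with the real one.
    for problem in check_cuda_env(os.environ, "/opt/xxx/soft/cuda-11.7"):
        print(problem)
```

An empty result means all three settings are visible to the Python process; any printed line names the variable that still needs fixing.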
Run the code again; the problem is solved
References
RuntimeError: GET was unable to find an engine to execute this computation