Solving: 1. TensorFlow runs on the CPU instead of the GPU 2. Checking the GPU seen by TensorFlow 3. Matching TensorFlow, CUDA and cuDNN versions 4. Checking the CUDA and cuDNN versions

2022-08-09 10:45:00 Fuzzy Pack

This article addresses the following three questions. It takes about ten minutes to read, and similar problems can be solved the same way:

  1. Check the GPU that TensorFlow sees in the current environment
  2. Fix TensorFlow running on the CPU instead of the GPU
  3. Match the TensorFlow, CUDA and cuDNN versions

We work through these three problems from the beginning.
We assume you use an Anaconda virtual environment and have installed tensorflow-gpu, as well as CUDA and cuDNN:
Install Anaconda
Install tensorflow-gpu
Install CUDA and cuDNN

Problem 1: Checking versions

Check the CUDA and cuDNN versions

# Check the CUDA version
$  nvcc -V
$output ==>
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
# Check the cuDNN version
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
$output ==>
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
# i.e. cuDNN version 7.6.5
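The same three `#define` lines can also be read programmatically. The following is a minimal sketch, not part of the original article: the helper name `cudnn_version_from_header` is hypothetical, and the sample text simply repeats the output shown above. Note that on cuDNN 8 and later these macros live in `cudnn_version.h` rather than `cudnn.h`, so the file you read may differ.

```python
import re

def cudnn_version_from_header(header_text):
    """Build 'MAJOR.MINOR.PATCHLEVEL' from the #define lines of a cuDNN header.

    Returns None if any of the three macros is missing.
    """
    parts = []
    for key in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(
            rf"^#define\s+CUDNN_{key}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)

# Sample text copied from the grep output above (an assumption for the demo;
# in practice you would read the header file from your CUDA include directory).
sample_header = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

print(cudnn_version_from_header(sample_header))
```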

Now the key part: check the GPU that TensorFlow sees in your current environment (which may be a virtual environment) as follows:

$ ipython
In [1]: import tensorflow as tf
In [2]: gpu_device_name = tf.test.gpu_device_name()
'''output==>'''
.............(omitted)..........................
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:27:00.0
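The interactive check above can also be wrapped in a small script. This is only a sketch: the function name `describe_gpu_support` is hypothetical, and it relies on `tf.test.gpu_device_name()` returning an empty string when no GPU is registered. If TensorFlow is not installed in the active environment, the function reports that instead of failing.

```python
import importlib.util

def describe_gpu_support():
    """Return a one-line summary of whether TensorFlow can see a GPU."""
    if importlib.util.find_spec("tensorflow") is None:
        return "tensorflow is not installed in this environment"
    import tensorflow as tf  # imported lazily so the script runs without it

    device_name = tf.test.gpu_device_name()  # empty string when no GPU is registered
    if device_name:
        return f"TensorFlow {tf.__version__} sees a GPU at {device_name}"
    return f"TensorFlow {tf.__version__} sees no GPU and will run on the CPU"

print(describe_gpu_support())
```

Run it inside the same virtual environment you use for training, since a different environment may have a different TensorFlow build.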

Problem 2: The GPU is not used

Suppose TensorFlow cannot use the GPU and falls back to the CPU. (Note: one way to tell that it is running on the CPU is that nvidia-smi shows no load on the GPU.)
If you run the check above in this situation, you will find messages like these below your graphics card information:

2019-12-29 12:10:23.761412: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2019-12-29 12:10:23.761455: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2019-12-29 12:10:23.761493: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2019-12-29 12:10:23.761532: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2019-12-29 12:10:23.761571: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2019-12-29 12:10:23.761609: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2019-12-29 12:10:23.764661: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-12-29 12:10:23.764728: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...

None of the so.10.0 libraries listed above can be found, so TensorFlow skips registering the GPU and uses the CPU instead.
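To get a quick list of what is missing, the log can be scanned for the `Could not dlopen library` message. The sketch below assumes the message format shown above (it may differ in other TensorFlow versions); the helper name `missing_libraries` is hypothetical, and the sample log is abbreviated from the output above.

```python
import re

# Matches the quoted library name in lines such as:
#   Could not dlopen library 'libcudart.so.10.0'; dlerror: ...
DLOPEN_FAILURE = re.compile(r"Could not dlopen library '([^']+)'")

def missing_libraries(log_text):
    """Return the library names that failed to load, without duplicates,
    in the order they first appear in the log."""
    names = []
    for name in DLOPEN_FAILURE.findall(log_text):
        if name not in names:
            names.append(name)
    return names

# Abbreviated sample taken from the log above.
sample_log = """\
I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file
I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file
I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
"""

for library in missing_libraries(sample_log):
    print(library)
```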

Solution

Note: if the missing files are so.9.0, use so.9.0 as the link name instead; the same applies to other versions.
We can create the symbolic links as follows:

  1. libcudart
# /usr/local/cuda is a symbolic link to the versioned cuda-<version> directory; the same applies below
sudo ln -s /usr/local/cuda/lib64/libcudart.so.10.1 /usr/local/cuda/lib64/libcudart.so.10.0
  2. libcufft
sudo ln -s /usr/local/cuda/lib64/libcufft.so.10.1.168 /usr/local/cuda/lib64/libcufft.so.10.0
  3. libcurand
sudo ln -s /usr/local/cuda/lib64/libcurand.so.10.1.168 /usr/local/cuda/lib64/libcurand.so.10.0
  4. libcusolver
sudo ln -s /usr/local/cuda/lib64/libcusolver.so.10.1.168 /usr/local/cuda/lib64/libcusolver.so.10.0
  5. libcusparse
sudo ln -s /usr/local/cuda/lib64/libcusparse.so.10.1.168 /usr/local/cuda/lib64/libcusparse.so.10.0
  6. libcublas
# With CUDA 10.1 the library is located here
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10.2.0.168 /usr/local/cuda/lib64/libcublas.so.10.0

Note: for version 10.0 and below, if there is no libcublas library under /usr/lib/x86_64-linux-gnu/, look for it in /usr/local/cuda10.1/targets/x86_64-linux/lib/.
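Because the exact file names (for example the `.168` suffix) vary between installations, it can help to search for candidates before typing the `ln -s` commands. The following is a dry-run sketch under stated assumptions: `plan_symlinks` is a hypothetical helper, it only prints commands and creates no links, and it picks the first matching file in sorted order, which you should review by hand. The demo uses a temporary directory with an empty placeholder file rather than a real CUDA installation.

```python
import glob
import os
import tempfile

def plan_symlinks(missing_sonames, search_dirs, link_dir):
    """Return (target, link_path) pairs for each missing library name
    that has at least one differently versioned file in search_dirs."""
    plan = []
    for soname in missing_sonames:
        stem = soname.split(".so.")[0]  # e.g. "libcudart"
        candidates = []
        for directory in search_dirs:
            pattern = os.path.join(directory, stem + ".so.*")
            candidates.extend(
                path for path in glob.glob(pattern)
                if os.path.basename(path) != soname
            )
        if candidates:
            # First candidate in sorted order; check that it is the one you want.
            plan.append((sorted(candidates)[0], os.path.join(link_dir, soname)))
    return plan

# Demo with a placeholder file standing in for a real library.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "libcudart.so.10.1"), "w").close()
    wanted = ["libcudart.so.10.0", "libcufft.so.10.0"]
    for target, link in plan_symlinks(wanted, [demo_dir], demo_dir):
        print(f"sudo ln -s {target} {link}")
```

On a real machine you would pass directories such as `/usr/local/cuda/lib64` and `/usr/lib/x86_64-linux-gnu` as `search_dirs`, matching the locations used in the commands above.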

Problem 3: Version matching

From Problem 1 we can see that if the versions do not match, errors are reported. Such errors are usually caused by a version mismatch, and the messages also tell you which version is expected, so you can use them to fix the problem.
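As an illustration of reading the expected version out of the error message, the sketch below compares the release reported by `nvcc -V` with the version suffix of the `libcudart` file that TensorFlow tried to load. The function names are hypothetical, and the approach assumes the message and `nvcc` output formats shown earlier in this article.

```python
import re

def installed_cuda_release(nvcc_output):
    """Extract '10.1' from a line like 'Cuda compilation tools, release 10.1, V10.1.243'."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

def expected_cuda_release(dlopen_message):
    """Extract '10.0' from a message mentioning 'libcudart.so.10.0'."""
    match = re.search(r"libcudart\.so\.(\d+\.\d+)", dlopen_message)
    return match.group(1) if match else None

# Sample strings copied from the outputs shown earlier in the article.
nvcc_line = "Cuda compilation tools, release 10.1, V10.1.243"
error_line = "Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file"

installed = installed_cuda_release(nvcc_line)
expected = expected_cuda_release(error_line)
if installed != expected:
    print(f"Mismatch: TensorFlow expects CUDA {expected}, but CUDA {installed} is installed")
else:
    print(f"CUDA {installed} matches what TensorFlow expects")
```

In this article's case the installed toolkit is 10.1 while TensorFlow looks for 10.0, which is exactly the mismatch that the symbolic links in Problem 2 work around.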

Copyright notice
This article was written by [Fuzzy Pack]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/221/202208091041530329.html