Using an RTX 3080 Ti to run source code written for tensorflow-gpu 1.x

2022-04-23 20:48:00 NuerNuer

Environment: Anaconda3, Ubuntu 18.04, RTX 3080 Ti, Python 3.7

The RTX 3080 Ti is a relatively new graphics card with strong computing power. It uses the Ampere architecture and only supports CUDA 11.x and above. For the correspondence between tf-gpu, CUDA, and cuDNN versions, see: Build from source | TensorFlow
For the correspondence between the driver and the CUDA and cuDNN versions, see: Release Notes :: CUDA Toolkit Documentation

## Problem 1: The driver for my 30-series card is 450.x.x or higher. By the principle of backward compatibility, can I use CUDA 10.x?

The normal answer is no, because 30-series cards normally only support CUDA 11.x. (There is also an exceptional case, but it is troublesome to install. Only the normal case is discussed here; a link for the exceptional case is given below.) This question puzzled me for a while. I had always trusted backward compatibility and believed my 3080 Ti would support CUDA 10.x, but several tests showed that it does not work. As we all know, `conda install` of tf-gpu automatically installs the matching cudatoolkit and cudnn, which saves a lot of trouble. During installation, however, I found that the highest tensorflow-gpu version conda can find is 2.4.1, and its matching cudatoolkit is 10.1.183. In an environment installed this way, tf.test.is_gpu_available() prints True, but the program gets stuck at an odd point when the code actually runs.

It then reports a low-level error. CUDA and cuDNN load normally but cannot actually be used, because 30-series cards only support CUDA 11.x. This means that versions below tf-gpu=2.x cannot run on the card, so tf-gpu=1.x is ruled out. For the solution, read on to problems 2 and 3.
PS, the exceptional case: Google no longer maintains tf1.x, but Nvidia maintains a tf-gpu=1.15.x build that can run on 30-series cards. This method requires access through a proxy or VPN and a specific Ubuntu version (ubuntu20.x), so I did not try it. If you are interested, give it a try. Link:
https://blog.csdn.net/wu496963386/article/details/109583045
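
The symptom described in problem 1 (the availability check prints True, yet real work hangs) can be narrowed down by asking TensorFlow which CUDA version it was built against and then running one actual GPU operation. The following is a minimal sketch, not the author's procedure: the helper names `cuda_major`, `built_for_30_series`, and `check_tensorflow_gpu` are hypothetical, the "CUDA 11 or newer" threshold is taken from the article, and `tf.sysconfig.get_build_info()` assumes TensorFlow 2.3 or later on Linux.

```python
def cuda_major(version):
    """Return the major number of a CUDA version string such as '10.1.183'."""
    return int(str(version).split(".")[0])


def built_for_30_series(cuda_version):
    """True if the CUDA version meets the article's requirement for 30-series cards."""
    # Assumption taken from the article: 30-series cards need CUDA 11.x or newer.
    return cuda_major(cuda_version) >= 11


def check_tensorflow_gpu():
    """Print the CUDA build of the installed TensorFlow and run one real GPU op."""
    # Imported here so the two helpers above work without TensorFlow installed.
    import tensorflow as tf

    build = tf.sysconfig.get_build_info()  # TF 2.3+; CPU-only builds lack CUDA keys
    cuda_version = build.get("cuda_version")
    print("TensorFlow:", tf.__version__, "| built against CUDA:", cuda_version)
    if cuda_version is not None and not built_for_30_series(cuda_version):
        print("This build targets CUDA < 11; expect trouble on a 30-series card.")

    gpus = tf.config.list_physical_devices("GPU")
    print("Visible GPUs:", gpus)
    if gpus:
        # Seeing a device proves little; launch an actual kernel on it.
        with tf.device("/GPU:0"):
            a = tf.random.normal((1024, 1024))
            print("matmul checksum:", float(tf.reduce_sum(tf.matmul(a, a))))
```

Calling `check_tensorflow_gpu()` inside the conda environment from problem 1 should report a CUDA 10.1 build, which is the mismatch the article describes. In TensorFlow 2.x, `tf.config.list_physical_devices("GPU")` is the non-deprecated replacement for `tf.test.is_gpu_available()`.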

## Problem 2: Can tensorflow-gpu in my virtual environment skip the cudatoolkit and cudnn that conda installs automatically and use the system-wide CUDA and cuDNN directly (provided they are already configured)?

The answer is yes. The highest tensorflow-gpu version conda can find is 2.4.1, and its matching cudatoolkit is 10.1.183, which our 30-series card cannot use directly. Because conda installs and uninstalls tf-gpu together with cudatoolkit and cudnn as a bundle, we cannot use conda to install tf-gpu. First uninstall the tf-gpu that conda installed; this also removes cudatoolkit and cudnn. When the virtual environment contains no cudatoolkit and cudnn, TensorFlow uses the system-wide CUDA and cuDNN. Then install tf-gpu in the virtual environment with pip. My system CUDA version is 11.2, so I installed tf-gpu=2.6.1. Many readers will ask: but my code is written for 1.x, what should I do? See problem 3.
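
Choosing the pip version to match the system CUDA can be sketched as a small lookup. The table below is a hand-copied subset of the Linux GPU rows from the "tested build configurations" table on the Build from source | TensorFlow page, so treat it as an assumption and check the official table before relying on it. The names `TESTED_CONFIGS` and `releases_for_cuda` are hypothetical.

```python
# (TensorFlow release series, CUDA version, cuDNN version).
# Hand-copied subset of the official tested-configuration table; verify
# against the TensorFlow site before use.
TESTED_CONFIGS = [
    ("1.15", "10.0", "7.4"),
    ("2.0", "10.0", "7.4"),
    ("2.1", "10.1", "7.6"),
    ("2.2", "10.1", "7.6"),
    ("2.3", "10.1", "7.6"),
    ("2.4", "11.0", "8.0"),
    ("2.5", "11.2", "8.1"),
    ("2.6", "11.2", "8.1"),
]


def releases_for_cuda(cuda_version):
    """Return the TensorFlow release series tested against the given CUDA version.

    Only the major.minor part is compared, so '11.2.142' is treated as '11.2'.
    """
    major_minor = ".".join(cuda_version.split(".")[:2])
    return [tf for tf, cuda, _cudnn in TESTED_CONFIGS if cuda == major_minor]


print(releases_for_cuda("11.2.142"))  # the author's system CUDA
print(releases_for_cuda("10.1.183"))  # the cudatoolkit that conda bundles
```

For the author's CUDA 11.2.142 the lookup yields the 2.5 and 2.6 series, which is consistent with the tf-gpu=2.6.1 chosen in the article. No 1.x release appears for any CUDA 11 row, which is why problem 3 is needed.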

## Problem 3: Compared with tf1.x, many functions in tf2.x have changed and many have been removed. What do I do?

From problem 1 we already know that the standard tf-gpu=1.x builds cannot run on 30-series cards. To use the graphics card you must use tf-gpu=2.x, so the source code has to be modified to run under tf-gpu=2.x. The main changes are:

1. When importing tensorflow, use:

import tensorflow.compat.v1 as tf  # replaces `import tensorflow as tf`
tf.disable_v2_behavior()

A tf imported this way does not contain contrib, because tf2.x removed that package.
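
To illustrate what the compat import keeps working, here is a minimal sketch of a tf1.x-style graph (placeholder, `get_variable`, `Session`) running on a TensorFlow 2.x installation. The function name `run_tf1_style_graph` and the toy graph are hypothetical and not taken from the author's code. The import sits inside the function only so that the snippet can be loaded on a machine without TensorFlow; in a real script the two lines go at the top of the file, as in the article.

```python
def run_tf1_style_graph():
    """Build and run a tiny TF1-style graph through the compat.v1 API."""
    # In a real script these two lines belong at the top of the file.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    # Start from an empty graph so the function can be called more than once.
    tf.reset_default_graph()

    # Classic TF1 constructs: a placeholder, a variable, and a session.
    x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
    w = tf.get_variable("w", shape=[2, 1], initializer=tf.ones_initializer())
    y = tf.matmul(x, w)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out = sess.run(y, feed_dict={x: [[3.0, 4.0]]})
    return float(out[0][0])
```

With the all-ones weight the call returns 3 + 4 = 7.0. Code that touches `tf.contrib` still fails under this import and needs the replacements listed in item 2.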

2. Replace initializers and functions that came from the contrib package:

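tf1.x:   tf.contrib.layers.xavier_initializer() -->

tf2.x:   tf.keras.initializers.glorot_uniform()

tf1.x:   tf.contrib.layers.l2_regularizer(0.01) -->

tf2.x:   tf.keras.regularizers.l2(0.01)

Note: xavier_initializer() defaults to uniform=True, which corresponds to glorot_uniform(); use glorot_normal() only if the original call passed uniform=False. Also, l2_regularizer(scale) computes scale * sum(w^2) / 2, while tf.keras.regularizers.l2(l) computes l * sum(w^2), so l2(0.005) reproduces the old penalty exactly.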

  When revising, work through the error messages one at a time; the changes should not be extensive.
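
When swapping the contrib regularizer from item 2 for its Keras counterpart, note that the two scale the penalty differently. As far as I know, `tf.contrib.layers.l2_regularizer(scale)` is built on `tf.nn.l2_loss` and so includes a factor of 1/2, while `tf.keras.regularizers.l2(l)` does not. The plain-Python sketch below mirrors the two formulas so the difference can be checked by hand. It does not call TensorFlow, and the function names are hypothetical.

```python
import math


def contrib_l2_penalty(weights, scale):
    """Penalty of tf.contrib.layers.l2_regularizer(scale): scale * sum(w^2) / 2."""
    return scale * sum(w * w for w in weights) / 2.0


def keras_l2_penalty(weights, l2):
    """Penalty of tf.keras.regularizers.l2(l2): l2 * sum(w^2)."""
    return l2 * sum(w * w for w in weights)


weights = [1.0, 2.0, 3.0]  # sum of squares = 14

old = contrib_l2_penalty(weights, 0.01)      # 0.01 * 14 / 2
same_arg = keras_l2_penalty(weights, 0.01)   # 0.01 * 14, twice the old penalty
halved = keras_l2_penalty(weights, 0.005)    # 0.005 * 14, matches the old penalty

print(old, same_arg, halved)
print("same argument matches:", math.isclose(old, same_arg))
print("halved argument matches:", math.isclose(old, halved))
```

Keeping the argument at 0.01 still trains, but with twice the regularization strength of the original tf1.x code. Halving it keeps the loss numerically comparable with results obtained under tf1.x.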


Finally, my current configuration: Nvidia driver 465.31, tf-gpu=2.6.1, cuda=11.2 (11.2.142), cudnn=8.1.1

One last question: according to Build from source | TensorFlow, tf-gpu=2.4.0 should be paired with cuda=11.0, so why does conda pair it with cudatoolkit=10.1.x? Presumably the website lists only the combinations that are known to work, not every possible one. To test whether the conda combination is usable, you could use a 20-series card with a driver that matches CUDA 10.1.

Copyright notice
This article was written by [NuerNuer]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204210545522749.html