
PyTorch 18. torch.backends.cudnn

2022-04-23 07:29:00 【DCGJ666】


Preface
Rules
Background knowledge

Preface

torch.backends.cudnn.benchmark=True

When this flag is set to True, PyTorch spends some time at the start searching for the convolution algorithm best suited to each convolution in the current network, which can improve training efficiency. If the input image size keeps changing, however, enabling the flag slows training down.

Rules

If the dimensions and data type of the network's input stay essentially fixed (that is, every batch has the same shape as the input the network was first run with), setting this flag improves efficiency.

If the network's input changes at every iteration, cuDNN has to search for the optimal configuration each time, which reduces efficiency.
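The two rules can be illustrated with a toy model of the behaviour. This is only a sketch, not how cuDNN or PyTorch is implemented: the `ToyAutotuner` class, its shape-keyed cache, and the `count_tuning_runs` helper are hypothetical names made up for this example. It assumes that the costly search is repeated whenever an unseen input shape arrives and that the result is reused otherwise.

```python
class ToyAutotuner:
    """Toy stand-in for benchmark mode: search once per new input shape, then reuse."""

    def __init__(self):
        self.cache = {}        # input shape -> name of the chosen algorithm
        self.tuning_runs = 0   # how many times the costly search was triggered

    def algorithm_for(self, shape):
        if shape not in self.cache:
            # Cache miss: a real search would time several algorithms here.
            self.tuning_runs += 1
            self.cache[shape] = "algo_for_%dx%d" % shape[-2:]
        return self.cache[shape]


def count_tuning_runs(shapes):
    """Feed a sequence of input shapes through the toy tuner and count the searches."""
    tuner = ToyAutotuner()
    for shape in shapes:
        tuner.algorithm_for(shape)
    return tuner.tuning_runs


# 100 iterations with one fixed input shape.
fixed = [(32, 3, 224, 224)] * 100
# 100 iterations where the image size is different every time.
varying = [(32, 3, 200 + i, 200 + i) for i in range(100)]

print("searches with fixed shapes:  ", count_tuning_runs(fixed))
print("searches with varying shapes:", count_tuning_runs(varying))
```

With fixed shapes the search cost is paid once and amortized over the whole run; with constantly changing shapes it is paid again and again, which is the slowdown described above.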

Background knowledge

cuDNN

cuDNN is NVIDIA's GPU acceleration library built specifically for deep neural networks. It contains many low-level optimizations for common operations such as convolution and pooling, and it is much faster than typical GPU code. Most mainstream deep learning frameworks support cuDNN. When running on a GPU, PyTorch uses cuDNN for acceleration by default. However, even when cuDNN is in use, torch.backends.cudnn.benchmark defaults to False.
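To see these settings on a given machine, the flags can simply be read back. The snippet below is a small sketch that assumes PyTorch is installed; the helper name `cudnn_settings` is made up for this example, while the attributes it reads live under torch.backends.cudnn.

```python
import torch


def cudnn_settings():
    """Collect the cuDNN-related flags PyTorch exposes under torch.backends.cudnn."""
    return {
        # Whether this PyTorch build and machine can use cuDNN at all.
        "available": torch.backends.cudnn.is_available(),
        # Whether PyTorch is allowed to dispatch to cuDNN.
        "enabled": torch.backends.cudnn.enabled,
        # Whether the algorithm search described in this article is switched on.
        "benchmark": torch.backends.cudnn.benchmark,
    }


settings = cudnn_settings()
for name, value in settings.items():
    print(name, "=", value)
```

In a fresh Python process, before any script has changed it, the benchmark entry should read False.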

Convolution operation

The convolution layer is the most important part of a convolutional neural network and also the most computationally expensive. If convolution can be made more efficient in the low-level code, training speeds up without any change to the given network architecture.

Convolution can be implemented in several ways:

Direct method: multiple nested loops
GEMM (General Matrix Multiply)
FFT (fast Fourier transform): transform to the frequency domain, multiply, then transform back
Winograd-based algorithms
…
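As a rough illustration of the first two entries, here is a pure-Python sketch of the direct method and of a GEMM-style formulation that first unfolds the input patches (often called im2col). The function names and the tiny 4x4 input are assumptions chosen for this example; real libraries work on batched, multi-channel tensors and are far more elaborate. Like deep learning frameworks, both functions compute cross-correlation (the kernel is not flipped), with no padding and stride 1. FFT and Winograd are not shown.

```python
def conv2d_direct(image, kernel):
    """Direct method: nested loops over every output position and kernel tap."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            for u in range(kh):
                for v in range(kw):
                    out[i][j] += image[i + u][j + v] * kernel[u][v]
    return out


def conv2d_gemm(image, kernel):
    """GEMM method: unfold patches into rows, then do one matrix product."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    # Each row of `cols` is one flattened kh*kw input patch.
    cols = [
        [image[i + u][j + v] for u in range(kh) for v in range(kw)]
        for i in range(oh)
        for j in range(ow)
    ]
    flat_kernel = [kernel[u][v] for u in range(kh) for v in range(kw)]
    # With a single filter, the matrix multiply reduces to a matrix-vector product.
    flat_out = [sum(a * b for a, b in zip(row, flat_kernel)) for row in cols]
    # Fold the flat result back into an oh x ow grid.
    return [flat_out[r * ow:(r + 1) * ow] for r in range(oh)]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 2],
    [3, 4],
]

direct = conv2d_direct(image, kernel)
gemm = conv2d_gemm(image, kernel)
print(direct)
print("methods agree:", direct == gemm)
```

Both functions produce the same numbers; they differ only in how the work is organized, which is why a library can switch between such algorithms freely and pick whichever runs fastest for a given layer.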

torch.backends.cudnn.benchmark

With torch.backends.cudnn.benchmark = True, PyTorch tunes the model's convolution layers ahead of time: for each convolution layer it tries the convolution algorithms that cuDNN provides and picks the fastest one. The model therefore pays a small amount of extra preprocessing time at startup, and in return the training time can drop considerably.
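The "try every algorithm, keep the fastest" idea can be sketched in a few lines of plain Python. This is not cuDNN's actual procedure: the two 1D convolution candidates, the `pick_fastest` helper, and the choice of three timing repeats are all assumptions made for this example.

```python
import time


def conv1d_loops(signal, kernel):
    """Candidate 1: explicit nested loops."""
    n_out = len(signal) - len(kernel) + 1
    out = []
    for i in range(n_out):
        acc = 0
        for j, w in enumerate(kernel):
            acc += signal[i + j] * w
        out.append(acc)
    return out


def conv1d_sliding_sum(signal, kernel):
    """Candidate 2: the same arithmetic written with zip over sliding windows."""
    k = len(kernel)
    return [
        sum(x * w for x, w in zip(signal[i:i + k], kernel))
        for i in range(len(signal) - k + 1)
    ]


def pick_fastest(candidates, signal, kernel, repeats=3):
    """Time every candidate on the actual input and return the quickest one's name."""
    timings = {}
    for name, fn in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(signal, kernel)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.get), timings


candidates = {"loops": conv1d_loops, "sliding_sum": conv1d_sliding_sum}
signal = list(range(2000))
kernel = [1, 0, -1]

winner, timings = pick_fastest(candidates, signal, kernel)
print("fastest candidate:", winner)
```

Which candidate wins depends on the machine and the input size, which is the reason the choice is measured at run time rather than fixed in advance.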

Where to set it

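import torch

# `args` is assumed to come from the script's own argument parser and to carry a boolean `use_gpu`.
if args.use_gpu and torch.cuda.is_available():
    device = torch.device('cuda')
    # Set the flag once, before the model runs, and only on the GPU path.
    torch.backends.cudnn.benchmark = True
else:
    device = torch.device('cpu')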