PyTorch 22. A collection of common PyTorch code snippets
2022-04-23 07:29:00 【DCGJ666】
PyTorch: a collection of common code snippets
- Imports and version queries
- Reproducibility
- GPU settings
- Tensor processing
- Multi-GPU synchronized BN (batch normalization)
- Count the total number of model parameters
- Load the matching parts of another model into a new model
- Other notes
Reference: https://zhuanlan.zhihu.com/p/104019160
Imports and version queries
import torch
import torch.nn as nn
import torchvision
print(torch.__version__)
print(torch.version.cuda)              # CUDA version
print(torch.backends.cudnn.version())  # cuDNN version
print(torch.cuda.get_device_name(0))   # device name
Reproducibility
Complete reproducibility cannot be guaranteed across different hardware devices (CPU, GPU), even with identical random seeds. On the same device, however, reproducibility should be achievable. The approach is to fix torch's random seed at the start of the program, together with numpy's random seed.
import numpy as np

np.random.seed(0)
torch.manual_seed(0)           # seed CPU random number generation so results are deterministic
torch.cuda.manual_seed_all(0)  # seed all GPUs so results are deterministic
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
GPU settings
If you only need one GPU:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
If you need to specify multiple GPUs, e.g. GPUs 0 and 1:
import os
os.environ['CUDA_VISIBLE_DEVICES']='0,1'
You can also select GPUs on the command line when running the code:
CUDA_VISIBLE_DEVICES=0,1 python train.py
Clearing GPU memory
torch.cuda.empty_cache()
You can also reset the GPU from the command line:
nvidia-smi --gpu-reset -i [gpu_id]
Tensor processing
Tensor data types
PyTorch has 9 CPU tensor types and 9 GPU tensor types.
Basic tensor information
tensor = torch.randn(3,4,5)
print(tensor.type()) # data type
print(tensor.size()) # tensor shape, a torch.Size (tuple-like)
print(tensor.dim()) # The number of dimensions
Named tensors
Naming tensor dimensions is very useful: it lets you index and manipulate dimensions by name, which greatly improves readability and ease of use and helps prevent mistakes.
# Before PyTorch 1.3, you had to rely on comments
# Tensor [N,C,H,W]
images = torch.randn(32,3,56,56)
images.sum(dim=1)
images.select(dim=1, index=0)
# From PyTorch 1.3 onwards
NCHW = ['N', 'C', 'H', 'W']
images = torch.randn(32, 3, 56, 56, names=NCHW)
images.sum('C')
images.select('C', index=0)
# It can also be set like this
tensor = torch.randn(3,4,1,2,names=('C','N','H','W'))
# align_to makes it easy to reorder dimensions
tensor = tensor.align_to('N','C','H','W')
Data type conversion
# Set the default type; in PyTorch, FloatTensor is much faster than DoubleTensor
torch.set_default_tensor_type(torch.FloatTensor)
# Type conversion
tensor = tensor.cuda()
tensor = tensor.cpu()
tensor = tensor.float()
tensor = tensor.long()
Conversion between torch.Tensor and np.ndarray
Except for CharTensor, all other CPU tensors support conversion to numpy format and back.
ndarray = tensor.cpu().numpy()
tensor = torch.from_numpy(ndarray).float()
tensor = torch.from_numpy(ndarray.copy()).float() # if ndarray has negative stride
Conversion between torch.Tensor and PIL.Image
# torch.Tensor -> PIL.Image
# PyTorch tensors are in [C,H,W] (or [N,C,H,W]) order with values in [0,1], so they need to be permuted and rescaled
image = PIL.Image.fromarray(torch.clamp(tensor*255, min=0, max=255).byte().permute(1,2,0).cpu().numpy())
image = torchvision.transforms.functional.to_pil_image(tensor)
# PIL.Image -> torch.Tensor
path = r'./figure.jpg'
tensor = torch.from_numpy(np.asarray(PIL.Image.open(path))).permute(2,0,1).float() / 255
tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path))
Extract values from a tensor that contains only one element
value = torch.rand(1).item()
Tensor reshaping
# Tensors usually need to be reshaped when feeding convolutional-layer output into a fully connected layer
# Compared with torch.view, torch.reshape handles non-contiguous input tensors automatically
tensor = torch.rand(2,3,4)
shape=(6, 4)
tensor = torch.reshape(tensor, shape)
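The difference matters for non-contiguous tensors; a minimal sketch, assuming a transposed tensor (which is non-contiguous):
t = torch.rand(3, 4).t()       # transposing makes the tensor non-contiguous
# t.view(12) would raise a RuntimeError here, because view requires contiguous memory
t2 = t.reshape(12)             # reshape copies the data if necessary and succeeds
t3 = t.contiguous().view(12)   # equivalent alternative using view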
Shuffling
tensor = tensor[torch.randperm(tensor.size(0))] # shuffle along the first dimension
Copy tensor
# Operation | New/Shared memory | Still in computation graph |
tensor.clone() # | New | Yes |
tensor.detach() # | Shared | No |
tensor.detach().clone() # | New | No |
Get non-zero elements
torch.nonzero(tensor) # indices of non-zero elements
torch.nonzero(tensor==0) # indices of zero elements
torch.nonzero(tensor).size(0) # number of non-zero elements
torch.nonzero(tensor == 0).size(0) # number of zero elements
Check whether two tensors are equal
torch.allclose(tensor1, tensor2) #float tensor
torch.equal(tensor1, tensor2) # int tensor
Tensor expansion
# expand a 64*512 tensor to 64*512*7*7
tensor = torch.rand(64, 512)
torch.reshape(tensor, (64, 512, 1, 1)).expand(64, 512, 7, 7)
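An equivalent way to write the same expansion uses None indexing; note that expand does not copy data, so add .contiguous() if a real copy is needed. A minimal sketch:
tensor = torch.rand(64, 512)
expanded = tensor[:, :, None, None].expand(-1, -1, 7, 7)  # shape (64, 512, 7, 7), no data copied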
Multi-GPU synchronized BN (batch normalization)
When running code on multiple GPU cards with torch.nn.DataParallel, PyTorch's BN layers by default compute the mean and standard deviation independently on each card. Synchronized BN instead uses the data on all cards to compute the BN layer's mean and standard deviation, which alleviates poorly estimated statistics when the per-GPU batch size is small.
sync_bn = torch.nn.SyncBatchNorm(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
Convert all BN layers in an existing network to synchronized BN layers
def convertBNtoSyncBN(module, process_group=None):
    # Recursively replace all BN layers in the network with SyncBN layers
    if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
        sync_bn = torch.nn.SyncBatchNorm(module.num_features, module.eps, module.momentum,
                                         module.affine, module.track_running_stats, process_group)
        sync_bn.running_mean = module.running_mean
        sync_bn.running_var = module.running_var
        if module.affine:
            sync_bn.weight = torch.nn.Parameter(module.weight.clone().detach())
            sync_bn.bias = torch.nn.Parameter(module.bias.clone().detach())
        return sync_bn
    else:
        for name, child_module in module.named_children():
            setattr(module, name, convertBNtoSyncBN(child_module, process_group=process_group))
        return module
The affine argument determines whether the BN layer's parameters γ and β are learnable (when not learnable, they default to the constants 1 and 0).
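Recent PyTorch versions (1.1+) also ship a built-in helper that performs the same conversion; note that SyncBatchNorm only works with DistributedDataParallel, not DataParallel. A minimal sketch, assuming model is an existing nn.Module:
# convert every BN layer in model to SyncBatchNorm
sync_model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)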
Count the total number of model parameters
num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())
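A common variant, counting only trainable parameters (model is assumed to be an existing nn.Module):
num_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)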
Load the matching parts of another model into a new model
When loading parameters from a model, if the structures of the two models do not match, loading the parameters directly will raise an error. The following method loads only the parts of the other model that also exist in the new model.
def load_model(model, model_path, optimizer=None, resume=False,
lr=None, lr_step=None):
start_epoch = 0
checkpoint = torch.load(model_path, map_location=lambda storage, loc: storage)
print('loaded {}, epoch {}'.format(model_path, checkpoint['epoch']))
state_dict_ = checkpoint['state_dict']
state_dict = {}
# convert data_parallal to model
for k in state_dict_:
if k.startswith('module') and not k.startswith('module_list'):
state_dict[k[7:]] = state_dict_[k]
else:
state_dict[k] = state_dict_[k]
model_state_dict = model.state_dict()
# check loaded parameters and created model parameters
msg = 'If you see this, your model does not fully load the ' + \
'pre-trained weight. Please make sure ' + \
'you have correctly specified --arch xxx ' + \
'or set the correct --num_classes for your own dataset.'
for k in state_dict:
if k in model_state_dict:
if state_dict[k].shape != model_state_dict[k].shape:
print('Skip loading parameter {}, required shape{}, '\
'loaded shape{}. {}'.format(
k, model_state_dict[k].shape, state_dict[k].shape, msg))
state_dict[k] = model_state_dict[k]
else:
print('Drop parameter {}.'.format(k) + msg)
for k in model_state_dict:
if not (k in state_dict):
print('No param {}.'.format(k) + msg)
state_dict[k] = model_state_dict[k]
model.load_state_dict(state_dict, strict=False)
# resume optimizer parameters
if optimizer is not None and resume:
if 'optimizer' in checkpoint:
optimizer.load_state_dict(checkpoint['optimizer'])
start_epoch = checkpoint['epoch']
start_lr = lr
for step in lr_step:
if start_epoch >= step:
start_lr *= 0.1
for param_group in optimizer.param_groups:
param_group['lr'] = start_lr
print('Resumed optimizer with start lr', start_lr)
else:
print('No optimizer parameters in checkpoint.')
if optimizer is not None:
return model, optimizer, start_epoch
else:
return model
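For simple cases, a shorter sketch of the same idea often suffices: keep only the checkpoint entries whose names and shapes match the current model. This assumes pretrained_path points to a file containing a plain state_dict:
pretrained_dict = torch.load(pretrained_path, map_location='cpu')
model_dict = model.state_dict()
# keep only entries whose name and shape match the current model
pretrained_dict = {k: v for k, v in pretrained_dict.items()
                   if k in model_dict and v.shape == model_dict[k].shape}
model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)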
Other notes
- Don't use linear layers that are too large: nn.Linear(M, N) uses O(M*N) memory, so an overly large linear layer can easily exceed the available GPU memory.
- Don't use RNNs on sequences that are too long: RNN backpropagation uses the BPTT algorithm, and the memory required grows linearly with the input sequence length.
- Before calling model(x), use model.train() or model.eval() to switch the network state.
- Don't wrap code that needs to compute gradients in torch.no_grad().
- The difference between model.eval() and torch.no_grad(): model.eval() switches the network to evaluation mode, e.g. BN and dropout behave differently during training and testing, while torch.no_grad() turns off PyTorch's automatic differentiation for tensors, which reduces memory usage and speeds up computation, but loss.backward() cannot be called on the results (see the sketch after this list).
- model.zero_grad() zeroes the gradients of all parameters of the whole model, whereas optimizer.zero_grad() only zeroes the gradients of the parameters passed to that optimizer.
- Call optimizer.zero_grad() before loss.backward() to clear the accumulated gradients (see the sketch after this list).
- For torch.utils.data.DataLoader, set pin_memory=True whenever possible; for very small datasets such as MNIST, pin_memory=False is actually slightly faster. The fastest value of num_workers needs to be found experimentally.
- Use del to delete unused intermediate variables promptly, to save GPU memory.
- Using in-place operations can save GPU memory.
- Reduce data transfers between the CPU and the GPU.
- Using half-precision floating point via half() gives some speedup; the actual efficiency depends on the GPU model, and be careful of stability problems caused by the lower numerical precision.
- Frequently use assert tensor.size() == (N, D, H, W) as a debugging aid to make sure tensor dimensions match your assumptions.
- Apart from the labels y, try to avoid one-dimensional tensors; use n*1 two-dimensional tensors instead.
- To time each part of the code:
with torch.autograd.profiler.profile(enabled=True, use_cuda=False) as profile:
    ...
print(profile)
# or run on the command line:
python -m torch.utils.bottleneck main.py
- Use TorchSnooper to debug PyTorch code: as the program executes, it automatically prints the shape, data type, device, and requires_grad information of the tensor produced by each line.
# pip install torchsnooper
import torchsnooper
# for a function, use the decorator
@torchsnooper.snoop()
# if it is not a function, use a with statement to activate TorchSnooper and put the training loop inside it
with torchsnooper.snoop():
    # original code goes here
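A minimal sketch tying several of the notes above together (model, optimizer, criterion, train_loader, and val_loader are assumed to already exist): optimizer.zero_grad() is called before loss.backward(), and model.eval() is combined with torch.no_grad() at evaluation time.
# training: clear accumulated gradients, then backward and step
model.train()
for x, y in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
# evaluation: switch BN/dropout to eval mode and disable autograd
model.eval()
with torch.no_grad():
    for x, y in val_loader:
        pred = model(x)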