Detailed explanation of the grid_sample function in frame interpolation
2022-08-07 12:10:00 【It's Twilight】
From my earlier work on VSR to my later work on MEMC, I have used this function almost everywhere. Much of the later VSR work dropped the warp operation, so I never studied it in depth. MEMC, however, cannot do without it; otherwise the frames would have to be generated directly, end to end, by a very large network. Original post: https://blog.csdn.net/longshaonihaoa/article/details/125964061
Articles in the MEMC series:
An introductory summary of Motion Estimation and Motion Compensation (MEMC)
Paper list for deep-learning MEMC frame interpolation
Optical flow estimation: cost volume explained
Frame interpolation: grid_sample explained
1、Basic explanation of grid_sample
Official documentation:
https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html
Function prototype:
torch.nn.functional.grid_sample(input, grid, mode='bilinear', padding_mode='zeros', align_corners=None)
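Parameters:
The function has two inputs and three optional parameters.
input: the input, i.e. the original image. Shape [B,C,H,W] (C = 3 for an RGB image).
grid: the mapping table. Shape [B,H,W,2], where H and W are the output height and width; values are normalized to [-1, 1].
mode: interpolation mode. Options are bilinear 'bilinear' and nearest neighbor 'nearest' (recent PyTorch versions also accept 'bicubic').
padding_mode: edge fill mode. Options are reflection 'reflection', border 'border', and zeros 'zeros'.
align_corners: alignment mode, i.e. whether to align the corner pixels.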
What the function does:
First, let's distinguish between a coordinate and a value. In an image, a coordinate refers to a location: (2,3) denotes the position at row 2, column 3 of the image. The value is the pixel value at that position.
On grid, each coordinate holds two values, and those two values are themselves the mapped coordinates. That is why the last dimension of grid is 2, corresponding to X and Y. The X and Y values are normalized to [-1,1]; keep this in mind when using the function, because the internal implementation maps them back to the original size. The example that follows uses non-normalized grid values for ease of illustration. (Why normalize? At first I thought it was unnecessary, but I recently saw that computer graphics uses a similar normalization, so the reasoning is probably the same?) Suppose we are computing the output at coordinate (2,3). We look up the value stored in grid at coordinate (2,3), say (3,3), and then assign the value of the input image at (3,3) to the output at (2,3). In other words, grid tells each output position where in the input to read from.
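As a rough illustration of this lookup, here is a minimal sketch in plain Python with no PyTorch dependency, in which each output position reads the input at the coordinate stored in grid. It assumes nearest-neighbor sampling, integer pixel coordinates stored as (x, y) pairs (not normalized), and zero fill for out-of-range coordinates. The name `sample_nearest_unnormalized` is hypothetical and is not part of any library.

```python
def sample_nearest_unnormalized(image, grid):
    """Backward lookup: out[h][w] = image[y][x], where (x, y) = grid[h][w].

    image: list of rows (H_in x W_in) of pixel values.
    grid:  list of rows (H_out x W_out) of (x, y) integer pixel coordinates.
    Coordinates outside the image yield 0 (an assumed zero fill).
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for grid_row in grid:
        out_row = []
        for x, y in grid_row:
            inside = 0 <= x < in_w and 0 <= y < in_h
            out_row.append(image[y][x] if inside else 0)
        out.append(out_row)
    return out


image = [[1, 2],
         [3, 4]]
# Each output position stores the (x, y) input coordinate to read from.
grid = [[(1, 0), (0, 0)],   # top row of the input, horizontally flipped
        [(1, 1), (5, 5)]]   # the last entry points outside the image
result = sample_nearest_unnormalized(image, grid)
print(result)
```

The output has the shape of grid, not of the input, which is why grid is sized [B,H,W,2] with the output height and width.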
Parameter details:
padding_mode: how the value is chosen when a grid value falls outside the width and height bounds.
reflection: use the value at the point mirrored about the boundary, reflecting repeatedly until the coordinate falls within bounds.
border: use the value at the boundary.
zeros: use 0.
align_corners: a parameter inherent to bilinear interpolation, i.e. whether to align the corner pixels.
These two parameters are described in more detail in the code below.
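To make border and zeros concrete, the sketch below applies both to a single row of pixels in plain Python. It rests on the assumption that border clamps the unnormalized coordinate into [0, size - 1], while zeros leaves the coordinate alone and lets out-of-range reads return 0. The helper names are hypothetical.

```python
def clip_coordinate(coord, size):
    """'border': clamp an unnormalized coordinate into [0, size - 1]."""
    return min(size - 1, max(coord, 0))


def fetch_with_zeros(row, index):
    """'zeros': an index outside the row reads as 0."""
    return row[index] if 0 <= index < len(row) else 0


row = [10, 20, 30, 40]
indices = (-2, 1, 5)  # one index left of the row, one inside, one right of it
border_values = [row[clip_coordinate(i, len(row))] for i in indices]
zeros_values = [fetch_with_zeros(row, i) for i in indices]
print(border_values, zeros_values)
```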
2、ATen code implementation
For the complete code, refer to the official implementation.
The basic logic is as follows:
// Loop over the output pixel by pixel
for (const auto h : c10::irange(out_H)) {
  for (const auto w : c10::irange(out_W)) {
    ...
    // Process the coordinates; this function is discussed next
    scalar_t ix = grid_sampler_compute_source_index(x, inp_W, padding_mode, align_corners);
    scalar_t iy = grid_sampler_compute_source_index(y, inp_H, padding_mode, align_corners);
    if (interpolation_mode == GridSamplerInterpolation::Bilinear) {
      // Bilinear interpolation
      ...
    } else if (interpolation_mode == GridSamplerInterpolation::Nearest) {
      // Nearest-neighbor interpolation
      int64_t ix_nearest = static_cast<int64_t>(std::nearbyint(ix));
      int64_t iy_nearest = static_cast<int64_t>(std::nearbyint(iy));
      ...
    }
  }
}
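The bilinear branch is elided in the excerpt above and is not reproduced here. As a rough sketch of the general technique (standard bilinear interpolation written in plain Python, not copied from ATen), the value at a fractional source index (ix, iy) is a weighted average of its four integer neighbors. Neighbors outside the image are assumed to contribute 0, as under padding_mode='zeros'. The function name is hypothetical.

```python
import math


def bilinear_sample(image, ix, iy):
    """Sample image (a list of rows) at fractional pixel index (ix, iy)."""
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(ix), math.floor(iy)   # top-left neighbor
    x1, y1 = x0 + 1, y0 + 1                   # bottom-right neighbor
    wx, wy = ix - x0, iy - y0                 # fractional offsets in [0, 1)

    def pixel(x, y):
        # Out-of-range neighbors contribute zero (assumed zero padding).
        return image[y][x] if 0 <= x < width and 0 <= y < height else 0.0

    top = (1 - wx) * pixel(x0, y0) + wx * pixel(x1, y0)
    bottom = (1 - wx) * pixel(x0, y1) + wx * pixel(x1, y1)
    return (1 - wy) * top + wy * bottom


image = [[0.0, 10.0],
         [20.0, 30.0]]
center = bilinear_sample(image, 0.5, 0.5)   # equal weight on all four pixels
corner = bilinear_sample(image, 0.0, 0.0)   # lands exactly on the top-left pixel
print(center, corner)
```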
In fact, the most important part of the code for understanding the function is grid_sampler_compute_source_index; its code can be found in the official repository.
As shown below, it calls two functions: one is unnormalize, the other computes the coordinates.
scalar_t grid_sampler_compute_source_index(...) {
  coord = grid_sampler_unnormalize(coord, size, align_corners);
  coord = compute_coordinates(coord, size, padding_mode, align_corners);
  return coord;
}
unnormalize is implemented as follows. It behaves differently depending on align_corners. When align_corners is True, the original [-1, 1] is mapped to [0, size - 1]; when it is False, [-1, 1] is mapped to [-0.5, size - 0.5]. The code is:
scalar_t grid_sampler_unnormalize(scalar_t coord, int size, bool align_corners) {
  if (align_corners) {
    // unnormalize coord from [-1, 1] to [0, size - 1]
    return ((coord + 1.f) / 2) * (size - 1);
  } else {
    // unnormalize coord from [-1, 1] to [-0.5, size - 0.5]
    return ((coord + 1.f) * size - 1) / 2;
  }
}
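A quick numeric check of the two mappings, using a direct Python port of the function above (the port is only for illustration). With size = 4 the pixel centers sit at indices 0 to 3, so align_corners=True sends -1 and 1 to the centers of the first and last pixels, while align_corners=False sends them to the outer edges of those pixels.

```python
def grid_sampler_unnormalize(coord, size, align_corners):
    """Python port of the C++ function shown above."""
    if align_corners:
        # [-1, 1] -> [0, size - 1]
        return ((coord + 1.0) / 2) * (size - 1)
    # [-1, 1] -> [-0.5, size - 0.5]
    return ((coord + 1.0) * size - 1) / 2


size = 4
normalized = (-1.0, 0.0, 1.0)
aligned = [grid_sampler_unnormalize(c, size, True) for c in normalized]
unaligned = [grid_sampler_unnormalize(c, size, False) for c in normalized]
print(aligned)    # pixel-center convention
print(unaligned)  # pixel-edge convention
```

Both conventions map 0 to the middle of the image (1.5 here); they differ only in how far the ends of the range reach.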
Note that align_corners is not only used here.
Computing the coordinates is mostly about handling padding_mode. The main part of the code to look at is the following:
scalar_t reflect_coordinates(scalar_t in, int twice_low, int twice_high) {
  ...
  scalar_t min = static_cast<scalar_t>(twice_low) / 2;
  scalar_t span = static_cast<scalar_t>(twice_high - twice_low) / 2;
  in = ::fabs(in - min);
  scalar_t extra = ::fmod(in, span);
  int flips = static_cast<int>(::floor(in / span));
  if (flips % 2 == 0) {  // return statements slightly modified, because I find this clearer
    return min + extra;
  } else {
    return min + (span - extra);
  }
}
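A Python port of the reflection logic above, for trying concrete numbers. Two things here are assumptions rather than part of the excerpt: the guard for twice_low == twice_high, which stands in for the elided lines and avoids a division by zero, and the call with (0, 2 * (size - 1)), which reflects about the centers of the first and last pixels as in the align_corners=True case.

```python
import math


def reflect_coordinates(coord, twice_low, twice_high):
    """Reflect coord back into [twice_low / 2, twice_high / 2]."""
    if twice_low == twice_high:
        return 0.0  # assumed guard for an empty span
    low = twice_low / 2
    span = (twice_high - twice_low) / 2
    coord = abs(coord - low)
    extra = math.fmod(coord, span)       # distance travelled in the last pass
    flips = math.floor(coord / span)     # number of boundary bounces
    if flips % 2 == 0:
        return low + extra               # moving away from the low bound
    return low + (span - extra)          # moving back toward the low bound


size = 4
# Valid range is [0, 3]; -1 mirrors to 1, 4 mirrors to 2, 7 bounces twice to 1.
reflected = [reflect_coordinates(c, 0, 2 * (size - 1)) for c in (-1.0, 2.0, 4.0, 7.0)]
print(reflected)
```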
3、CUDA implementation
The official implementation of the CUDA kernel is in the PyTorch source.
I find the CUDA version clearer than the code above. The difference is that there is no loop, because a CUDA kernel operates on a single position.
4、Points to note
grid holds normalized coordinate values, not offsets. The difference is that a coordinate value goes directly through unnormalize to give the target coordinate, whereas an offset must first be added to the current coordinate to reach the target coordinate.
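As an illustration of this distinction, the sketch below converts a per-pixel offset field (for example, optical flow in pixels) into the normalized coordinates that grid expects. It is plain Python and assumes align_corners=True, so pixel index i maps to 2 * i / (size - 1) - 1, the inverse of the unnormalize formula above. It also assumes the width and height are both greater than 1. The function name is hypothetical.

```python
def offsets_to_normalized_grid(flow):
    """flow[h][w] = (dx, dy) in pixels -> grid[h][w] = (x, y) in [-1, 1]."""
    height, width = len(flow), len(flow[0])
    grid = []
    for h, flow_row in enumerate(flow):
        grid_row = []
        for w, (dx, dy) in enumerate(flow_row):
            # Offset + current coordinate = absolute target coordinate.
            target_x = w + dx
            target_y = h + dy
            # Normalize with the align_corners=True convention.
            grid_row.append((2.0 * target_x / (width - 1) - 1.0,
                             2.0 * target_y / (height - 1) - 1.0))
        grid.append(grid_row)
    return grid


# A zero offset field should give the identity grid.
zero_flow = [[(0.0, 0.0)] * 3 for _ in range(3)]
identity_grid = offsets_to_normalized_grid(zero_flow)
print(identity_grid[0][0], identity_grid[1][1], identity_grid[2][2])
```

A zero offset yields corner values of -1 and 1 and a center of 0, which is a handy sanity check before feeding a flow-derived grid to grid_sample.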