
Detailed explanation of the grid_sample function in frame interpolation

2022-08-07 12:10:00 【It's Twilight】

I have used this function in nearly everything, from my earlier VSR work to my later MEMC work. Much VSR work later dropped the warp operation, so I never studied it in depth. For MEMC, however, it is indispensable; the alternative is to generate frames end-to-end with a very large network. Original article: https://blog.csdn.net/longshaonihaoa/article/details/125964061

Articles in the MEMC series:
Motion Estimation and Motion Compensation (MEMC): an introductory summary
Paper list for deep-learning MEMC frame interpolation
Optical flow estimation: cost volume explained
Frame interpolation: grid_sample explained

1. Basic explanation of grid_sample

Official documentation:
https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html

Function prototype:

torch.nn.functional.grid_sample(input, grid, mode='bilinear', padding_mode='zeros', align_corners=None)

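Parameters:
The function has two inputs and three optional parameters.
input: the input, i.e. the source image. Shape [B, C, H, W] (C = 3 for an RGB image).
grid: the mapping table. Shape [B, H, W, 2], with values normalized to [-1, 1]. H and W here are the output size, which may differ from the input size.
mode: interpolation mode, either 'bilinear' or 'nearest' ('bicubic' is also supported for 4-D input).
padding_mode: how out-of-bounds locations are filled: 'reflection', 'border', or 'zeros'.
align_corners: alignment mode, i.e. whether the corner pixels are aligned.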

What the function does:
First, distinguish between a coordinate and a value. In an image, a coordinate refers to a location; for example, (2, 3) is the position at row 2, column 3 of the image. A value is the pixel value at that position.
For grid, each coordinate holds two values, and together they form the mapped coordinate. This is why the last dimension of grid is 2: grid[..., 0] is X (the width direction) and grid[..., 1] is Y (the height direction). The X and Y values are normalized to [-1, 1]; keep this in mind when using the function, since the internal implementation maps them back to the original size. The following example uses non-normalized grid values for illustration. (Why normalize? At first I thought it was unnecessary, but I recently noticed that computer graphics uses a similar normalization, so presumably the reasoning is the same.) When producing the output image, suppose we need to fill coordinate (2, 3). Look up the value of grid at coordinate (2, 3), say it is (3, 3); then the value of the input image at (3, 3) is assigned to the output at (2, 3). In other words, grid tells each output pixel where to read from in the input, not where an input pixel should be written to.
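The lookup described above can be sketched in plain Python. This is only an illustration under simplifying assumptions: the grid holds integer pixel coordinates rather than normalized values, no interpolation is done, and the helper name `sample_with_pixel_grid` is made up for this sketch and is not part of PyTorch.

```python
def sample_with_pixel_grid(image, grid):
    """Backward sampling: output[h][w] = image[y][x], where (x, y) = grid[h][w].

    image: list of rows (H x W) of pixel values.
    grid:  H_out x W_out list of (x, y) integer pixel coordinates into `image`
           (x is the column, y is the row). Values are NOT normalized here.
    Out-of-range coordinates yield 0, mimicking padding_mode='zeros'.
    """
    height, width = len(image), len(image[0])
    output = []
    for grid_row in grid:
        out_row = []
        for x, y in grid_row:
            inside = 0 <= x < width and 0 <= y < height
            out_row.append(image[y][x] if inside else 0)
        output.append(out_row)
    return output


image = [[10, 11, 12],
         [20, 21, 22]]
# Every output pixel reads from the column to its right (x + 1, same y).
shift_grid = [[(w + 1, h) for w in range(3)] for h in range(2)]
shifted = sample_with_pixel_grid(image, shift_grid)
print(shifted)
```

The content moves one pixel to the left, and the last column is filled with zeros because its source lies outside the image. Each output pixel pulls its value from the input, which is why this is usually called backward warping.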

Parameter details:
padding_mode: how to pick a value when a grid value falls outside the width and height bounds.
reflection: use the value at the point mirrored about the boundary, reflecting repeatedly until the coordinate falls within bounds.
border: use the value at the boundary.
zeros: use 0.

align_corners: controls how normalized coordinates map to pixel positions. If True, -1 and 1 refer to the centers of the corner pixels; if False, they refer to the outer edges of the corner pixels.
Both parameters are described in more detail in the code below.

2. ATen implementation

The complete code can be found in the official implementation.

The basic logic is as follows:

// Loop over the output pixel by pixel
for (const auto h : c10::irange(out_H)) {
  for (const auto w : c10::irange(out_W)) {
    ...
    // Convert the grid values to source coordinates; this function is discussed next
    scalar_t ix = grid_sampler_compute_source_index(x, inp_W, padding_mode, align_corners);
    scalar_t iy = grid_sampler_compute_source_index(y, inp_H, padding_mode, align_corners);
    if (interpolation_mode == GridSamplerInterpolation::Bilinear) {
      // Bilinear interpolation
      ...
    } else if (interpolation_mode == GridSamplerInterpolation::Nearest) {
      // Nearest-neighbor interpolation
      int64_t ix_nearest = static_cast<int64_t>(std::nearbyint(ix));
      int64_t iy_nearest = static_cast<int64_t>(std::nearbyint(iy));
      ...
    }
  }
}
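The bilinear branch is elided in the excerpt above. As a rough sketch of what bilinear sampling at a fractional source coordinate (ix, iy) involves, here is a plain-Python version. It assumes zeros padding and a single-channel image stored as a list of rows; the name `bilinear_sample` is hypothetical and this is not the ATen code itself.

```python
import math


def bilinear_sample(image, ix, iy):
    """Sample `image` (H x W list of rows) at fractional pixel coords (ix, iy).

    Neighbors that fall outside the image contribute 0 (zeros padding).
    """
    height, width = len(image), len(image[0])

    def pixel(x, y):
        if 0 <= x < width and 0 <= y < height:
            return image[y][x]
        return 0.0

    # Integer corners surrounding the sampling point.
    x0, y0 = math.floor(ix), math.floor(iy)
    x1, y1 = x0 + 1, y0 + 1
    # Weights: the closer a corner is, the larger its weight.
    wx1, wy1 = ix - x0, iy - y0
    wx0, wy0 = 1.0 - wx1, 1.0 - wy1
    return (pixel(x0, y0) * wx0 * wy0 + pixel(x1, y0) * wx1 * wy0
            + pixel(x0, y1) * wx0 * wy1 + pixel(x1, y1) * wx1 * wy1)


image = [[0.0, 10.0],
         [20.0, 30.0]]
center = bilinear_sample(image, 0.5, 0.5)  # equal mix of all four pixels
edge = bilinear_sample(image, 1.5, 0.0)    # half pixel (1, 0), half zero padding
print(center, edge)
```

The second call shows why zeros padding darkens the image border: half of the weight falls on a location outside the image and contributes nothing.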

The key to understanding the function is grid_sampler_compute_source_index, whose code can be found in the official source.

As shown below, it calls two functions: one unnormalizes the coordinate, and the other computes the final coordinate.

scalar_t grid_sampler_compute_source_index(...) {
    
  coord = grid_sampler_unnormalize(coord, size, align_corners);
  coord = compute_coordinates(coord, size, padding_mode, align_corners);
  return coord;
}

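The implementation of unnormalize is shown below. It behaves differently depending on align_corners. When align_corners is True, the original [-1, 1] is mapped to [0, size - 1]; when it is False, [-1, 1] is mapped to [-0.5, size - 0.5]. The code is as follows: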

scalar_t grid_sampler_unnormalize(scalar_t coord, int size, bool align_corners) {
  if (align_corners) {
    // unnormalize coord from [-1, 1] to [0, size - 1]
    return ((coord + 1.f) / 2) * (size - 1);
  } else {
    // unnormalize coord from [-1, 1] to [-0.5, size - 0.5]
    return ((coord + 1.f) * size - 1) / 2;
  }
}
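To make the two mappings concrete, here is a direct Python port of the function above, evaluated for a hypothetical width of 4 pixels. The port and the variable names are mine; only the two formulas come from the ATen code.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a pixel coordinate."""
    if align_corners:
        # [-1, 1] -> [0, size - 1]: -1 and 1 hit the corner pixel centers.
        return (coord + 1.0) / 2.0 * (size - 1)
    # [-1, 1] -> [-0.5, size - 0.5]: -1 and 1 hit the outer pixel edges.
    return ((coord + 1.0) * size - 1.0) / 2.0


size = 4
corners_true = [unnormalize(c, size, True) for c in (-1.0, 0.0, 1.0)]
corners_false = [unnormalize(c, size, False) for c in (-1.0, 0.0, 1.0)]
print(corners_true)   # [0.0, 1.5, 3.0]
print(corners_false)  # [-0.5, 1.5, 3.5]
```

Both settings agree at the image center (1.5), and they differ most at the borders. With align_corners=False, a grid value of exactly -1 or 1 lands half a pixel outside the outermost pixel centers, so padding_mode already matters there.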

Note that align_corners is not used only here.
Computing the coordinates is mostly about handling padding_mode. The key part of the code is the following:

scalar_t reflect_coordinates(scalar_t in, int twice_low, int twice_high) {
  ...
  scalar_t min = static_cast<scalar_t>(twice_low) / 2;
  scalar_t span = static_cast<scalar_t>(twice_high - twice_low) / 2;
  in = ::fabs(in - min);
  scalar_t extra = ::fmod(in, span);
  int flips = static_cast<int>(::floor(in / span));
  if (flips % 2 == 0) {    // return statements slightly modified, because I find this form clearer
    return min + extra;
  } else {
    return min + (span - extra);
  }
}
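The reflection logic is easier to follow with numbers. Below is a Python port of reflect_coordinates together with a sketch of how padding_mode and align_corners select the reflection bounds. The `compute_coordinates` and `clip_coordinates` bodies are written from my reading of the ATen source and should be treated as an approximation, not a verbatim copy.

```python
import math


def clip_coordinates(coord, size):
    """Clamp a pixel coordinate to [0, size - 1]."""
    return min(float(size - 1), max(coord, 0.0))


def reflect_coordinates(coord, twice_low, twice_high):
    """Reflect `coord` about the bounds low = twice_low / 2, high = twice_high / 2."""
    if twice_low == twice_high:
        return 0.0
    low = twice_low / 2.0
    span = (twice_high - twice_low) / 2.0
    coord = abs(coord - low)
    extra = math.fmod(coord, span)     # distance travelled since the last bounce
    flips = math.floor(coord / span)   # number of bounces off a boundary
    return low + extra if flips % 2 == 0 else low + (span - extra)


def compute_coordinates(coord, size, padding_mode, align_corners):
    if padding_mode == "border":
        return clip_coordinates(coord, size)
    if padding_mode == "reflection":
        if align_corners:
            # Reflect about the corner pixel centers: [0, size - 1].
            coord = reflect_coordinates(coord, 0, 2 * (size - 1))
        else:
            # Reflect about the outer pixel edges: [-0.5, size - 0.5].
            coord = reflect_coordinates(coord, -1, 2 * size - 1)
        return clip_coordinates(coord, size)
    return coord  # 'zeros': leave as-is; out-of-range reads are zeroed later


print(compute_coordinates(-1.2, 4, "border", True))      # 0.0
print(compute_coordinates(4.0, 4, "reflection", True))   # 2.0
print(compute_coordinates(4.0, 4, "reflection", False))  # 3.0
```

For a width of 4, the coordinate 4.0 reflects to 2.0 when the mirror sits at the last pixel center (3.0), and to 3.0 when it sits at the outer edge (3.5). This is the second place where align_corners changes the result.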

3. CUDA implementation

The official CUDA kernel implementation is available in the PyTorch source.
I find the CUDA version clearer than the code above. The difference is that it has no loop, because each invocation of the CUDA kernel handles a single position.

4. Notes

grid holds normalized coordinate values, not offsets. The difference is that a coordinate value goes straight through unnormalize to give the target coordinate, whereas an offset must first be added to the current coordinate to obtain the target coordinate.
