Spatial Pyramid Pooling (including source code)
2022-08-11 07:13:00 【KPer_Yang】
Table of Contents
1. Problems solved by Spatial Pyramid Pooling
2. Principle of Spatial Pyramid Pooling
3. Code implementation of Spatial Pyramid Pooling
Reference:
"Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition"
1. Problems solved by Spatial Pyramid Pooling
Spatial pyramid pooling is mainly used to handle input images of inconsistent resolution. Previously, images were scaled or cropped to a fixed size, which can easily lose image information. The difference between cropping or scaling and Spatial Pyramid Pooling is shown in Figure 1.1:
Figure 1.1 The difference between cropping, scaling and Spatial Pyramid Pooling
2. Principle of Spatial Pyramid Pooling
As shown in Figure 2.1, SPP-Net pools the feature maps with several pooling layers of different sizes, then flattens each result into a vector and concatenates the vectors. Pooling layers of 16*16, 4*4, and 1*1 are used in this article; when applying the method to your own task, you can change them according to factors such as the size of the feature map. When the height or width of the feature map is not an exact multiple of the grid size, a padding operation is required. The 16*16 and 4*4 levels pool by dividing the feature map into a grid, which differs from how an ordinary pooling layer operates.
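To make the grid-division idea concrete, here is a minimal pure-Python sketch of pyramid pooling on a single-channel feature map. It is not the article's implementation: the function names, the pyramid levels `[4, 2, 1]`, and the floor/ceil rule used for the bin boundaries are all assumptions chosen for illustration. The point it shows is that the output length depends only on the pyramid levels, not on the size of the input.

```python
def spp_single_channel(feature_map, levels):
    """Max-pool one 2-D feature map (a list of rows) over an n x n grid
    for each n in levels, and concatenate the results into one flat list."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Rows of bin i: floor(i*H/n) up to ceil((i+1)*H/n), end exclusive
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                # Columns of bin j, using the same rule along the width
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


def make_map(height, width):
    # Hypothetical feature map whose values increase row by row
    return [[r * width + c for c in range(width)] for r in range(height)]


levels = [4, 2, 1]  # assumed pyramid: 16 + 4 + 1 = 21 bins per channel
small = spp_single_channel(make_map(6, 8), levels)
large = spp_single_channel(make_map(13, 10), levels)
print(len(small), len(large))  # prints: 21 21
```

Both inputs produce a 21-element vector even though one is 6*8 and the other 13*10, which is what allows a fully connected layer to follow the pooling step.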
Figure 2.1 The principle diagram of Spatial Pyramid Pooling implementation
3. Code implementation of Spatial Pyramid Pooling
import math

import torch
import torch.nn as nn


def spatial_pyramid_pool(self, previous_conv, num_sample, previous_conv_size, out_pool_size):
    '''
    previous_conv: a tensor, the output of the previous convolution layer
    num_sample: an int, the number of images in the batch
    previous_conv_size: an int vector [height, width], the feature-map size of the previous convolution layer
    out_pool_size: an int vector of expected output sizes of the max pooling layers

    returns: a tensor of shape [num_sample x n], the concatenation of multi-level pooling
    '''
    for i in range(len(out_pool_size)):
        # Window (and stride) size so that out_pool_size[i] windows cover each axis
        h_wid = int(math.ceil(previous_conv_size[0] / out_pool_size[i]))
        w_wid = int(math.ceil(previous_conv_size[1] / out_pool_size[i]))
        # Padding must be an integer, so use floor division
        h_pad = (h_wid * out_pool_size[i] - previous_conv_size[0] + 1) // 2
        w_pad = (w_wid * out_pool_size[i] - previous_conv_size[1] + 1) // 2
        maxpool = nn.MaxPool2d((h_wid, w_wid), stride=(h_wid, w_wid), padding=(h_pad, w_pad))
        x = maxpool(previous_conv)
        if i == 0:
            spp = x.view(num_sample, -1)
        else:
            # Append this level's flattened output along the feature dimension
            spp = torch.cat((spp, x.view(num_sample, -1)), 1)
    return spp
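The window-size and padding arithmetic in the function above can be checked on its own, without PyTorch. The sketch below is only an illustration: `pool_params` is a hypothetical helper, the 13*13 feature map and the levels `(4, 2, 1)` are assumed values, the padding uses integer division so it can be passed to a pooling layer, and the output length follows the usual pooling formula floor((size + 2*pad - kernel) / stride) + 1 with stride equal to the kernel.

```python
import math


def pool_params(size, bins):
    """Kernel, padding and resulting output length along one axis."""
    kernel = math.ceil(size / bins)          # window size, also used as stride
    pad = (kernel * bins - size + 1) // 2    # integer padding on each side
    # Output length of a pooling layer whose stride equals its kernel size
    out = (size + 2 * pad - kernel) // kernel + 1
    return kernel, pad, out


for bins in (4, 2, 1):
    print(bins, pool_params(13, bins))
# prints:
# 4 (4, 2, 4)
# 2 (7, 1, 2)
# 1 (13, 0, 1)
```

For these levels the output length equals the requested number of bins. That does not hold for every combination: `pool_params(13, 16)` gives a kernel of 1 with a padding of 2 and an output length of 17. Since `nn.MaxPool2d` is documented to accept padding of at most half the kernel size, it is worth checking the parameters like this before choosing pyramid levels for a given feature-map size.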