Two Stage Detection
2022-04-23 21:00:00 【Top of the program】
Use selective search to obtain roughly 2000 RoI regions.
Deep ConvNet
The whole original image is passed through the convolutional network once.
RoI projection
The original RoI is mapped onto the feature map produced by the convolution. Given the size of the original image and the size of the feature map, the RoI is scaled by the corresponding ratio. Because the RoI dimensions are scaled, the results may be non-integer; they are then rounded down, which causes a pixel offset. This is the first quantization error.
RoI pooling layer
Converts feature maps of different sizes to the same size.
Divide the feature map into a grid, for example 7x7, and then apply max pooling to reduce the feature map to a fixed size. Alternatively, the feature map can be enlarged with a transposed convolution (transpose conv) or with bilinear interpolation (upsample). Here too the grid boundaries are rounded, which causes a pixel offset. This is the second quantization error.
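The two quantization steps described so far can be sketched in a few lines of plain Python. This is only a minimal illustration, not the implementation of any particular library: the function names, the stride of 16, the 2x2 output size, and the floor/ceil rule for bin edges are all assumptions chosen for the example.

```python
import math

def project_roi(roi, stride=16):
    """Map an RoI (x1, y1, x2, y2) from image pixels to feature-map cells.

    Rounding down here is the first quantization (stride=16 is an assumption).
    """
    return tuple(math.floor(v / stride) for v in roi)

def roi_max_pool(feature, roi_cells, out_size=2):
    """Max-pool an RoI given in integer feature-map cells into out_size x out_size bins.

    Bin edges are snapped to integers with floor/ceil: the second quantization.
    """
    x1, y1, x2, y2 = roi_cells
    w = max(x2 - x1 + 1, 1)
    h = max(y2 - y1 + 1, 1)
    pooled = []
    for by in range(out_size):
        ys = y1 + math.floor(by * h / out_size)
        ye = y1 + math.ceil((by + 1) * h / out_size)
        row = []
        for bx in range(out_size):
            xs = x1 + math.floor(bx * w / out_size)
            xe = x1 + math.ceil((bx + 1) * w / out_size)
            row.append(max(feature[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled

# Toy 6x6 feature map whose value encodes its position.
feature = [[y * 6 + x for x in range(6)] for y in range(6)]
roi = (20, 20, 90, 75)                  # RoI in image pixels
cells = project_roi(roi)                # (1, 1, 5, 4)
pooled = roi_max_pool(feature, cells)   # [[15, 17], [27, 29]]
# Shift of each coordinate once the rounded cell is mapped back to the image.
drift = tuple(v - c * 16 for v, c in zip(roi, cells))  # (4, 4, 10, 11)
```

In this toy case the rounded RoI, mapped back to the image, is off by up to 11 pixels, which is a large fraction of a small object.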
An offset on the feature map, once mapped back to the original image, becomes a larger offset of the final predicted bounding box on the original image. For this reason, algorithms that use RoI pooling usually do not perform very well on small targets.
Images produced by transposed convolution show a checkerboard effect: when the image is enlarged, it looks like a checkerboard. In general, people use a simple upsample to go from small to large.
FCs
Fully connected layers
softmax, bbox regressor
Classification and location regression
Because of the two quantization errors in RoI pooling, Kaiming He proposed RoI Align.
RoI Align works as follows. After the RoI is mapped from the original image onto the feature map at the mapping scale, the pixel coordinates on the feature map are kept as they are even if they have fractional parts; no rounding is done. Each grid cell (green in the figure) is then divided into NxN smaller cells (black in the figure; 2x2 in the original paper), and again no rounding is done even if the smaller cells have fractional coordinates. Instead, the value at the center point of each small cell is computed by bilinear interpolation, and max pooling over the four center points in a green cell gives the value that represents that cell in the reduced feature map.
RoI Align removes the position drift caused by the two quantization steps, but it introduces a hyperparameter N. Depending on the size of N, some pixels may not be used, and the pixels at the edge of the RoI (the red box in the figure) may not be used either.
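The RoI Align procedure described above can be sketched as follows. This is an illustrative pure-Python version under several assumptions, not the code from the paper or from any library: the names `bilinear` and `roi_align`, the stride of 16, the 2x2 output size, and the coordinate convention (feature values sit at integer coordinates) are all chosen for the example. Max pooling over the sample points follows the text above.

```python
import math

def bilinear(feature, x, y):
    """Bilinearly interpolate a 2D list at a fractional (x, y), clamped to the map."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = feature[y0][x0] * (1 - ax) + feature[y0][x1] * ax
    bottom = feature[y1][x0] * (1 - ax) + feature[y1][x1] * ax
    return top * (1 - ay) + bottom * ay

def roi_align(feature, roi, out_size=2, samples=2, stride=16):
    """Pool an image-space RoI without rounding any coordinate.

    Each output bin is split into samples x samples sub-cells (the N in the
    text); the center of each sub-cell is interpolated, then max-pooled.
    """
    x1, y1, x2, y2 = (v / stride for v in roi)   # fractional, not rounded
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    out = []
    for by in range(out_size):
        row = []
        for bx in range(out_size):
            values = []
            for sy in range(samples):
                for sx in range(samples):
                    px = x1 + (bx + (sx + 0.5) / samples) * bin_w
                    py = y1 + (by + (sy + 0.5) / samples) * bin_h
                    values.append(bilinear(feature, px, py))
            row.append(max(values))
        out.append(row)
    return out

# Toy feature map that is linear in x and y, so interpolation is exact.
feature = [[x + 10 * y for x in range(6)] for y in range(6)]
aligned = roi_align(feature, (20, 20, 84, 84))
```

Because only `samples * samples` points per bin are read, feature-map pixels that are far from every sample point do not contribute, which is the limitation noted above.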
Precise RoI Pooling was proposed in 2018:
Precise ROI Pooling [2018, IoU-Net]
First, the RoI (red box) is divided into a grid of bins (green boxes). Inside each green box, red sample points are obtained from the surrounding feature-map pixels (blue points) by bilinear interpolation. Finally, the red points are summed and divided by the number of points, which gives the feature value representing the green box.
This amounts to average pooling over the green box. The green box has fractional coordinates and covers fractional pixels, so red points have to be generated. The red points lie inside the green box, and there is a whole number of them. They must be evenly spaced: for example, if an edge of the box is 5.87 pixels long and 6 points are placed uniformly along it, the spacing between points is 5.87/5.
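The sampling picture above can be written down as a short sketch. It follows the description in this text (evenly spaced points, then an average); the IoU-Net paper itself formulates Precise RoI Pooling as an integral of the interpolated feature over the bin, so treat this as an approximation for illustration. The function names and the rule "number of points = edge length rounded up" are assumptions made to reproduce the 5.87 example.

```python
import math

def bilinear(feature, x, y):
    """Bilinearly interpolate a 2D list at a fractional (x, y), clamped to the map."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = feature[y0][x0] * (1 - ax) + feature[y0][x1] * ax
    bottom = feature[y1][x0] * (1 - ax) + feature[y1][x1] * ax
    return top * (1 - ay) + bottom * ay

def even_points(start, length):
    """Evenly spaced points covering [start, start + length], endpoints included.

    Assumed rule: ceil(length) points, so 5.87 -> 6 points spaced 5.87 / 5 apart.
    """
    n = max(math.ceil(length), 2)
    step = length / (n - 1)
    return [start + i * step for i in range(n)]

def bin_average(feature, x1, y1, x2, y2):
    """Average the interpolated values over an evenly spaced grid inside one bin."""
    xs = even_points(x1, x2 - x1)
    ys = even_points(y1, y2 - y1)
    total = sum(bilinear(feature, x, y) for y in ys for x in xs)
    return total / (len(xs) * len(ys))

points = even_points(0.0, 5.87)            # 6 points, spacing 5.87 / 5
feature = [[x + 10 * y for x in range(8)] for y in range(8)]
value = bin_average(feature, 0.5, 0.5, 6.37, 3.2)
```

Unlike the fixed NxN sampling in RoI Align, the number of points here grows with the bin size, so larger bins read more of the feature map.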

Faster R-CNN removes selective search and replaces it with a Region Proposal Network (RPN), whose main job is to generate region proposals.
It can be simplified into the following parts:
- Backbone
- RPN: mainly generates region proposals and needs to be trained. Introducing the RPN makes Fast R-CNN a truly end-to-end network.
- Fast R-CNN: RoI + classification, regression
There is no standard answer for an algorithm, and one should not get tied up in the details; what matters is the main idea of the algorithm.
The steps of learning an algorithm:
- Start from the descriptive discussion in words
- Turn the text description into mathematical language
- Turn the mathematics into code
A network is made of building blocks, and simple functional programming is enough to switch between different blocks.
The main components and structure are as follows:
Backbone: the feature extraction network. This part can use basic networks such as VGG, ResNet, DenseNet, or U-Net. It is the basic component.
neck/link: in essence, a 1x1 convolution kernel can be used here, and it can be replaced by an Inception module, a bottleneck module, and so on.
head: the functional head, usually fully connected layers, convolution operations, and so on.
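The building-block idea can be illustrated with plain function composition. Everything in this sketch is hypothetical: the stages are stand-ins that only record their own name instead of running a real network, and `compose`, `make_stage`, and `build_detector` are names invented for the example.

```python
from functools import reduce

def compose(*stages):
    """Chain stages left to right: compose(f, g, h)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)

def make_stage(name):
    """Hypothetical stand-in for a real layer: appends its name to a trace."""
    return lambda trace: trace + [name]

# Registries of interchangeable blocks (names follow the list above).
BACKBONES = {"vgg": make_stage("VGG"), "resnet": make_stage("ResNet")}
NECKS = {"conv1x1": make_stage("1x1 conv"), "bottleneck": make_stage("bottleneck")}
HEADS = {"fc": make_stage("fc head"), "conv": make_stage("conv head")}

def build_detector(backbone, neck, head):
    """Pick one block per slot and chain them into a single callable."""
    return compose(BACKBONES[backbone], NECKS[neck], HEADS[head])

model_a = build_detector("resnet", "conv1x1", "fc")
model_b = build_detector("vgg", "bottleneck", "conv")
trace_a = model_a([])   # ['ResNet', '1x1 conv', 'fc head']
trace_b = model_b([])   # ['VGG', 'bottleneck', 'conv head']
```

Swapping a backbone or a head is then a matter of changing one dictionary key, while the rest of the pipeline stays the same.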
Backbone
RPN
The main purpose is to generate region proposals.
- The so-called two stages come from RPN + bbox regression
- This is also why two-stage detection is said to perform better than one-stage detection (note that this conclusion is wrong)
The RPN structure is as follows: first a 3x3 convolution, then two separate 1x1 convolutions. One 1x1 convolution changes the number of channels to 18 and forms one output, used mainly for classification; the other changes the number of channels to 36 and is used mainly for bbox prediction.
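The channel counts 18 and 36 can be checked with simple shape arithmetic. The sketch below assumes 9 anchors per feature-map location (2 scores and 4 box offsets per anchor, as in the Faster R-CNN paper), a 512-channel shared layer, and a 3x3 convolution with padding 1 so that the spatial size is preserved; the function name and the example feature-map size are hypothetical.

```python
def rpn_head_shapes(feat_h, feat_w, mid_channels=512, num_anchors=9):
    """Return the (channels, height, width) shapes of the RPN head outputs.

    Assumptions: 3x3 conv with padding 1 keeps H and W; 1x1 convs only
    change the channel count; mid_channels=512 and num_anchors=9 are
    illustrative defaults.
    """
    shared = (mid_channels, feat_h, feat_w)       # after the 3x3 convolution
    cls = (2 * num_anchors, feat_h, feat_w)       # object / not-object per anchor
    bbox = (4 * num_anchors, feat_h, feat_w)      # 4 box offsets per anchor
    return {"shared": shared, "cls": cls, "bbox": bbox}

shapes = rpn_head_shapes(38, 50)
num_candidates = 9 * 38 * 50   # one candidate box per anchor per location
```

With these assumptions the classification branch has 2 x 9 = 18 channels and the regression branch has 4 x 9 = 36 channels, matching the numbers in the text.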
Copyright notice
This article was written by [Top of the program]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/111/202204210545091126.html
