Deep learning object detection
2022-04-23 05:30:00 【Mikawa】
Paper list from 2014 to 2020

Milestones


Components of an object detector
- Input: Image, Patches, Image Pyramid
- Backbones: VGG16, ResNet-50, SpineNet, EfficientNet-B0/B7, CSPResNeXt50, CSPDarknet53
- Neck:
  - Additional blocks: SPP, ASPP, RFB, SAM
  - Path-aggregation blocks: FPN, PAN, NAS-FPN, Fully-connected FPN, BiFPN, ASFF, SFAM
- Heads:
  - Dense Prediction (one-stage):
    - RPN, SSD, YOLO, RetinaNet (anchor based)
    - CornerNet, CenterNet, MatrixNet, FCOS (anchor free)
  - Sparse Prediction (two-stage):
    - Faster R-CNN, R-FCN, Mask R-CNN (anchor based)
    - RepPoints (anchor free)
Detection methods category

Object detection steps
One-Stage
- Extract features over the whole image, then classify the objects and localize their bounding boxes in a single pass.
Two-Stage
- Generate category-independent region proposals and extract a feature vector from each proposal.
- Classify the objects and refine the bounding-box predictions, then filter overlapping boxes with NMS.
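The NMS step mentioned above can be sketched in a few lines. This is a minimal, illustrative version of greedy non-maximum suppression in plain Python, not the implementation used by any of the listed detectors. The box layout `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the default threshold of 0.5 are assumptions made for this example.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is suppressed
```

The quadratic loop is fine for a handful of boxes; production code usually relies on a vectorized or GPU implementation instead.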
Small object detection tricks
- Framework for small object detection
  - Multi-scale Feature Learning
- Enhance the Receptive Fields (visual attention mechanisms)
- Data Augmentation
  - GAN-based Detection
  - Flipping, cropping, rotating, scaling
- Training Strategy
  - Unsupervised object detection
  - Weakly Supervised Object Detection
  - Multi-Scale Training/Val/Test
  - GPU acceleration
- Context-based Detection
  - Local context
  - Global context
  - Context interactive
- Neural Architecture Search
  - Stacking more pyramid networks
  - Adding feature dimension
  - Adopting high capacity architecture
- Efficient post-processing methods
  - Non maximum suppression (NMS)
  - Soft-NMS
- Deformable convolutional networks
- Multi-task joint learning and optimization
  - Object detection
  - Semantic segmentation
  - Instance segmentation
  - Edge detection
  - Highlight detection
- Establish small object datasets
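Of the post-processing methods in the list above, Soft-NMS is easy to illustrate: instead of discarding a box that overlaps a higher-scoring one, it decays that box's score. The sketch below uses the Gaussian decay `s_i = s_i * exp(-IoU^2 / sigma)`. The function names, the `(x1, y1, x2, y2)` box layout, and the values `sigma=0.5` and `score_threshold=0.001` are assumptions chosen for this example.

```python
import math
from typing import List, Sequence, Tuple

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(
    boxes: List[Box],
    scores: List[float],
    sigma: float = 0.5,
    score_threshold: float = 0.001,
) -> List[Tuple[int, float]]:
    """Gaussian Soft-NMS: return (index, rescored confidence) in selection order."""
    scores = list(scores)  # work on a copy; the caller's list is left untouched
    remaining = [i for i in range(len(boxes)) if scores[i] >= score_threshold]
    kept: List[Tuple[int, float]] = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        # Decay the scores of the remaining boxes by their overlap with `best`.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
        # Drop boxes whose score has decayed below the threshold.
        remaining = [i for i in remaining if scores[i] >= score_threshold]
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
result = soft_nms(boxes, [0.9, 0.8, 0.7])
```

In this example the overlapping second box is not removed. Its score is reduced, so it is selected last rather than second.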
Performance table
FPS (speed) depends on the hardware (e.g. CPU, GPU, RAM), so the numbers reported in different papers are hard to compare fairly. A fair comparison would require benchmarking every model on hardware with equivalent specifications, which is difficult and time-consuming.
| Detector | COCO (mAP@IoU=0.5:0.95) | Published In |
|---|---|---|
| R-CNN | - | CVPR’14 |
| Fast R-CNN | 19.7 | ICCV’15 |
| Faster R-CNN | 21.9 | NIPS’15 |
| YOLO v1 | - | CVPR’16 |
| SSD | 31.2 | ECCV’16 |
| R-FCN | 29.9 | NIPS’16 |
| FPN | 36.2 | CVPR’17 |
| YOLO v2 | - | CVPR’17 |
| RetinaNet | 39.1 | ICCV’17 |
| Mask R-CNN | 39.8 | ICCV’17 |
| Soft-NMS | 40.9 | ICCV’17 |
| YOLO v3 | 33.0 | arXiv’18 |
| RefineDet | 41.8 | CVPR’18 |
| Cascade R-CNN | 42.8 | CVPR’18 |
| RFBNet | - | ECCV’18 |
| Softer-NMS | - | arXiv’18 |
| SNIPER | 43.5 | NIPS’18 |
| M2Det | 44.2 | AAAI’19 |
| Libra R-CNN | 43.0 | CVPR’19 |
| FSAF | 44.6 | CVPR’19 |
| ExtremeNet | 43.7 | CVPR’19 |
| CenterNet | 45.1 | ICCV’19 |
| FreeAnchor | 44.8 | NeurIPS’19 |
| CBNet | 53.3 | AAAI’20 |
| YOLOv4 | - | arXiv’20 |
| ATSS | 50.7 | CVPR’20 |
| Hit-Detector | 41.4 | CVPR’20 |
| DetectoRS | 54.7 | arXiv’20 |
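Because FPS depends on hardware, as noted above the table, one practical approach is to time every model with the same small harness on the same machine. The sketch below is one way to do that. The name `measure_fps`, the `infer` callable, and the warm-up and run counts are assumptions for this example. On a GPU, asynchronous execution typically requires synchronizing the device before reading the clock, which this CPU-only sketch does not do.

```python
import time
from typing import Any, Callable


def measure_fps(
    infer: Callable[[Any], Any],
    sample: Any,
    warmup: int = 5,
    runs: int = 50,
) -> float:
    """Return the average number of inferences per second for `infer(sample)`."""
    # Warm-up iterations are excluded so one-off costs (caches, lazy
    # initialization) do not distort the measurement.
    for _ in range(warmup):
        infer(sample)
    start = time.perf_counter()
    for _ in range(runs):
        infer(sample)
    elapsed = time.perf_counter() - start
    return runs / elapsed if elapsed > 0 else float("inf")


def dummy_detector(image: list) -> int:
    """Stand-in for a real model (hypothetical): sums the pixel values."""
    return sum(image)


fps = measure_fps(dummy_detector, list(range(1000)), warmup=2, runs=20)
```

Reporting the hardware, input resolution, and batch size next to the measured value keeps the number interpretable.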
Performance on MS COCO



MS COCO detection evaluation metrics
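The metric used in the performance table, mAP@IoU=0.5:0.95, averages the average precision (AP) over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows that averaging for a single class on a single image, with 101-point interpolated AP. It is a simplified illustration and not the official COCO evaluation code: it ignores crowd annotations, object-size buckets, per-image detection limits, and the averaging over categories. All function names are assumptions for this example.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(dets: List[Tuple[Box, float]], gts: List[Box], thr: float) -> float:
    """Simplified AP at one IoU threshold; `dets` holds (box, score) pairs."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits: List[bool] = []
    for box, _ in sorted(dets, key=lambda d: d[1], reverse=True):
        # Match each detection to the unmatched ground truth it overlaps most.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            overlap = iou(box, gt)
            if not matched[j] and overlap > best_iou:
                best_iou, best_j = overlap, j
        is_hit = best_j >= 0 and best_iou >= thr
        if is_hit:
            matched[best_j] = True
        hits.append(is_hit)
    precisions, recalls, tp = [], [], 0
    for k, is_hit in enumerate(hits, start=1):
        tp += int(is_hit)
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing from left to right (interpolation envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sample the precision at 101 evenly spaced recall levels.
    total = 0.0
    for n in range(101):
        level = n / 100
        total += next((p for p, r in zip(precisions, recalls) if r >= level), 0.0)
    return total / 101


def coco_style_map(dets: List[Tuple[Box, float]], gts: List[Box]) -> float:
    """Mean of AP over the IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [0.5 + 0.05 * i for i in range(10)]
    return sum(average_precision(dets, gts, t) for t in thresholds) / len(thresholds)


perfect = coco_style_map([((0, 0, 10, 10), 0.9)], [(0, 0, 10, 10)])
shifted = coco_style_map([((1, 1, 11, 11), 0.9)], [(0, 0, 10, 10)])
```

The shifted detection has an IoU of about 0.68 with the ground truth, so it counts as correct only at the four loosest thresholds. This is why the metric rewards precise localization.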

2014
- [R-CNN] Rich feature hierarchies for accurate object detection and semantic segmentation | [CVPR’14] | [pdf] [official code - caffe] | CNN
2015
- [Fast R-CNN] Fast R-CNN | [ICCV’15] | [pdf] [official code - caffe] | RoI
- [Faster R-CNN, RPN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | [NIPS’15] | [pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] | Region Proposal Network (RPN), NMS
2016
- [YOLO v1] You Only Look Once: Unified, Real-Time Object Detection | [CVPR’16] | [pdf] [official code - c] | One-stage
- [SSD] SSD: Single Shot MultiBox Detector | [ECCV’16] | [pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] | Multi-scale feature map, VGG16, NMS
- [R-FCN] R-FCN: Object Detection via Region-based Fully Convolutional Networks | [NIPS’16] | [pdf] [official code - caffe] [unofficial code - caffe]
2017
- [FPN] Feature Pyramid Networks for Object Detection | [CVPR’17] | [pdf] [unofficial code - caffe] | Feature Pyramid Networks
- [YOLO v2] YOLO9000: Better, Faster, Stronger | [CVPR’17] | [pdf] [official code - c] [unofficial code - caffe] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch]
- [RetinaNet] Focal Loss for Dense Object Detection | [ICCV’17] | [pdf] [official code - keras] [unofficial code - pytorch] [unofficial code - mxnet] [unofficial code - tensorflow] | Focal Loss
- [Mask R-CNN] Mask R-CNN | [ICCV’17] | [pdf] [official code - caffe2] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch]
- [Soft-NMS] Improving Object Detection With One Line of Code | [ICCV’17] | [pdf] [official code - caffe] | Soft-NMS
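The focal loss that the RetinaNet entry is tagged with has a compact form, `FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)`, where `p_t` is the predicted probability of the true class. The sketch below evaluates it for a single binary prediction in plain Python. The function name, the clamping constant `eps`, and the defaults `gamma=2.0` and `alpha=0.25` (the commonly cited settings) are assumptions for this example, not taken from the listed code repositories.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 for positive and 0 for negative
    """
    eps = 1e-12  # clamp to avoid log(0); illustrative choice
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident and correct: very small loss
hard = focal_loss(0.1, 1)  # confident and wrong: much larger loss
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.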
2018
- [YOLO v3] YOLOv3: An Incremental Improvement | [arXiv’18] | [pdf] [official code - c] [unofficial code - pytorch] [unofficial code - pytorch] [unofficial code - keras] [unofficial code - tensorflow]
- [RefineDet] Single-Shot Refinement Neural Network for Object Detection | [CVPR’18] | [pdf] [official code - caffe] [unofficial code - chainer] [unofficial code - pytorch] | Combine one-stage and two-stage
- [Cascade R-CNN] Cascade R-CNN: Delving into High Quality Object Detection | [CVPR’18] | [pdf] [official code - caffe] | Training Strategy
- [RFBNet] Receptive Field Block Net for Accurate and Fast Object Detection | [ECCV’18] | [pdf] [official code - pytorch] | Enhance the Receptive Fields
- [Softer-NMS] Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection | [arXiv’18] | [pdf] | Soft-NMS
- [SNIPER] SNIPER: Efficient Multi-Scale Training | [NIPS’18] | [pdf] | Training Strategy
2019
- [M2Det] M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network | [AAAI’19] | [pdf] [official code - pytorch] | Multi-scale Feature Learning
- [Libra R-CNN] Libra R-CNN: Balanced Learning for Object Detection | [CVPR’19] | [pdf] | Training Strategy
- [FSAF] Feature Selective Anchor-Free Module for Single-Shot Object Detection | [CVPR’19] | [pdf] | Anchor-Free
- [ExtremeNet] Bottom-up Object Detection by Grouping Extreme and Center Points | [CVPR’19] | [pdf] [official code - pytorch] | Instance Segmentation
- [CenterNet] CenterNet: Keypoint Triplets for Object Detection | [ICCV’19] | [pdf] | Keypoint-based detector
- [FreeAnchor] FreeAnchor: Learning to Match Anchors for Visual Object Detection | [NeurIPS’19] | [pdf] | Anchor-Free
2020
- [CBNet] CBNet: A novel composite backbone network architecture for object detection | [AAAI’20] | [pdf] | Composite Backbone Network
- [YOLOv4] YOLOv4: Optimal Speed and Accuracy of Object Detection | [arXiv’20] | [pdf]
  - Input: Mosaic data augmentation, Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT)
  - BackBone: CSPDarknet53, Mish-activation, DropBlock regularization
  - Neck: SPP block, PAN (path-aggregation block)
  - Prediction: CIoU-loss, DIoU-NMS
- [ATSS] Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection | [CVPR’20] | [pdf] | Anchor-Based, Training Strategy
- [Hit-Detector] Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection | [CVPR’20] | [pdf] | Neural Architecture Search
- [DetectoRS] DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | [arXiv’20] | [pdf] | Recursive Feature Pyramid, Switchable Atrous Convolution, Instance Segmentation
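Two of the YOLOv4 components listed above have short closed forms. The Mish activation is `x * tanh(softplus(x))`. Distance-IoU (DIoU), the overlap measure behind DIoU-NMS, is `IoU - d^2 / c^2`, where `d` is the distance between the two box centers and `c` is the diagonal of the smallest box enclosing both. The sketch below computes each in plain Python. It is an illustration, not code from the YOLOv4 repository, and the function names and `(x1, y1, x2, y2)` box layout are assumptions.

```python
import math
from typing import Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)), using a stable softplus."""
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus the normalized squared center distance."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    acx, acy = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bcx, bcy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    center_dist_sq = (acx - bcx) ** 2 + (acy - bcy) ** 2
    # Squared diagonal of the smallest box that encloses both boxes.
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag_sq = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - center_dist_sq / diag_sq if diag_sq > 0 else iou


same = diou((0, 0, 10, 10), (0, 0, 10, 10))
apart = diou((0, 0, 10, 10), (20, 0, 30, 10))
```

Unlike plain IoU, DIoU stays informative for boxes that do not overlap: the further apart their centers are, the more negative the value becomes.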
Survey
- Recent advances in small object detection based on deep learning: A review [pdf]
- A Survey of Deep Learning-based Object Detection [pdf]
- Object Detection in 20 Years: A Survey [pdf]
- Recent Advances in Deep Learning for Object Detection [pdf]
Analyze Tools
Copyright notice
This article was written by [Mikawa]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204220543094190.html