Deep learning object detection
2022-04-23 05:30:00 【Mikawa】
Paper list from 2014 to 2020

Milestones


Components of an object detector
- Input: Image, Patches, Image Pyramid
- Backbones: VGG16, ResNet-50, SpineNet, EfficientNet-B0/B7, CSPResNeXt50, CSPDarknet53
- Neck:
  - Additional blocks: SPP, ASPP, RFB, SAM
  - Path-aggregation blocks: FPN, PAN, NAS-FPN, Fully-connected FPN, BiFPN, ASFF, SFAM
- Heads:
  - Dense Prediction (one-stage):
    - RPN, SSD, YOLO, RetinaNet (anchor based)
    - CornerNet, CenterNet, MatrixNet, FCOS (anchor free)
  - Sparse Prediction (two-stage):
    - Faster R-CNN, R-FCN, Mask R-CNN (anchor based)
    - RepPoints (anchor free)
Detection methods category

Object detection steps
One-Stage
- Extracts features over the whole image, then classifies the objects and localizes their bounding boxes.
Two-Stage
- Generates category-independent region proposals and extracts a feature vector from each region proposal.
- Classifies the objects and predicts precise bounding boxes (NMS).
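The NMS step named in the second stage can be sketched as a greedy loop over score-sorted boxes. This is a minimal illustration, not any particular detector's implementation: the `(x1, y1, x2, y2)` corner format, the helper names `iou` and `nms`, and the default threshold of 0.5 are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)          # highest-scoring box still in play
        keep.append(best)
        # Drop every remaining box that overlaps the kept one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

For example, `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` keeps indices `[0, 2]`: the second box overlaps the first with IoU of about 0.68 and is suppressed.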
Small object detection tricks
- Framework for small object detection
  - Multi-scale Feature Learning
- Enhance the Receptive Fields (visual attention mechanisms)
- Data Augmentation
  - GAN-based Detection
  - Flipping, cropping, rotating, scaling
- Training Strategy
  - Unsupervised object detection
  - Weakly Supervised Object Detection
  - Multi-Scale Training/Val/Test
  - GPU acceleration
- Context-based Detection
  - Local context
  - Global context
  - Context interactive
- Neural Architecture Search
  - Stacking more pyramid networks
  - Adding feature dimension
  - Adopting high capacity architecture
- Efficient post-processing methods
  - Non maximum suppression (NMS)
  - Soft-NMS
- Deformable convolutional networks
- Multi-task joint learning and optimization
  - Object detection
  - Semantic segmentation
  - Instance segmentation
  - Edge detection
  - Highlight detection
- Establish small object datasets
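Of the post-processing methods listed above, Soft-NMS differs from plain NMS in that overlapping boxes have their scores decayed rather than being discarded outright. The sketch below uses the Gaussian decay `score * exp(-IoU^2 / sigma)`; the `(x1, y1, x2, y2)` box format, the function names, and the default `sigma` and `score_threshold` values are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def soft_nms(boxes: Sequence[Box], scores: Sequence[float],
             sigma: float = 0.5,
             score_threshold: float = 0.001) -> List[Tuple[int, float]]:
    """Gaussian Soft-NMS: return (index, rescored confidence) pairs."""
    remaining: Dict[int, float] = {i: float(s) for i, s in enumerate(scores)}
    kept: List[Tuple[int, float]] = []
    while remaining:
        best = max(remaining, key=lambda i: remaining[i])
        best_score = remaining.pop(best)
        if best_score < score_threshold:
            break  # every remaining score is at most this one
        kept.append((best, best_score))
        # Decay, rather than delete, the boxes that overlap the selected one.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            remaining[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept
```

With the same three boxes as a typical NMS example, the heavily overlapping second box is not removed; it survives with a reduced score and is ranked last.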
Performance table
FPS (speed) depends on the hardware (e.g. CPU, GPU, RAM), so numbers reported in different papers are hard to compare fairly. A fair comparison would require measuring every model on hardware with equivalent specifications, which is difficult and time-consuming.
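As a rough illustration of measuring speed on one fixed machine, the sketch below times a callable after a few warm-up runs. The name `measure_fps`, its parameters, and the dummy workload are hypothetical; a real benchmark would call the detector's forward pass on a fixed input size.

```python
import time
from typing import Callable


def measure_fps(infer: Callable[[], object], warmup: int = 5, runs: int = 20) -> float:
    """Average frames per second of `infer` on the current machine."""
    # Warm-up runs are excluded so one-time costs (caches, lazy init) do not skew the result.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    # Note: with a GPU framework, asynchronous work must be synchronized
    # before reading the clock (e.g. torch.cuda.synchronize() in PyTorch);
    # that step is omitted in this CPU-only sketch.
    elapsed = time.perf_counter() - start
    return runs / elapsed


if __name__ == "__main__":
    # Stand-in workload that takes roughly 1 ms per "frame".
    print(f"{measure_fps(lambda: time.sleep(0.001)):.1f} FPS")
```

The printed number is only meaningful relative to other models timed the same way on the same hardware.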
| Detector | COCO (mAP@IoU=0.5:0.95) | Published In |
|---|---|---|
| R-CNN | - | CVPR’14 |
| Fast R-CNN | 19.7 | ICCV’15 |
| Faster R-CNN | 21.9 | NIPS’15 |
| YOLO v1 | - | CVPR’16 |
| SSD | 31.2 | ECCV’16 |
| R-FCN | 29.9 | NIPS’16 |
| FPN | 36.2 | CVPR’17 |
| YOLO v2 | - | CVPR’17 |
| RetinaNet | 39.1 | ICCV’17 |
| Mask R-CNN | 39.8 | ICCV’17 |
| Soft-NMS | 40.9 | ICCV’17 |
| YOLO v3 | 33.0 | arXiv’18 |
| RefineDet | 41.8 | CVPR’18 |
| Cascade R-CNN | 42.8 | CVPR’18 |
| RFBNet | - | ECCV’18 |
| Softer-NMS | - | arXiv’18 |
| SNIPER | 43.5 | NIPS’18 |
| M2Det | 44.2 | AAAI’19 |
| Libra R-CNN | 43.0 | CVPR’19 |
| FSAF | 44.6 | CVPR’19 |
| ExtremeNet | 43.7 | CVPR’19 |
| CenterNet | 45.1 | ICCV’19 |
| FreeAnchor | 44.8 | NeurIPS’19 |
| CBNet | 53.3 | AAAI’20 |
| YOLOv4 | - | arXiv’20 |
| ATSS | 50.7 | CVPR’20 |
| Hit-Detector | 41.4 | CVPR’20 |
| DetectoRS | 54.7 | arXiv’20 |
Performance on MS COCO



MS COCO detection evaluation metrics
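The performance table reports mAP@IoU=0.5:0.95, meaning AP is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only that threshold sweep, not the full COCO evaluation protocol; the `(x1, y1, x2, y2)` box format and all function names are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def coco_iou_thresholds() -> List[float]:
    """The ten thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * k, 2) for k in range(10)]


def matched_thresholds(pred: Box, gt: Box) -> List[float]:
    """Thresholds at which `pred` would count as a true positive for `gt`."""
    overlap = iou(pred, gt)
    return [t for t in coco_iou_thresholds() if overlap >= t]


def mean_over_thresholds(ap_per_threshold: List[float]) -> float:
    """Average of per-threshold AP values (the per-threshold APs are inputs here)."""
    return sum(ap_per_threshold) / len(ap_per_threshold)
```

A prediction with IoU of about 0.68 against its ground truth counts as correct at thresholds 0.50 through 0.65 and as a miss at 0.70 and above, which is why this metric rewards tight localization more than AP at IoU=0.5 alone.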

2014
- [R-CNN] Rich feature hierarchies for accurate object detection and semantic segmentation | [CVPR’14] | [pdf] [official code - caffe] | CNN
2015
- [Fast R-CNN] Fast R-CNN | [ICCV’15] | [pdf] [official code - caffe] | RoI
- [Faster R-CNN, RPN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | [NIPS’15] | [pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] | Region Proposal Network (RPN), NMS
2016
- [YOLO v1] You Only Look Once: Unified, Real-Time Object Detection | [CVPR’16] | [pdf] [official code - c] | One-stage
- [SSD] SSD: Single Shot MultiBox Detector | [ECCV’16] | [pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch] | Multi-scale feature map, VGG16, NMS
- [R-FCN] R-FCN: Object Detection via Region-based Fully Convolutional Networks | [NIPS’16] | [pdf] [official code - caffe] [unofficial code - caffe]
2017
- [FPN] Feature Pyramid Networks for Object Detection | [CVPR’17] | [pdf] [unofficial code - caffe] | Feature Pyramid Networks
- [YOLO v2] YOLO9000: Better, Faster, Stronger | [CVPR’17] | [pdf] [official code - c] [unofficial code - caffe] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch]
- [RetinaNet] Focal Loss for Dense Object Detection | [ICCV’17] | [pdf] [official code - keras] [unofficial code - pytorch] [unofficial code - mxnet] [unofficial code - tensorflow] | Focal Loss
- [Mask R-CNN] Mask R-CNN | [ICCV’17] | [pdf] [official code - caffe2] [unofficial code - tensorflow] [unofficial code - tensorflow] [unofficial code - pytorch]
- [Soft-NMS] Improving Object Detection With One Line of Code | [ICCV’17] | [pdf] [official code - caffe] | Soft-NMS
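The focal loss tagged on the RetinaNet entry above down-weights easy examples so that the many easy background locations of a dense detector do not dominate training. A minimal scalar sketch of the binary form `FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)` follows; the function name and the clamping constant are assumptions, and real implementations operate on tensors of logits rather than single probabilities.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, y: label (1 or 0).
    """
    p = min(max(p, 1e-7), 1.0 - 1e-7)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p               # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1))  # easy positive: small loss
    print(focal_loss(0.1, 1))  # hard positive: much larger loss
```

Setting `gamma=0` reduces the expression to alpha-weighted cross-entropy, which is a quick sanity check for any implementation.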
2018
- [YOLO v3] YOLOv3: An Incremental Improvement | [arXiv’18] | [pdf] [official code - c] [unofficial code - pytorch] [unofficial code - pytorch] [unofficial code - keras] [unofficial code - tensorflow]
- [RefineDet] Single-Shot Refinement Neural Network for Object Detection | [CVPR’18] | [pdf] [official code - caffe] [unofficial code - chainer] [unofficial code - pytorch] | Combine one-stage and two-stage
- [Cascade R-CNN] Cascade R-CNN: Delving into High Quality Object Detection | [CVPR’18] | [pdf] [official code - caffe] | Training Strategy
- [RFBNet] Receptive Field Block Net for Accurate and Fast Object Detection | [ECCV’18] | [pdf] [official code - pytorch] | Enhance the Receptive Fields
- [Softer-NMS] Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection | [arXiv’18] | [pdf] | Soft-NMS
- [SNIPER] SNIPER: Efficient Multi-Scale Training | [NIPS’18] | [pdf] | Training Strategy
2019
- [M2Det] M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network | [AAAI’19] | [pdf] [official code - pytorch] | Multi-scale Feature Learning
- [Libra R-CNN] Libra R-CNN: Balanced Learning for Object Detection | [CVPR’19] | [pdf] | Training Strategy
- [FSAF] Feature Selective Anchor-Free Module for Single-Shot Object Detection | [CVPR’19] | [pdf] | Anchor-Free
- [ExtremeNet] Bottom-up Object Detection by Grouping Extreme and Center Points | [CVPR’19] | [pdf] [official code - pytorch] | Instance Segmentation
- [CenterNet] CenterNet: Keypoint Triplets for Object Detection | [ICCV’19] | [pdf] | Keypoint-based detector
- [FreeAnchor] FreeAnchor: Learning to Match Anchors for Visual Object Detection | [NeurIPS’19] | [pdf] | Anchor-Free
2020
- [CBNet] CBNet: A novel composite backbone network architecture for object detection | [AAAI’20] | [pdf] | Composite Backbone Network
- [YOLOv4] YOLOv4: Optimal Speed and Accuracy of Object Detection | [arXiv’20] | [pdf]
  - Input: Mosaic data augmentation, Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT)
  - BackBone: CSPDarknet53, Mish-activation, DropBlock regularization
  - Neck: SPP block, PAN (path-aggregation block)
  - Prediction: CIoU-loss, DIoU-NMS
- [ATSS] Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection | [CVPR’20] | [pdf] | Anchor-Based, Training Strategy
- [Hit-Detector] Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection | [CVPR’20] | [pdf] | Neural Architecture Search
- [DetectoRS] DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | [arXiv’20] | [pdf] | Recursive Feature Pyramid, Switchable Atrous Convolution, Instance Segmentation
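The DIoU-NMS listed under the YOLOv4 prediction components replaces plain IoU in the suppression test with Distance-IoU, which also penalizes the distance between box centers. The sketch below computes only the DIoU value, `IoU - d^2 / c^2`, where `d` is the center distance and `c` the diagonal of the smallest enclosing box; the `(x1, y1, x2, y2)` box format and the function name are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def diou(a: Box, b: Box) -> float:
    """Distance-IoU of two axis-aligned boxes, in the range (-1, 1]."""
    # Plain IoU.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou_value = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    center_dist_sq = (((a[0] + a[2]) - (b[0] + b[2])) / 2.0) ** 2 \
        + (((a[1] + a[3]) - (b[1] + b[3])) / 2.0) ** 2

    # Squared diagonal of the smallest box enclosing both.
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag_sq = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    if diag_sq == 0:
        return iou_value  # degenerate boxes: no distance penalty defined

    return iou_value - center_dist_sq / diag_sq
```

Unlike IoU, the value stays informative for non-overlapping boxes: it becomes more negative as the centers move apart, so two boxes with the same IoU but distant centers are less likely to be treated as duplicates.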
Survey
- Recent advances in small object detection based on deep learning: A review [pdf]
- A Survey of Deep Learning-based Object Detection [pdf]
- Object Detection in 20 Years: A Survey [pdf]
- Recent Advances in Deep Learning for Object Detection [pdf]
Analyze Tools
Copyright notice
This article was written by [Mikawa]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204220543094190.html