FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Overview

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection. arXiv preprint arXiv:2111.10780.

This implementation is modified from mmdetection. We also refer to the code of ReDet, PIoU, and ProbIoU.

During implementation we found that doing the label assignment purely in Python incurs a huge memory overhead on NVIDIA devices, so we implemented the label assignment module proposed in the paper directly as a PyTorch CUDA extension. The extension currently only supports CUDA 10; it does not work correctly under CUDA 11. Using the CUDA extension improves memory utilization and removes a lot of unnecessary computation. We were also able to train FCOSR-M on a 2080 Ti with 4 images per GPU, which nearly fills the card's memory.
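For reference, PyTorch CUDA extensions of this kind are built with torch.utils.cpp_extension. The sketch below shows the general build pattern only; the module and source file names in it are placeholders, not this repository's actual layout.

```python
# setup.py sketch for a PyTorch CUDA extension (illustrative only; the
# module/source names are placeholders, not the files used in this repo).
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='label_assign_ext',
    ext_modules=[
        CUDAExtension(
            name='label_assign_ext',
            sources=[
                'src/label_assign.cpp',      # Python binding layer
                'src/label_assign_cuda.cu',  # CUDA kernels for label assignment
            ],
        ),
    ],
    cmdclass={'build_ext': BuildExtension},
)
```

Built this way (for example with pip install -v -e .), the kernels are compiled against the local CUDA toolkit, which is why the CUDA 10 requirement above matters.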

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We added a multi-process version of DOTA2COCO to the DOTA_devkit package; toggle USE_MULTI_PROCESS in prepare_dota.py to enable or disable it.
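As a rough sketch of what the multi-process path does (the helper function below is hypothetical; only the USE_MULTI_PROCESS switch and prepare_dota.py come from this repository), the per-image conversion work can be spread over a process pool:

```python
# Hypothetical sketch of a multi-process DOTA-to-COCO conversion.
# convert_one_image() stands in for the per-image parsing step; only the
# USE_MULTI_PROCESS flag mirrors the switch in prepare_dota.py.
from multiprocessing import Pool

USE_MULTI_PROCESS = True   # set False to fall back to a single process
NUM_WORKERS = 8

def convert_one_image(image_path):
    """Parse one DOTA annotation file and return its COCO-style records."""
    ...  # read the .txt label file, build image/annotation dicts

def convert_dataset(image_paths):
    if USE_MULTI_PROCESS:
        with Pool(NUM_WORKERS) as pool:
            results = pool.map(convert_one_image, image_paths)
    else:
        results = [convert_one_image(p) for p in image_paths]
    return results
```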

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

[Figure: speed vs. accuracy benchmark of rotated detectors on the DOTA 1.0 test set]

Details (test device: NVIDIA RTX 2080 Ti)

| Methods | Backbone | FPS | mAP (%) |
|---|---|---|---|
| ReDet | ReR50 | 8.8 | 76.25 |
| S2ANet | Mobilenet v2 | 18.9 | 67.46 |
| S2ANet | R50 | 14.4 | 74.14 |
| R3Det | R50 | 9.2 | 71.9 |
| Oriented-RCNN | Mobilenet v2 | 21.2 | 72.72 |
| Oriented-RCNN | R50 | 13.8 | 75.87 |
| Oriented-RCNN | R101 | 11.3 | 76.28 |
| RetinaNet-O | Mobilenet v2 | 22.4 | 67.95 |
| RetinaNet-O | R50 | 16.5 | 72.7 |
| RetinaNet-O | R101 | 13.3 | 73.7 |
| Faster-RCNN-O | Mobilenet v2 | 23 | 67.41 |
| Faster-RCNN-O | R50 | 14.4 | 72.29 |
| Faster-RCNN-O | R101 | 11.4 | 72.65 |
| FCOSR-S | Mobilenet v2 | 23.7 | 74.05 |
| FCOSR-M | Rx50 | 14.6 | 77.15 |
| FCOSR-L | Rx101 | 7.9 | 77.39 |

The extraction password for the Baidu Pan download links is ABCD.

FCOSR series DOTA 1.0 results. FPS measured on a 2080 Ti. Detail

| Model | Backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | Download |
|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 74.05 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 76.11 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 77.15 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 79.25 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 77.39 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 78.80 | model/cfg |

FCOSR series DOTA 1.5 results. FPS measured on a 2080 Ti. Detail

| Model | Backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | Download |
|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 66.37 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 73.14 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 68.74 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 73.79 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 69.96 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 75.41 | model/cfg |

FCOSR series HRSC2016 results. FPS measured on a 2080 Ti.

| Model | Backbone | Rot. | Sched. | Param. | Input | GFLOPs | FPS | AP50(07) | AP75(07) | AP50(12) | AP75(12) | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | ✓ | 40k iters | 7.29M | 800×800 | 61.57 | 35.3 | 90.08 | 76.75 | 92.67 | 75.73 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 40k iters | 31.37M | 800×800 | 127.87 | 26.9 | 90.15 | 78.58 | 94.84 | 81.38 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 40k iters | 89.61M | 800×800 | 271.75 | 15.1 | 90.14 | 77.98 | 95.74 | 80.94 | model/cfg |

Lightweight FCOSR test results on Jetson Xavier NX (DOTA 1.0, single-scale). Detail

| Model | Backbone | Head channels | Sched. | Param. | Size | Input | GFLOPs | FPS | mAP | onnx | TensorRT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FCOSR-lite | Mobilenet v2 | 256 | 3x | 6.9M | 51.63MB | 1024×1024 | 101.25 | 7.64 | 74.30 | onnx | trt |
| FCOSR-tiny | Mobilenet v2 | 128 | 3x | 3.52M | 23.2MB | 1024×1024 | 35.89 | 10.68 | 73.93 | onnx | trt |
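The onnx links above hold the exported models used for the Jetson tests. A quick way to sanity-check such an export on any machine is to run it once through onnxruntime; the file name, input name, and output layout below are assumptions and should be read from the actual model:

```python
# Minimal sketch: run an exported FCOSR ONNX model once on a dummy input.
# The 1024x1024 input size comes from the table above; the file name and
# tensor names are assumptions - query the session to see the real ones.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('fcosr_tiny.onnx', providers=['CPUExecutionProvider'])
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 1024, 1024).astype(np.float32)
outputs = sess.run(None, {input_name: dummy})
for out, meta in zip(outputs, sess.get_outputs()):
    print(meta.name, out.shape)
```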

Lightweight FCOSR test results on Jetson AGX Xavier (DOTA 1.0, single-scale).

Detection on part of the DOTA 1.0 dataset (whole-image mode). Code

| Name | Size | Patch size | Gap | Patches | Detected objects | Detection time (s) |
|---|---|---|---|---|---|---|
| P0031.png | 5343×3795 | 1024 | 200 | 35 | 1197 | 2.75 |
| P0051.png | 4672×5430 | 1024 | 200 | 42 | 309 | 2.38 |
| P0112.png | 6989×4516 | 1024 | 200 | 54 | 184 | 3.02 |
| P0137.png | 5276×4308 | 1024 | 200 | 35 | 66 | 1.95 |
| P1004.png | 7001×3907 | 1024 | 200 | 45 | 183 | 2.52 |
| P1125.png | 7582×4333 | 1024 | 200 | 54 | 28 | 2.95 |
| P1129.png | 4093×6529 | 1024 | 200 | 40 | 70 | 2.23 |
| P1146.png | 5231×4616 | 1024 | 200 | 42 | 64 | 2.29 |
| P1157.png | 7278×5286 | 1024 | 200 | 63 | 184 | 3.47 |
| P1378.png | 5445×4561 | 1024 | 200 | 42 | 83 | 2.32 |
| P1379.png | 4426×4182 | 1024 | 200 | 30 | 686 | 1.78 |
| P1393.png | 6072×6540 | 1024 | 200 | 64 | 893 | 3.63 |
| P1400.png | 6471×4479 | 1024 | 200 | 48 | 348 | 2.63 |
| P1402.png | 4112×4793 | 1024 | 200 | 30 | 293 | 1.68 |
| P1406.png | 6531×4182 | 1024 | 200 | 40 | 19 | 2.19 |
| P1415.png | 4894×4898 | 1024 | 200 | 36 | 190 | 1.99 |
| P1436.png | 5136×5156 | 1024 | 200 | 42 | 39 | 2.31 |
| P1448.png | 7242×5678 | 1024 | 200 | 63 | 51 | 3.41 |
| P1457.png | 5193×4658 | 1024 | 200 | 42 | 382 | 2.33 |
| P1461.png | 6661×6308 | 1024 | 200 | 64 | 27 | 3.45 |
| P1494.png | 4782×6677 | 1024 | 200 | 48 | 70 | 2.61 |
| P1500.png | 4769×4386 | 1024 | 200 | 36 | 92 | 1.96 |
| P1772.png | 5963×5553 | 1024 | 200 | 49 | 28 | 2.70 |
| P1774.png | 5352×4281 | 1024 | 200 | 35 | 291 | 1.95 |
| P1796.png | 5870×5822 | 1024 | 200 | 49 | 308 | 2.74 |
| P1870.png | 5942×6059 | 1024 | 200 | 56 | 135 | 3.04 |
| P2043.png | 4165×3438 | 1024 | 200 | 20 | 1479 | 1.49 |
| P2329.png | 7950×4334 | 1024 | 200 | 60 | 83 | 3.26 |
| P2641.png | 7574×5625 | 1024 | 200 | 63 | 269 | 3.41 |
| P2642.png | 7039×5551 | 1024 | 200 | 63 | 451 | 3.50 |
| P2643.png | 7568×5619 | 1024 | 200 | 63 | 249 | 3.40 |
| P2645.png | 4605×3442 | 1024 | 200 | 24 | 357 | 1.42 |
| P2762.png | 8074×4359 | 1024 | 200 | 60 | 127 | 3.23 |
| P2795.png | 4495×3981 | 1024 | 200 | 30 | 65 | 1.64 |
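For reference, whole-image mode splits each large image into overlapping patches before detection: with a patch size of 1024 and a gap (overlap) of 200, the window slides in steps of 1024 − 200 = 824 pixels and the last window in each direction is clamped to the image border, which reproduces the patch counts in the table above. A minimal sketch of that windowing (the function name is illustrative, not the DOTA_devkit API):

```python
# Illustrative sliding-window split used in whole-image mode.
# patch_size=1024 and gap=200 reproduce the patch counts in the table above
# (e.g. 5343x3795 -> 7 x 5 = 35 patches).
def split_windows(width, height, patch_size=1024, gap=200):
    stride = patch_size - gap  # 824-pixel step between window origins
    xs = list(range(0, max(width - patch_size, 0) + 1, stride))
    ys = list(range(0, max(height - patch_size, 0) + 1, stride))
    # Add a final window clamped to the right/bottom edge if not yet covered.
    if xs[-1] + patch_size < width:
        xs.append(width - patch_size)
    if ys[-1] + patch_size < height:
        ys.append(height - patch_size)
    return [(x, y, x + patch_size, y + patch_size) for y in ys for x in xs]

print(len(split_windows(5343, 3795)))  # 35, matching P0031.png
```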