This repo is customized for VisDrone.

Last update: Jul 17, 2022

Overview

Object Detection for VisDrone (object detection in UAV aerial images)

My environment

1. Windows 10 (Linux also works)
2. tensorflow >= 1.12.0
3. Python 3.6 (Anaconda)
4. cv2 (OpenCV)
5. ensemble-boxes (pip install ensemble-boxes)

Datasets(XML format for training set)

(1) The dataset is available at https://github.com/VisDrone/VisDrone-Dataset
(2) Download the XML annotations from Baidu Yun (extraction code: ia3f) or Google Drive, and set their path in ./core/config/cfgs.py
(3) Alternatively, generate the VisDrone XML files yourself with ./data/visdrone2xml.py after modifying the path settings in that script.
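For readers who want to see what such a conversion involves, here is a minimal sketch. It is not the repo's ./data/visdrone2xml.py. It assumes the standard VisDrone-DET annotation layout (bbox_left, bbox_top, bbox_width, bbox_height, score, object_category, truncation, occlusion per line) and a Pascal VOC style XML target. The function name `visdrone_txt_to_voc_xml` and the exact XML fields are illustrative assumptions, and the class names the repo expects may differ.

```python
import xml.etree.ElementTree as ET

# VisDrone-DET category ids. Ids 0 (ignored regions) and 11 (others) are
# skipped here; whether the repo keeps them is an assumption to verify.
VISDRONE_CLASSES = {
    1: "pedestrian", 2: "people", 3: "bicycle", 4: "car", 5: "van",
    6: "truck", 7: "tricycle", 8: "awning-tricycle", 9: "bus", 10: "motor",
}


def visdrone_txt_to_voc_xml(txt_content, image_name, width, height):
    """Convert the text of one VisDrone annotation file to a VOC-style XML string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image_name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"

    for line in txt_content.splitlines():
        # Some annotation lines end with a trailing comma, so drop empty fields.
        fields = [f for f in line.strip().split(",") if f != ""]
        if len(fields) < 8:
            continue
        left, top, w, h, _score, category, truncation, _occlusion = map(int, fields[:8])
        if category not in VISDRONE_CLASSES:
            continue
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = VISDRONE_CLASSES[category]
        ET.SubElement(obj, "truncated").text = str(truncation)
        ET.SubElement(obj, "difficult").text = "0"
        box = ET.SubElement(obj, "bndbox")
        # VisDrone stores left/top/width/height; VOC stores corner coordinates.
        ET.SubElement(box, "xmin").text = str(left)
        ET.SubElement(box, "ymin").text = str(top)
        ET.SubElement(box, "xmax").text = str(left + w)
        ET.SubElement(box, "ymax").text = str(top + h)

    return ET.tostring(root, encoding="unicode")
```

The image width and height are passed in explicitly so the sketch needs no image library; in practice they would be read from the matching JPEG.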

training-set format:

├── VisDrone2019-DET-train
│     ├── Annotation(xml format)
│     ├── JPEGImages
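
Before training, it can help to confirm that every image has a matching XML file. The sketch below is an optional sanity check, not part of the repo. It assumes the folder names shown above (`Annotation`, `JPEGImages`) and `.jpg` image files; the helper name `find_unpaired_files` is hypothetical.

```python
from pathlib import Path


def find_unpaired_files(dataset_root, image_dir="JPEGImages", xml_dir="Annotation"):
    """Return (images without XML, XML files without image) as sorted lists of file stems."""
    root = Path(dataset_root)
    image_stems = {p.stem for p in (root / image_dir).glob("*.jpg")}
    xml_stems = {p.stem for p in (root / xml_dir).glob("*.xml")}
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)
```

Calling `find_unpaired_files("VisDrone2019-DET-train")` should return two empty lists when the training set is laid out as described.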

Pretrained Models(ResNet50vd, 101vd)

Download the pretrained models from Baidu Yun (extraction code: krce) or Google Drive, then put them into ./data/pretrained_weights

Train

Modify the parameters in ./core/config/cfgs.py
python train_step.py

Eval

Modify the parameters in ./core/config/cfgs.py
python eval_visdrone.py writes the detections as txt files; use the official MATLAB tools to evaluate the final results.
python eval_model_ensemble.py. Before running this file, set NORMALIZED_RESULTS_FOR_MODEL_ENSEMBLE=True in cfgs.py and then run eval_visdrone.py to get the normalized txt results.
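
The normalization step matters because weighted boxes fusion in the ensemble-boxes package expects box coordinates scaled to [0, 1]. The sketch below illustrates the idea; it is not the code in eval_model_ensemble.py. The helper names, the equal model weights, and the `iou_thr` and `skip_box_thr` values are assumptions chosen for illustration. The only library call used is `weighted_boxes_fusion` from ensemble-boxes, imported lazily so the helpers work without the package installed.

```python
def normalize_boxes(boxes, image_width, image_height):
    """Scale [xmin, ymin, xmax, ymax] pixel boxes into [0, 1], clipping to the image."""
    def clip(value):
        return min(max(value, 0.0), 1.0)

    return [
        [clip(xmin / image_width), clip(ymin / image_height),
         clip(xmax / image_width), clip(ymax / image_height)]
        for xmin, ymin, xmax, ymax in boxes
    ]


def denormalize_boxes(boxes, image_width, image_height):
    """Map normalized [xmin, ymin, xmax, ymax] boxes back to pixel coordinates."""
    return [
        [xmin * image_width, ymin * image_height,
         xmax * image_width, ymax * image_height]
        for xmin, ymin, xmax, ymax in boxes
    ]


def fuse_two_models(det_a, det_b, image_width, image_height,
                    iou_thr=0.6, skip_box_thr=0.001):
    """Fuse two models' detections for one image.

    det_a and det_b are (boxes, scores, labels) tuples with pixel boxes.
    The thresholds and equal weights are illustrative, not the repo's settings.
    """
    from ensemble_boxes import weighted_boxes_fusion  # requires ensemble-boxes

    boxes, scores, labels = weighted_boxes_fusion(
        [normalize_boxes(det_a[0], image_width, image_height),
         normalize_boxes(det_b[0], image_width, image_height)],
        [det_a[1], det_b[1]],
        [det_a[2], det_b[2]],
        weights=[1, 1],
        iou_thr=iou_thr,
        skip_box_thr=skip_box_thr,
    )
    pixel_boxes = denormalize_boxes(boxes.tolist(), image_width, image_height)
    return pixel_boxes, scores.tolist(), labels.tolist()
```

Here `det_a` and `det_b` would hold the ResNet50-vd and ResNet101-vd detections for the same image.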

Visualization

Modify the parameters in ./core/config/cfgs.py
python image_demo.py produces the visualized results.

Visualized Result (multi-scale training+multi-scale testing)

Test Result (Validation set):

1. ResNet50-vd

Name	maxDets	Result(s/m)
Average Precision (AP) @( IoU=0.50:0.95)	maxDets=500	31.26%/35.1%
Average Precision (AP) @( IoU=0.50 )	maxDets=500	56.44%/60.29%
Average Precision (AP) @( IoU=0.75 )	maxDets=500	30.13%/35.42%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 1	0.78%/0.58%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 10	6.62%/6.05%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=100	38.21%/40.99%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=500	48.41%/53%

"s" means single-scale training + single-scale testing; "m" means multi-scale training + multi-scale testing.

2. ResNet101-vd

Name	maxDets	Result(s/m)
Average Precision (AP) @( IoU=0.50:0.95)	maxDets=500	31.7%/35.98%
Average Precision (AP) @( IoU=0.50 )	maxDets=500	56.94%/61.64%
Average Precision (AP) @( IoU=0.75 )	maxDets=500	30.59%/36.13%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 1	0.67%/0.61%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 10	6.29%/6.13%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=100	38.66%/42.33%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=500	49.29%/53.68%

3. Model Ensemble (ResNet101-vd+ResNet50-vd)

Name	maxDets	Result
Average Precision (AP) @( IoU=0.50:0.95)	maxDets=500	36.76%
Average Precision (AP) @( IoU=0.50 )	maxDets=500	62.33%
Average Precision (AP) @( IoU=0.75 )	maxDets=500	37.41%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 1	0.59%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 10	6.06%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=100	42.57%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=500	54.53%

You can download the trained weights (ResNet50vd, 101vd) from Baidu Yun (extraction code: 9u9m) or Google Drive, then put them into ./saved_weights

Reference

1、https://github.com/DetectionTeamUCAS/Faster-RCNN_Tensorflow
2、https://github.com/open-mmlab/mmdetection
3、https://github.com/ZFTurbo/Weighted-Boxes-Fusion
4、https://github.com/kobiso/CBAM-tensorflow-slim
5、https://github.com/SJTU-Thinklab-Det/DOTA-DOAI
6、https://github.com/Viredery/tf-eager-fasterrcnn
7、https://github.com/VisDrone/VisDrone2018-DET-toolkit
8、https://github.com/YunYang1994/tensorflow-yolov3
9、https://github.com/zhpmatrix/VisDrone2018
