RetinaNet-PyTorch - A RetinaNet PyTorch implementation for remote sensing images, with an mAP similar to that of RetinaNet in MMDetection

Overview

🚀 RetinaNet Horizontal Detector Based on PyTorch

This is an implementation of the horizontal-box detector RetinaNet on the remote sensing ship dataset SSDD.
The re-implemented RetinaNet reaches almost the same mAP (iou=0.25, score_iou=0.15) as the MMDetection version.
The link to the original RetinaNet paper is here.
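The mAP above is reported at an IoU threshold of 0.25. As a minimal sketch of what that threshold means for horizontal boxes, the snippet below computes the IoU of two axis-aligned boxes and checks it against the threshold. The `(x1, y1, x2, y2)` box format and the function names are assumptions made for illustration; they are not taken from this repository's code.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) (assumed format)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_ground_truth(pred, gt, iou_thresh=0.25):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return box_iou(pred, gt) >= iou_thresh


if __name__ == "__main__":
    pred = (0, 0, 10, 10)
    gt = (0, 0, 10, 5)
    print(box_iou(pred, gt), matches_ground_truth(pred, gt))
```

A lower threshold such as 0.25 accepts looser localization than the 0.5 commonly used in VOC-style evaluation, so results at different thresholds are not directly comparable.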

🌟 Performance of the implemented RetinaNet Detector

Detection Performance on Inshore image.

Detection Performance on Offshore image.

🎯 Experiment

The SSDD dataset, the trained RetinaNet detector, the ResNet-50 model pre-trained on ImageNet, the loss curve, and the evaluation results are listed below, so you can reproduce my experiment.

  • SSDD dataset BaiduYun extraction code=pa8j
  • gt labels for eval data set BaiduYun extraction code=vqaw (ground-truth)
  • gt labels for train data set BaiduYun extraction code=datk (train-ground-truth)
  • well-trained retinanet detector weight file BaiduYun extraction code=b0e1
  • pre-trained ImageNet resnet-50 weight file BaiduYun extraction code=mmql
  • evaluation metrics(iou=0.25, score_iou=0.15)
| Batch Size | Input Size | mAP (Mine) | mAP (MMDet) | Model Parameters |
|------------|------------|------------|-------------|------------------|
| 32         | 416 x 416  | 0.8828     | 0.8891      | 32.2 M           |
  • Other metrics (Precision/Recall/F1 score)
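| Precision (Mine) | Precision (MMDet) | Recall (Mine) | Recall (MMDet) | F1 score (Mine) | F1 score (MMDet) |
|------------------|-------------------|---------------|----------------|-----------------|------------------|
| 0.8077           | 0.8502            | 0.9062        | 0.91558        | 0.8541          | 0.8817           |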
  • loss curve

  • mAP metrics on training set and val set

  • learning rate curve (using warmup lr rate)
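The F1 scores in the Precision/Recall/F1 table above follow from the listed precision and recall values, since F1 is their harmonic mean. The short check below recomputes them; the function name is chosen for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the table above.
f1_mine = f1_score(0.8077, 0.9062)
f1_mmdet = f1_score(0.8502, 0.91558)

print(f"F1 (Mine):  {f1_mine:.4f}")   # 0.8541
print(f"F1 (MMDet): {f1_mmdet:.4f}")  # 0.8817
```

Both values agree with the table to four decimal places.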

💥 Get Started

Installation

A. Install requirements:

conda create -n retinanet python=3.7
conda activate retinanet
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
pip install -r requirements.txt  

Note: If you run into trouble installing the environment, see check.txt for more details.
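After installing, a quick sanity check can confirm that the key packages are importable before you build the NMS module. The helper below is a hypothetical convenience script, not part of this repository; it only reports whether each package can be found and, if PyTorch is present, whether CUDA is visible.

```python
import importlib.util
import sys


def check_environment(required=("torch", "torchvision")):
    """Report the Python version and whether each required package is installed."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in required:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    if report.get("torch"):
        import torch  # imported lazily so the check also works without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

With the commands above, you would expect Python 3.7 and a PyTorch 1.7.0 build; if `cuda_available` is False, the GPU-based steps are unlikely to work.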

B. Install nms module:

cd utils/HBB_NMS_GPU
make
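For readers unfamiliar with what an NMS module does, here is a minimal pure-Python sketch of greedy non-maximum suppression on horizontal boxes. It is a reference illustration only and not the compiled implementation in utils/HBB_NMS_GPU; the `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format (assumed)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it too much."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # [0, 2]
```

This version is quadratic in the number of boxes, which is one reason detectors usually rely on a compiled or GPU implementation for inference.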

Demo

A. Set project's data path

You should first set the project's data paths in config.py.

# config.py
# Note: all paths should be absolute paths.
data_path = r'/$ROOT_PATH/SSDD_data/'  # absolute data root path  
output_path = r'/$ROOT_PATH/Output/'  # absolute model output path  
  
inshore_data_path = r'/$ROOT_PATH/SSDD_data_InShore/'  # absolute Inshore data path  
offshore_data_path = r'/$ROOT_PATH/SSDD_data_OffShore/'  # absolute Offshore data path  

# An example
$ROOT_PATH
    -SSDD_data/
        -train/  # train set
            -*.jpg
        -val/  # val set
            -*.jpg
        -annotations/  # gt labels in json format (for the coco evaluation method)
            -instances_train.json
            -instances_val.json
        -ground-truth/
            -*.txt  # gt labels in txt format (for the voc evaluation method and for evaluating inshore and offshore scenes)
        -train-ground-truth/
            -*.txt  # gt labels in txt format (for the voc evaluation method)
    -SSDD_data_InShore/
        -images/
            -*.jpg  # inshore scene images
        -ground-truth/
            -*.txt  # inshore scene gt labels
    -SSDD_data_OffShore/
        -images/
            -*.jpg  # offshore scene images
        -ground-truth/
            -*.txt  # offshore scene gt labels

    -Output/
        -checkpoints/
            - path for saved tensorboard log events
        -evaluate/
            - path for saved model detection results used in evaluation (coco/voc/inshore/offshore)
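Because a misplaced folder is an easy mistake to make, it can help to check the layout before training. The sketch below lists which of the data folders from the example tree are missing under a given root; it is a hypothetical helper, not a script shipped with this repository, and it leaves out Output/ on the assumption that output folders may be created at run time.

```python
from pathlib import Path

# Data folders taken from the example tree above, relative to $ROOT_PATH.
EXPECTED_DIRS = [
    "SSDD_data/train",
    "SSDD_data/val",
    "SSDD_data/annotations",
    "SSDD_data/ground-truth",
    "SSDD_data/train-ground-truth",
    "SSDD_data_InShore/images",
    "SSDD_data_InShore/ground-truth",
    "SSDD_data_OffShore/images",
    "SSDD_data_OffShore/ground-truth",
]


def missing_dirs(root):
    """Return the expected sub-folders that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_dirs(root)
    print("layout OK" if not missing else "missing: " + ", ".join(missing))
```

Run it with your own root path as the first argument; an empty result means every data folder from the tree is in place.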

B. Download the trained weight file for the SSDD dataset.

# download the trained pth file and put it in the checkpoints/ folder,
# then run the simple inference script to get the detection result
# you can find the model output predict.jpg in the show_result/ folder.

python show.py --chkpt 54_1595.pth --result_path show_result --pic_name demo1.jpg  

Train

A. Prepare dataset

Structure your dataset files as shown above.

B. Manually set the project's hyperparameters

Manually set the project's hyperparameters in config.py.

1. data file structure (Must Be Set !)
   as shown above.

2. Other settings (Optional)
   if you want to follow my experiment, don't change anything.

C. Train the RetinaNet detector on the SSDD dataset from scratch with a pre-trained resnet-50

C.1 Download the pre-trained resnet-50 pth file

First download the resnet-50 pth file pre-trained on ImageNet and put it in the resnet_pretrained_pth/ folder.

C.2 Train RetinaNet Detector on SSDD Dataset with pre-trained pth file

# with batch size 32, using the voc evaluation method during training, for 50 epochs
python train.py --batch_size 32 --epoch 50 --eval_method voc

# with batch size 32, using the coco evaluation method during training, for 50 epochs
python train.py --batch_size 32 --epoch 50 --eval_method coco

Note: If the classification loss changes slowly, please be patient; it is not a mistake.

Evaluation

A. Evaluate model performance on the val set.

python eval.py --device 0 --evaluate True --FPS False --Offshore False --Inshore False --chkpt 54_1595.pth
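The evaluation commands pass booleans as the strings `True` and `False`. How eval.py parses them is not shown here, but if you write a similar script, note that `type=bool` in argparse treats any non-empty string, including `"False"`, as true. A minimal sketch of a stricter converter follows; `str2bool` and `build_parser` are hypothetical names, and only the flag names are borrowed from the command above.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False'-style strings to bool, rejecting anything else."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    # Hypothetical parser that mirrors two of the flags used above.
    parser = argparse.ArgumentParser(description="sketch of an eval-style CLI")
    parser.add_argument("--evaluate", type=str2bool, default=False)
    parser.add_argument("--FPS", type=str2bool, default=False)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--evaluate", "True", "--FPS", "False"])
    print(args.evaluate, args.FPS)  # True False
```

A converter like this rejects misspelled values such as `Fasle` with a clear error instead of silently accepting them.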

B. Evaluate model performance on InShore and Offshore scenes.

python eval.py --device 0 --evaluate False --FPS False --Offshore True --Inshore True --chkpt 54_1595.pth

C. Evaluate model FPS

python eval.py --device 0 --evaluate False --FPS True --Offshore False --Inshore False --chkpt 54_1595.pth
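As a rough illustration of how an FPS figure can be obtained, the sketch below times repeated calls to an inference function after a short warm-up. It is not the measurement code used by eval.py; `measure_fps` and its parameters are hypothetical, and the dummy workload stands in for a real model.

```python
import time


def measure_fps(infer, inputs, warmup=2):
    """Return the number of inputs processed per second by infer."""
    # Warm-up runs are excluded from timing (caches, lazy initialization).
    for x in inputs[:warmup]:
        infer(x)
    start = time.perf_counter()
    for x in inputs:
        infer(x)
    # On a GPU, call torch.cuda.synchronize() here before reading the clock,
    # because CUDA kernels run asynchronously.
    elapsed = time.perf_counter() - start
    return len(inputs) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    def dummy_infer(x):
        # Stand-in workload for a model forward pass.
        return sum(range(10000))

    print(f"{measure_fps(dummy_infer, list(range(100))):.1f} FPS")
```

Reported FPS depends heavily on hardware, batch size, and input size, so compare numbers only under the same settings.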

💡 References

Thanks to these great works.
https://github.com/ming71/DAL
https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch

Owner
Fang Zhonghao
Email:[email protected]