Morphable Detector for Object Detection on Demand

Overview

Morphable Detector for Object Detection on Demand

(ICCV 2021) PyTorch implementation of the paper Morphable Detector for Object Detection on Demand.

teaser

If our project is helpful for your research, please consider citing:

@inproceedings{zhaomorph,
  author  = {Xiangyun Zhao, Xu Zou, Ying Wu},
  title   = {Morphable Detector for Object Detection on Demand},
  booktitle = {ICCV},
  Year  = {2021}
}

Install

First, install PyTorch and torchvision. We have tested on version of 1.8.0 with CUDA 11.0, but the other versions should also be working.

Our code is based on maskrcnn-benchmark, so you should install all dependencies.

Data Preparation

Download large scale few detection dataset here and covert the data into COCO dataset format. The file structure should look like:

  $ tree data
  dataset
  ├──fsod
      ├── annototation
      │   
      ├── images

Training (EM-like approach)

We follow FSOD Paper to pretrain the model using COCO dataset for 200,000 iterations. So, you can download the COCO pretrain model here, and use it to initilize the network.

We first initialize the prototypes using semantic vectors, then train the network run:

export NGPUS=2
RND_PORT=`shuf -i 4000-7999 -n 1`

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --master_port $RND_PORT --nproc_per_node=$NGPUS ./tools/train_sem_net.py \
--config-file "./configs/fsod/e2e_faster_rcnn_R_50_FPN_1x.yaml"  OUTPUT_DIR "YOUR_OUTPUT_PATH" \
MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN  2000 SOLVER.IMS_PER_BATCH 4 SOLVER.MAX_ITER 270000 \
SOLVER.STEPS "(50000,70000)" SOLVER.CHECKPOINT_PERIOD 10000 \
SOLVER.BASE_LR 0.002  

Then, to update the prototypes, we first extract the features for the training samples by running:

export NGPUS=2
RND_PORT=`shuf -i 4000-7999 -n 1`

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --master_port $RND_PORT --nproc_per_node=$NGPUS \
./tools/train_sem_net.py --config-file "./configs/fsod/e2e_faster_rcnn_R_50_FPN_1x.yaml"  \ 
FEATURE_DIR "features" OUTPUT_DIR "WHERE_YOU_SAVE_YOUR_MODEL" \
FEATURE_SIZE 200 SEM_DIR "visual_sem.txt" GET_FEATURE True \
MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN  2000 \
SOLVER.IMS_PER_BATCH 4 SOLVER.MAX_ITER 80000 \
SOLVER.CHECKPOINT_PERIOD 10000000

To compute the mean vectors and update the prototypes, run

cd features

python mean_features.py FEATURE_FILE MEAN_FEATURE_FILE
python update_prototype.py MEAN_FEATURE_FILE

To train the network using the updated prototypes, run

export NGPUS=2
RND_PORT=`shuf -i 4000-7999 -n 1`

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --master_port $RND_PORT --nproc_per_node=$NGPUS \
./tools/train_sem_net.py --config-file "./configs/fsod/e2e_faster_rcnn_R_50_FPN_1x.yaml"  \
SEM_DIR "PATH_WHERE_YOU_SAVE_THE_PROTOTYPES" VISUAL True OUTPUT_DIR "WHERE_YOU_SAVE_YOUR_MODEL" \ 
MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN  2000 SOLVER.IMS_PER_BATCH 4 \
SOLVER.MAX_ITER 70000 SOLVER.STEPS "(50000,80000)" \
SOLVER.CHECKPOINT_PERIOD 10000 \
SOLVER.BASE_LR 0.002 

Tests

After the model is trained, we randomly sample 5 samples for each novel category from the test data and use the mean feature vectors for the 5 samples as the prototype for that categpry. The results with different sample selection may vary a bit. To reproduce the results, we provide the features we extracted from our final model. But you can still extract your own features from your trained model.

To extract the features for test data, run

export NGPUS=2
RND_PORT=`shuf -i 4000-7999 -n 1`

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --master_port $RND_PORT --nproc_per_node=$NGPUS \
./tools/train_sem_net.py --config-file "./configs/fsod/e2e_faster_rcnn_R_50_FPN_1x.yaml"  \ 
FEATURE_DIR "features" OUTPUT_DIR "WHERE_YOU_SAVE_YOUR_MODEL" \
FEATURE_SIZE 200 SEM_DIR "visual_sem.txt" GET_FEATURE True \
MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN  2000 \
SOLVER.IMS_PER_BATCH 4 SOLVER.MAX_ITER 80000 \
SOLVER.CHECKPOINT_PERIOD 10000000

To compute the prototype for each class (online morphing), run

cd features

python mean_features.py FEATURE_FILE MEAN_FEATURE_FILE

Then run test,

export NGPUS=2
RND_PORT=`shuf -i 4000-7999 -n 1`

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --master_port $RND_PORT --nproc_per_node=$NGPUS ./tools/test_sem_net.py --config-file "./configs/fsod/e2e_faster_rcnn_R_50_FPN_1x.yaml" SEM_DIR WHERE_YOU_SAVE_THE_PROTOTYPES VISUAL True OUTPUT_DIR WHERE_YOU_SAVE_THE_MODEL MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000 FEATURE_SIZE 200 MODEL.ROI_BOX_HEAD.NUM_CLASSES 201 TEST_SCALE 0.7

Models

Our pre-trained ResNet-50 models can be downloaded as following:

name iterations AP AP^{0.5} model Mean Features
MD 70,000 22.2 37.9 download download
name iterations AP AP^{0.5} Mean Features
MD 1-shot 70,000 19.6 33.3 download
MD 2-shot 70,000 20.9 35.7 download
MD 5-shot 70,000 22.2 37.9 download
Owner
Ph.D. student at EECS department, Northwestern University
Advanced yabai wooting scripts

Yabai Wooting scripts Installation requirements Both https://github.com/xiamaz/python-yabai-client and https://github.com/xiamaz/python-wooting-rgb ne

Max Zhao 3 Dec 31, 2021
A research toolkit for particle swarm optimization in Python

PySwarms is an extensible research toolkit for particle swarm optimization (PSO) in Python. It is intended for swarm intelligence researchers, practit

Lj Miranda 1k Dec 30, 2022
Automated Attendance Project Using Face Recognition

dependencies for project: cmake 3.22.1 dlib 19.22.1 face-recognition 1.3.0 openc

Rohail Taha 1 Jan 09, 2022
Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

The Alan Turing Institute 112 Oct 23, 2022
The code of paper 'Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection'

Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection Pytorch implemetation of paper 'Learning to Aggregate and Personalize

Tencent YouTu Research 136 Dec 29, 2022
Source code for Fixed-Point GAN for Cloud Detection

FCD: Fixed-Point GAN for Cloud Detection PyTorch source code of Nyborg & Assent (2020). Abstract The detection of clouds in satellite images is an ess

Joachim Nyborg 8 Dec 22, 2022
OSLO: Open Source framework for Large-scale transformer Optimization

O S L O Open Source framework for Large-scale transformer Optimization What's New: December 21, 2021 Released OSLO 1.0. What is OSLO about? OSLO is a

TUNiB 280 Nov 24, 2022
Mixup for Supervision, Semi- and Self-Supervision Learning Toolbox and Benchmark

OpenSelfSup News Downstream tasks now support more methods(Mask RCNN-FPN, RetinaNet, Keypoints RCNN) and more datasets(Cityscapes). 'GaussianBlur' is

AI Lab, Westlake University 332 Jan 03, 2023
This project hosts the code for implementing the ISAL algorithm for object detection and image classification

Influence Selection for Active Learning (ISAL) This project hosts the code for implementing the ISAL algorithm for object detection and image classifi

25 Sep 11, 2022
MohammadReza Sharifi 27 Dec 13, 2022
Test-Time Personalization with a Transformer for Human Pose Estimation, NeurIPS 2021

Transforming Self-Supervision in Test Time for Personalizing Human Pose Estimation This is an official implementation of the NeurIPS 2021 paper: Trans

41 Nov 28, 2022
Toolbox of models, callbacks, and datasets for AI/ML researchers.

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch Website • Installation • Main

Pytorch Lightning 1.4k Dec 30, 2022
Toward Multimodal Image-to-Image Translation

BicycleGAN Project Page | Paper | Video Pytorch implementation for multimodal image-to-image translation. For example, given the same night image, our

Jun-Yan Zhu 1.4k Dec 22, 2022
Localizing Visual Sounds the Hard Way

Localizing-Visual-Sounds-the-Hard-Way Code and Dataset for "Localizing Visual Sounds the Hard Way". The repo contains code and our pre-trained model.

Honglie Chen 58 Dec 07, 2022
Pytoydl: A toy deep learning framework built upon numpy.

Documents: https://pytoydl.readthedocs.io/zh/latest/ Pytoydl A toy deep learning framework built upon numpy. You can star this repository to keep trac

28 Dec 10, 2022
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22) Paper Link | Project Page Abstract : Manual an

Mohamed Afham 152 Dec 23, 2022
This is the repo for the paper "Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement".

Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement This is the repository for the paper "Improving the Accuracy-Memory Trad

3 Dec 29, 2022
Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

A Differentiable Recurrent Surface for Asynchronous Event-Based Data Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous

Marco Cannici 21 Oct 05, 2022
TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

凌逆战 16 Dec 30, 2022
Deformable DETR is an efficient and fast-converging end-to-end object detector.

Deformable DETR: Deformable Transformers for End-to-End Object Detection.

2k Jan 05, 2023