Semi-supervised learning for object detection

Overview

Source code for STAC: A Simple Semi-Supervised Learning Framework for Object Detection

STAC is a simple yet effective SSL framework for visual object detection along with a data augmentation strategy. STAC deploys highly confident pseudo labels of localized objects from an unlabeled image and updates the model by enforcing consistency via strong augmentation.

This code is only used for research. This is not an official Google product.

Instruction

Install dependencies

Set global enviroment variables.

export PRJROOT=/path/to/your/project/directory/STAC
export DATAROOT=/path/to/your/dataroot
export COCODIR=$DATAROOT/coco
export VOCDIR=$DATAROOT/voc
export PYTHONPATH=$PYTHONPATH:${PRJROOT}/third_party/FasterRCNN:${PRJROOT}/third_party/auto_augment:${PRJROOT}/third_party/tensorpack

Install virtual environment in the root folder of the project

cd ${PRJROOT}

sudo apt install python3-dev python3-virtualenv python3-tk imagemagick
virtualenv -p python3 --system-site-packages env3
. env3/bin/activate
pip install -r requirements.txt

# Make sure your tensorflow version is 1.14 not only in virtual environment but also in
# your machine, 1.15 can cause OOM issues.
python -c 'import tensorflow as tf; print(tf.__version__)'

# install coco apis
pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

(Optional) Install tensorpack

tensorpack with a compatible version is already included at third_party/tensorpack. bash cd ${PRJROOT}/third_party pip install --upgrade git+https://github.com/tensorpack/tensorpack.git

Download COCO/PASCAL VOC data and pre-trained models

Download data

See DATA.md

Download backbone model

cd ${COCODIR}
wget http://models.tensorpack.com/FasterRCNN/ImageNet-R50-AlignPadding.npz

Training

There are three steps:

  • 1. Train a standard detector on labeled data (detection/scripts/coco/train_stg1.sh).
  • 2. Predict pseudo boxes and labels of unlabeled data using the trained detector (detection/scripts/coco/eval_stg1.sh).
  • 3. Use labeled data and unlabeled data with pseudo labels to train a STAC detector (detection/scripts/coco/train_stg2.sh).

Besides instruction at here, detection/scripts/coco/train_stac.sh provides a combined script to train STAC.

detection/scripts/voc/train_stac.sh is a combined script to train STAC on PASCAL VOC.

The following example use labeled data as 10% train2017 and rest 90% train2017 data as unlabeled data.

Step 0: Set variables

cd ${PRJROOT}/detection

# Labeled and Unlabeled datasets
[email protected]
UNLABELED_DATASET=${DATASET}-unlabeled

# PATH to save trained models
CKPT_PATH=result/${DATASET}

# PATH to save pseudo labels for unlabeled data
PSEUDO_PATH=${CKPT_PATH}/PSEUDO_DATA

# Train with 8 GPUs
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

Step 1: Train FasterRCNN on labeled data

. scripts/coco/train_stg1.sh.

Set TRAIN.AUGTYPE_LAB=strong to apply strong data augmentation.

# --simple_path makes train_log/${DATASET}/${EXPNAME} as exact location to save
python3 train_stg1.py \
    --logdir ${CKPT_PATH} --simple_path --config \
    BACKBONE.WEIGHTS=${COCODIR}/ImageNet-R50-AlignPadding.npz \
    DATA.BASEDIR=${COCODIR} \
    DATA.TRAIN="('${DATASET}',)" \
    MODE_MASK=False \
    FRCNN.BATCH_PER_IM=64 \
    PREPROC.TRAIN_SHORT_EDGE_SIZE="[500,800]" \
    TRAIN.EVAL_PERIOD=20 \
    TRAIN.AUGTYPE_LAB='default'

Step 2: Generate pseudo labels of unlabeled data

. scripts/coco/eval_stg1.sh.

Evaluate using COCO metrics and save eval.json

# Check pseudo path
if [ ! -d ${PSEUDO_PATH} ]; then
    mkdir -p ${PSEUDO_PATH}
fi

# Evaluate the model for sanity check
# model-180000 is the last checkpoint
# save eval.json at $PSEUDO_PATH

python3 predict.py \
    --evaluate ${PSEUDO_PATH}/eval.json \
    --load "${CKPT_PATH}"/model-180000 \
    --config \
    DATA.BASEDIR=${COCODIR} \
    DATA.TRAIN="('${UNLABELED_DATASET}',)"

Generate pseudo labels for unlabeled data

Set EVAL.PSEUDO_INFERENCE=True to use original images rather than resized ones for inference.

# Extract pseudo label
python3 predict.py \
    --predict_unlabeled ${PSEUDO_PATH} \
    --load "${CKPT_PATH}"/model-180000 \
    --config \
    DATA.BASEDIR=${COCODIR} \
    DATA.TRAIN="('${UNLABELED_DATASET}',)" \
    EVAL.PSEUDO_INFERENCE=True

Step 3: Train STAC

. scripts/coco/train_stg2.sh.

The dataloader loads pseudo labels from ${PSEUDO_PATH}/pseudo_data.npy.

Apply default augmentation on labeled data and strong augmentation on unlabeled data.

TRAIN.CONFIDENCE and TRAIN.WU are two major parameters of the method.

python3 train_stg2.py \
    --logdir=${CKPT_PATH}/STAC --simple_path \
    --pseudo_path=${PSEUDO_PATH} \
    --config \
    BACKBONE.WEIGHTS=${COCODIR}/ImageNet-R50-AlignPadding.npz \
    DATA.BASEDIR=${COCODIR} \
    DATA.TRAIN="('${DATASET}',)" \
    DATA.UNLABEL="('${UNLABELED_DATASET}',)" \
    MODE_MASK=False \
    FRCNN.BATCH_PER_IM=64 \
    PREPROC.TRAIN_SHORT_EDGE_SIZE="[500,800]" \
    TRAIN.EVAL_PERIOD=20 \
    TRAIN.AUGTYPE_LAB='default' \
    TRAIN.AUGTYPE='strong' \
    TRAIN.CONFIDENCE=0.9 \
    TRAIN.WU=2

Tensorboard

All training logs and tensorboard info are under ${PRJROOT}/detection/train_log. Visualize using

tensorboard --logdir=${PRJROOT}/detection/train_log

Citation

@inproceedings{sohn2020detection,
  title={A Simple Semi-Supervised Learning Framework for Object Detection},
  author={Kihyuk Sohn and Zizhao Zhang and Chun-Liang Li and Han Zhang and Chen-Yu Lee and Tomas Pfister},
  year={2020},
  booktitle={arXiv:2005.04757}
}

Acknowledgement

Owner
Google Research
Google Research
A way to store images in YAML.

YAMLImg A way to store images in YAML. I made this after seeing Roadcrosser's JSON-G because it was too inspiring to ignore this opportunity. Installa

5 Mar 14, 2022
Convert Apple NeuralHash model for CSAM Detection to ONNX.

Apple NeuralHash is a perceptual hashing method for images based on neural networks. It can tolerate image resize and compression.

Asuhariet Ygvar 1.5k Dec 31, 2022
Computer Vision and Pattern Recognition, NUS CS4243, 2022

CS4243_2022 Computer Vision and Pattern Recognition, NUS CS4243, 2022 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : h

Xavier Bresson 142 Dec 15, 2022
Official PyTorch Implementation of Rank & Sort Loss [ICCV2021]

Rank & Sort Loss for Object Detection and Instance Segmentation The official implementation of Rank & Sort Loss. Our implementation is based on mmdete

Kemal Oksuz 229 Dec 20, 2022
A keras implementation of ENet (abandoned for the foreseeable future)

ENet-keras This is an implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from ENet-training (lua-t

Pavlos 115 Nov 23, 2021
INSPIRED: A Transparent Dialogue Dataset for Interactive Semantic Parsing

INSPIRED: A Transparent Dialogue Dataset for Interactive Semantic Parsing Existing studies on semantic parsing focus primarily on mapping a natural-la

7 Aug 22, 2022
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models

merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept

Pranav 39 Nov 21, 2022
Hitters Linear Regression - Hitters Linear Regression With Python

Hitters_Linear_Regression Kullanacağımız veri seti Carnegie Mellon Üniversitesi'

AyseBuyukcelik 2 Jan 26, 2022
Instant neural graphics primitives: lightning fast NeRF and more

Instant Neural Graphics Primitives Ever wanted to train a NeRF model of a fox in under 5 seconds? Or fly around a scene captured from photos of a fact

NVIDIA Research Projects 10.6k Jan 01, 2023
Improving Calibration for Long-Tailed Recognition (CVPR2021)

MiSLAS Improving Calibration for Long-Tailed Recognition Authors: Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia [arXiv] [slide] [BibTeX] Introductio

Jia Research Lab 116 Dec 20, 2022
Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021)

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation (NeurIPS 2021) The implementation of Reducing Infromation Bottleneck for W

Jungbeom Lee 81 Dec 16, 2022
PIKA: a lightweight speech processing toolkit based on Pytorch and (Py)Kaldi

PIKA: a lightweight speech processing toolkit based on Pytorch and (Py)Kaldi PIKA is a lightweight speech processing toolkit based on Pytorch and (Py)

336 Nov 25, 2022
Use AI to generate a optimized stock portfolio

Use AI, Modern Portfolio Theory, and Monte Carlo simulation's to generate a optimized stock portfolio that minimizes risk while maximizing returns. Ho

Greg James 30 Dec 22, 2022
Python implementation of Lightning-rod Agent, the Stack4Things board-side probe

Iotronic Lightning-rod Agent Python implementation of Lightning-rod Agent, the Stack4Things board-side probe. Free software: Apache 2.0 license Websit

2 May 19, 2022
Code release for The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification (TIP 2020)

The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification Code release for The Devil is in the Channels: Mutual-Channel

PRIS-CV: Computer Vision Group 230 Dec 31, 2022
Official implementation of Rich Semantics Improve Few-Shot Learning (BMVC, 2021)

Rich Semantics Improve Few-Shot Learning Paper Link Abstract : Human learning benefits from multi-modal inputs that often appear as rich semantics (e.

Mohamed Afham 11 Jul 26, 2022
Official code of Team Yao at Multi-Modal-Fact-Verification-2022

Official code of Team Yao at Multi-Modal-Fact-Verification-2022 A Multi-Modal Fact Verification dataset released as part of the De-Factify workshop in

Wei-Yao Wang 11 Nov 15, 2022
Dynamic Slimmable Network (CVPR 2021, Oral)

Dynamic Slimmable Network (DS-Net) This repository contains PyTorch code of our paper: Dynamic Slimmable Network (CVPR 2021 Oral). Architecture of DS-

Changlin Li 197 Dec 09, 2022
CLDF dataset derived from Robbeets et al.'s "Triangulation Supports Agricultural Spread" from 2021

CLDF dataset derived from Robbeets et al.'s "Triangulation Supports Agricultural Spread" from 2021 How to cite If you use these data please cite the o

Digital Linguistics 2 Dec 20, 2021
A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization

MADGRAD Optimization Method A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization pip install madgrad Try it out! A best

Meta Research 774 Dec 31, 2022