PyTorch implementation of: Michieli U. and Zanuttigh P., "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations", CVPR 2021.

Related tags

Deep LearningSDR
Overview

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

This is the official PyTorch implementation of our work: "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations" published at CVPR 2021.

In this paper, we present some novel approaches constraining the feature space for continual learning semantic segmentation models. The evaluation on Pascal VOC2012 and on ADE20K validated our method.

Paper
5-min video
slides
poster
teaser

Requirements

This repository uses the following libraries:

  • Python (3.7.6)
  • Pytorch (1.4.0) [tested up to 1.7.1]
  • torchvision (0.5.0)
  • tensorboardX (2.0)
  • matplotlib (3.1.1)
  • numpy (1.18.1)
  • apex (0.1) [optional]
  • inplace-abn (1.0.7) [optional]

We also assume to have installed pytorch.distributed package.

All the dependencies are listed in the requirements.txt file which can be used in conda as:
conda create --name <env> --file requirements.txt

How to download data

In this project we use two dataset, ADE20K and Pascal-VOC 2012. We provide the scripts to download them in 'data/download_<dataset_name>.sh'. The script takes no inputs and you should use it in the target directory (where you want to download data).

How to perform training

The most important file is run.py, that is in charge to start the training or test procedure. To run it, simpy use the following command:

python -m torch.distributed.launch --nproc_per_node=<num_GPUs> run.py --data_root <data_folder> --name <exp_name> .. other args ..

The default is to use a pretraining for the backbone used, which is the one officially released by PyTorch models and it will be downloaded automatically. If you don't want to use pretrained, please use --no-pretrained.

There are many options and you can see them all by using --help option. Some of them are discussed in the following:

  • please specify the data folder using: --data_root <data_root>
  • dataset: --dataset voc (Pascal-VOC 2012) | ade (ADE20K)
  • task: --task <task>, where tasks are
    • 15-5, 15-5s, 19-1 (VOC), 100-50, 100-10, 50, 100-50b, 100-10b, 50b (ADE, b indicates the order)
  • step (each step is run separately): --step <N>, where N is the step number, starting from 0
  • (only for Pascal-VOC) disjoint is default setup, to enable overlapped: --overlapped
  • learning rate: --lr 0.01 (for step 0) | 0.001 (for step > 0)
  • batch size: --batch_size 8 (Pascal-VOC 2012) | 4 (ADE20K)
  • epochs: --epochs 30 (Pascal-VOC 2012) | 60 (ADE20K)
  • method: --method <method name>, where names are
    • FT, LWF, LWF-MC, ILT, EWC, RW, PI, MIB, CIL, SDR
      Note that method overwrites other parameters, but can be used as a kickstart to use default parameters for each method (see more on this in the hyperparameters section below)

For all the details please follow the information provided using the help option.

Example training commands

We provide some example scripts in the *.slurm and *.bat files.
For instance, to run the step 0 of 19-1 VOC2012 you can run:

python -u -m torch.distributed.launch 1> 'outputs/19-1/output_19-1_step0.txt' 2>&1 \
--nproc_per_node=1 run.py \
--batch_size 8 \
--logdir logs/19-1/ \
--dataset voc \
--name FT \
--task 19-1 \
--step 0 \
--lr 0.001 \
--epochs 30 \
--debug \
--sample_num 10 \
--unce \
--loss_de_prototypes 1 \
--where_to_sim GPU_windows

Note: loss_de_prototypes is set to 1 only for having the prototypes computed in the 0-th step (no distillation is actually computed of course).

Then, the step 1 of the same scenario can be computed simply as:

python -u -m torch.distributed.launch 1> 'outputs/19-1/output_19-1_step1.txt'  2>&1 \
--nproc_per_node=1 run.py \
--batch_size 8 \
--logdir logs/19-1/ \
--dataset voc \
--task 19-1 \
--step 1 \
--lr 0.0001 \
--epochs 30 \
--debug \
--sample_num 10 \
--where_to_sim GPU_windows \
--method SDR \
--step_ckpt 'logs/19-1/19-1-voc_FT/19-1-voc_FT_0.pth'

The results obtained are reported inside the outputs/ and logs/ folder, which can be downloaded here, and are 0.4% of mIoU higher than those reported in the main paper due to a slightly changed hyperparameter.

To run other approaches it is sufficient to change the --method parameter into one of the following: FT, LWF, LWF-MC, ILT, EWC, RW, PI, MIB, CIL, SDR.

Note: for the best results, the hyperparameters may change. Please see further details on the hyperparameters section below.

Once you trained the model, you can see the result on tensorboard (we perform the test after the whole training) or on the output files. or you can test it by using the same script and parameters but using the option

--test

that will skip all the training procedure and test the model on test data.

Do you want to try our constraints on your codebase or task?

If you want to try our novel constraints on your codebase or on a different problem you can check the utils/loss.py file. Here, you can take the definitions of the different losses and embed them into your codebase
The names of the variables could be interpreted as:

  • targets-- ground truth map,
  • outputs-- segmentation map output from the current network
  • outputs_old-- segmentation map output from the previous network
  • features-- features taken from the end of the currently-trained encoder,
  • features_old-- features taken from the end of the previous encoder [used for distillation on the encoder on ILT, but not used on SDR],
  • prototypes-- prototypical feature representations
  • incremental_step -- index of the current incremental step (0 if first non-incremental training is performed)
  • classes_old-- index of previous classes

Range for the Hyper-parameters

For what concerns the hyperparameters of our approach:

  • The parameter for the distillation loss is in the same range of that of MiB,
  • Prototypes matching: lambda was searched in range 1e-1 to 1e-3,
  • Contrastive learning (or clustering): lambda was searched in the range of 1e-2 to 1e-3,
  • Features sparsification: lambda was searched in the range of 1e-3 to 1e-5 A kick-start could be to use KD 10, PM 1e-2, CL 1e-3 and FS 1e-4.
    The best parameters may vary across datasets and incremental setup. However, we typically did a grid search and kept it fixed across learning steps.

So, writing explicitly all the parameters, the command would look something like the following:

python -u -m torch.distributed.launch 1> 'outputs/19-1/output_19-1_step1_custom.txt'  2>&1 \
--nproc_per_node=1 run.py \
--batch_size 8 \
--logdir logs/19-1/ \
--dataset voc \
--task 19-1 \
--step 1 \
--lr 0.0001 \
--epochs 30 \
--debug \
--sample_num 10 \
--where_to_sim GPU_windows \
--unce \
--loss_featspars $loss_featspars \
--lfs_normalization $lfs_normalization \
--lfs_shrinkingfn $lfs_shrinkingfn \
--lfs_loss_fn_touse $lfs_loss_fn_touse \
--loss_de_prototypes $loss_de_prototypes \
--loss_de_prototypes_sumafter \
--lfc_sep_clust $lfc_sep_clust \
--loss_fc $loss_fc \
--loss_kd $loss_kd \
--step_ckpt 'logs/19-1/19-1-voc_FT/19-1-voc_FT_0.pth'

Cite us

If you use this repository, please consider to cite

   @inProceedings{michieli2021continual,
   author = {Michieli, Umberto and Zanuttigh, Pietro},
   title  = {Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations},
   booktitle = {Computer Vision and Pattern Recognition (CVPR)},
   year      = {2021},
   month     = {June}
   }

And our previous works ILT and its journal extension.

Acknowledgements

We gratefully acknowledge the authors of MiB paper for the insightful discussion and for providing the open source codebase, which has been the starting point for our work.
We also acknowledge the authors of CIL for providing their code even before the official release.

Owner
Multimedia Technology and Telecommunication Lab
Department of Information Engineering, University of Padova
Multimedia Technology and Telecommunication Lab
PyTorch common framework to accelerate network implementation, training and validation

pytorch-framework PyTorch common framework to accelerate network implementation, training and validation. This framework is inspired by works from MML

Dongliang Cao 3 Dec 19, 2022
Spectrum is an AI that uses machine learning to generate Rap song lyrics

Spectrum Spectrum is an AI that uses deep learning to generate rap song lyrics. View Demo Report Bug Request Feature Open In Colab About The Project S

39 Dec 16, 2022
VLGrammar: Grounded Grammar Induction of Vision and Language

VLGrammar: Grounded Grammar Induction of Vision and Language

Yining Hong 27 Dec 23, 2022
A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

Meta Research 431 Jan 07, 2023
L-Verse: Bidirectional Generation Between Image and Text

Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalabilty

Kim, Taehoon 102 Dec 21, 2022
Planar Prior Assisted PatchMatch Multi-View Stereo

ACMP [News] The code for ACMH is released!!! [News] The code for ACMM is released!!! About This repository contains the code for the paper Planar Prio

Qingshan Xu 127 Dec 31, 2022
An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Neural Attention Distillation This is an implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep

Yige-Li 84 Jan 04, 2023
The code of NeurIPS 2021 paper "Scalable Rule-Based Representation Learning for Interpretable Classification".

Rule-based Representation Learner This is a PyTorch implementation of Rule-based Representation Learner (RRL) as described in NeurIPS 2021 paper: Scal

Zhuo Wang 53 Dec 17, 2022
Volsdf - Volume Rendering of Neural Implicit Surfaces

Volume Rendering of Neural Implicit Surfaces Project Page | Paper | Data This re

Lior Yariv 221 Jan 07, 2023
An example of semantic segmentation using tensorflow in eager execution.

Semantic segmentation using Tensorflow eager execution Requirement Python 2.7+ Tensorflow-gpu OpenCv H5py Scikit-learn Numpy Imgaug Train with eager e

Iñigo Alonso Ruiz 25 Sep 29, 2022
Özlem Taşkın 0 Feb 23, 2022
CaLiGraph Ontology as a Challenge for Semantic Reasoners ([email protected]'21)

CaLiGraph for Semantic Reasoning Evaluation Challenge This repository contains code and data to use CaLiGraph as a benchmark dataset in the Semantic R

Nico Heist 0 Jun 08, 2022
🦕 NanoSaur is a little tracked robot ROS2 enabled, made for an NVIDIA Jetson Nano

🦕 nanosaur NanoSaur is a little tracked robot ROS2 enabled, made for an NVIDIA Jetson Nano Website: nanosaur.ai Do you need an help? Discord For tech

NanoSaur 162 Dec 09, 2022
This repository implements WGAN_GP.

Image_WGAN_GP This repository implements WGAN_GP. Image_WGAN_GP This repository uses wgan to generate mnist and fashionmnist pictures. Firstly, you ca

Lieon 6 Dec 10, 2021
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt)

Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt) Task Training huge unsupervised deep neural networks yields to strong progress in

Oliver Hahn 1 Jan 26, 2022
HyperaPy: An automatic hyperparameter optimization framework ⚡🚀

hyperpy HyperPy: An automatic hyperparameter optimization framework Description HyperPy: Library for automatic hyperparameter optimization. Build on t

Sergio Mora 7 Sep 06, 2022
Pre-Trained Image Processing Transformer (IPT)

Pre-Trained Image Processing Transformer (IPT) By Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Cha

HUAWEI Noah's Ark Lab 332 Dec 18, 2022
LiDAR R-CNN: An Efficient and Universal 3D Object Detector

LiDAR R-CNN: An Efficient and Universal 3D Object Detector Introduction This is the official code of LiDAR R-CNN: An Efficient and Universal 3D Object

TuSimple 295 Jan 05, 2023
Codebase for arXiv preprint "NeRF++: Analyzing and Improving Neural Radiance Fields"

NeRF++ Codebase for arXiv preprint "NeRF++: Analyzing and Improving Neural Radiance Fields" Work with 360 capture of large-scale unbounded scenes. Sup

Kai Zhang 722 Dec 28, 2022
Meta-meta-learning with evolution and plasticity

Evolve plastic networks to be able to automatically acquire novel cognitive (meta-learning) tasks

5 Jun 28, 2022