(CVPR 2022) Energy-based Latent Aligner for Incremental Learning

Last update: Jan 03, 2023

Overview

Accepted to CVPR 2022

Figure: The top part illustrates an incremental learning model trained on a continuum of tasks. While learning the current task $\tau_t$, the latent representation of the data from task $\tau_{t-1}$ gets disturbed, as shown by the red arrows. ELI learns an energy manifold and uses it to counteract this inherent representational shift, as illustrated by the green arrows, thereby alleviating forgetting.

In this work, we propose ELI: Energy-based Latent Aligner for Incremental Learning, which:

Learns an energy manifold for the latent representations such that previous-task latents have low energy and current-task latents have high energy.
Uses this learned manifold to counter the representational shift that happens during incremental learning.

The implicit regularization offered by our methodology can be used as a plug-and-play module in existing incremental learning methods for classification and object detection.
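The idea of a low-energy region for previous-task latents can be pictured with a minimal sketch. It assumes a hand-written quadratic energy (`energy`, centred on a hypothetical `OLD_CENTER`) in place of the learned energy-based model, and it assumes that alignment is done by plain gradient descent on the latent. The names, step size, and number of steps are illustrative only and are not taken from the released code.

```python
# Toy stand-in for the learned energy function: a quadratic bowl centred on
# the region where previous-task latents are assumed to live. ELI learns its
# energy with an energy-based model; this closed form is only an illustration.
OLD_CENTER = [1.0, -2.0, 0.5]  # hypothetical centre of previous-task latents


def energy(z):
    """Low for latents near OLD_CENTER, high for latents far from it."""
    return 0.5 * sum((zi - ci) ** 2 for zi, ci in zip(z, OLD_CENTER))


def energy_grad(z):
    """Analytic gradient of the quadratic energy with respect to z."""
    return [zi - ci for zi, ci in zip(z, OLD_CENTER)]


def align(z, step_size=0.1, n_steps=30):
    """Move a latent downhill on the energy surface by gradient descent."""
    z = list(z)
    for _ in range(n_steps):
        grad = energy_grad(z)
        z = [zi - step_size * gi for zi, gi in zip(z, grad)]
    return z


shifted = [3.0, 0.0, -1.0]  # a latent disturbed by learning the new task
aligned = align(shifted)
print(f"energy before: {energy(shifted):.4f}, after: {energy(aligned):.4f}")
```

In this sketch the aligned latent ends up with lower energy than the shifted one, which is the property the energy manifold is meant to provide for previous-task latents.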

Toy Experiment

A key hypothesis on which we base our methodology is that, while learning a new task, the latent representations get disturbed, which in turn causes catastrophic forgetting of the previous task, and that an energy manifold can be used to align these latents and thereby alleviate forgetting.

Here, we present a proof of concept that supports this hypothesis. We consider a two-task experiment on MNIST, where each task contains a subset of the classes: $\tau_1$ = {0, 1, 2, 3, 4}, $\tau_2$ = {5, 6, 7, 8, 9}.
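The class split above can be sketched as a small helper. This is an assumed illustration, not the released toy-experiment code: `split_by_task` and `example_labels` are hypothetical names, and the labels are made up rather than read from MNIST.

```python
TASK_1_CLASSES = {0, 1, 2, 3, 4}  # tau_1
TASK_2_CLASSES = {5, 6, 7, 8, 9}  # tau_2


def split_by_task(labels):
    """Return the sample indices that belong to each task."""
    tasks = {"tau_1": [], "tau_2": []}
    for idx, label in enumerate(labels):
        if label in TASK_1_CLASSES:
            tasks["tau_1"].append(idx)
        elif label in TASK_2_CLASSES:
            tasks["tau_2"].append(idx)
        else:
            raise ValueError(f"label {label!r} is not an MNIST digit class")
    return tasks


example_labels = [7, 0, 3, 9, 4, 5]  # made-up labels for illustration
splits = split_by_task(example_labels)
print(splits)
```

The model would first be trained on the `tau_1` indices only, then on the `tau_2` indices only, and finally evaluated on the `tau_1` test set to measure forgetting.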

With a 32-dimensional latent space, the accuracy on the $\tau_1$ test set drops to 20.88% after learning the second task. The latent aligner in ELI improves the test accuracy by 62.56 percentage points, to 83.44%. The visualization of a 512-dimensional latent space after learning $\tau_2$, in sub-figure (c), indeed shows cluttering due to representational shift. ELI is able to align the latents, as shown in sub-figure (d), which raises the accuracy from 89.14% to 99.04%.
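As a quick check of the numbers above, the gains can be recomputed in percentage points. The snippet reads 89.14% as the $\tau_1$ accuracy before alignment and 99.04% as the accuracy after alignment for the 512-dimensional case, which is an interpretation of the sentence above; the dictionary layout is purely illustrative.

```python
# tau_1 test accuracy (%) after learning tau_2, before and after alignment.
results = {
    32: {"before": 20.88, "after": 83.44},
    512: {"before": 89.14, "after": 99.04},
}

# Absolute gain in percentage points for each latent dimensionality.
gains = {dim: acc["after"] - acc["before"] for dim, acc in results.items()}

for dim, acc in results.items():
    print(
        f"{dim}-d latent: {acc['before']:.2f}% -> {acc['after']:.2f}% "
        f"(+{gains[dim]:.2f} points)"
    )
```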

The code for these toy experiments is in:

Implicitly Recognizing and Aligning Important Latents

latents.mp4

Each row $i$ shows how the $i^{\text{th}}$ latent dimension is updated by ELI. Different dimensions change by different amounts, and the degree of change is decided implicitly by our energy-based model.
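One simple way to quantify this per-dimension behaviour is the mean absolute change of each latent dimension over a batch. The sketch below is an assumed illustration with made-up latents; `per_dimension_change` is a hypothetical helper and is not part of the released code.

```python
def per_dimension_change(before, after):
    """Mean absolute change of each latent dimension across a batch.

    `before` and `after` are lists of latent vectors of equal shape,
    holding the latents before and after alignment respectively.
    """
    n_samples = len(before)
    n_dims = len(before[0])
    totals = [0.0] * n_dims
    for z_old, z_new in zip(before, after):
        for i in range(n_dims):
            totals[i] += abs(z_new[i] - z_old[i])
    return [total / n_samples for total in totals]


# Made-up latents: two samples, three dimensions each.
before = [[0.0, 1.0, 2.0], [1.0, 1.0, 0.0]]
after = [[0.5, 1.0, 0.0], [1.5, 1.0, 1.0]]

changes = per_dimension_change(before, after)
for i, change in enumerate(changes):
    print(f"dimension {i}: mean |change| = {change:.2f}")
```

In this made-up batch the middle dimension is left untouched while the last one moves the most, mirroring the observation that some dimensions are adjusted far more than others.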

Classification and Detection Experiments

Code and models for the classification and object detection experiments are inside the respective folders:

Each of these is an independent repository. Please treat them separately.

Citation

If you find our research useful, please consider citing us:

@inproceedings{joseph2022Energy,
  title={Energy-based Latent Aligner for Incremental Learning},
  author={Joseph, KJ and Khan, Salman and Khan, Fahad Shahbaz and Anwar, Rao Muhammad and Balasubramanian, Vineeth},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Our Related Work

Open-world Detection Transformer, CVPR 2022. Paper | Code
Towards Open World Object Detection, CVPR 2021. (Oral) Paper | Code
Incremental Object Detection via Meta-learning, TPAMI 2021. Paper | Code

Owner

Joseph K J
