Learning where to learn - Gradient sparsity in meta and continual learning

Last update: Dec 09, 2022

In this paper, we investigate gradient sparsity found by MAML in various continual and few-shot learning scenarios.
Instead of only learning the initialization of the neural network parameters, we additionally meta-learn a set of parameters that sit underneath a step function: wherever such a parameter is smaller than 0, gradient descent on the corresponding network parameter is stopped.

We term this version Sparse-MAML - Link to the paper here.

Interestingly, we see that structured sparsity emerges in both the classic 4-layer ConvNet and a ResNet-12 for few-shot learning. This is accompanied by improved robustness and generalisation across many hyperparameters.

Note that Sparse-MAML is an extremely simple variant of MAML: it can only switch the training of specific parameters on or off, rather than perform proper gradient modulation.
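As a rough illustration of the gating described above, the sketch below applies one masked gradient-descent step in plain Python. It is not taken from this codebase: the names `step`, `inner_step`, and `mask_params`, the learning rate, and all numeric values are assumptions made for the example, and how the mask parameters themselves are meta-learned is not shown.

```python
# Illustrative sketch only. All names and values are hypothetical and are
# not taken from the repository.

def step(m):
    """Step function: 1.0 if the mask parameter is >= 0, else 0.0."""
    return 1.0 if m >= 0 else 0.0


def inner_step(theta, grads, mask_params, lr=0.1):
    """One gradient-descent step in which each parameter is gated by its mask.

    Parameters whose mask parameter is smaller than 0 are left unchanged.
    """
    return [t - lr * step(m) * g for t, g, m in zip(theta, grads, mask_params)]


theta = [1.0, -2.0, 0.5]        # meta-learned initialization (made-up values)
grads = [0.5, 0.5, -1.0]        # task-loss gradients (made-up values)
mask_params = [0.3, -0.7, 0.0]  # meta-learned values underneath the step function

adapted = inner_step(theta, grads, mask_params)
print(adapted)  # the second parameter keeps its initial value
```

Replacing `step(m)` with a real-valued scaling factor would give a form of gradient modulation; the binary gate is the restricted case that only switches learning on or off.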

This codebase implements the few-shot learning experiments presented in the paper. To reproduce the results in the paper, please follow these instructions:

Installation

#1. Create a conda env:

conda create -n sparse-MAML

#2. Activate the env:

source activate sparse-MAML

#3. Install anaconda:

conda install anaconda

#4. Install extra requirements (make sure you use the correct pip3):

pip3 install -r requirements.txt

#5. Make the run script executable:

chmod u+x run_sparse_MAML.sh

#6. Execute the script:

./run_sparse_MAML.sh

Results

MiniImageNet few-shot accuracy (%)	MAML	ANIL	BOIL	sparse-MAML	sparse-ReLU-MAML
5-way 5-shot | ConvNet	63.15	61.50	66.45	67.03	64.84
5-way 1-shot | ConvNet	48.07	46.70	49.61	50.35	50.39
5-way 5-shot | ResNet12	69.36	70.03	70.50	70.02	73.01
5-way 1-shot | ResNet12	53.91	55.25	-	55.02	56.39

BOIL results are taken from the original paper.
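To make the table easier to compare at a glance, the following sketch recomputes the best method per row and the gain of sparse-MAML over MAML. The numbers are copied from the table above; the helper names `best_method` and `gain_over_maml` are hypothetical and exist only for this example.

```python
# Values copied from the results table; None marks the missing BOIL entry.
methods = ["MAML", "ANIL", "BOIL", "sparse-MAML", "sparse-ReLU-MAML"]
results = {
    "5-way 5-shot | ConvNet": [63.15, 61.50, 66.45, 67.03, 64.84],
    "5-way 1-shot | ConvNet": [48.07, 46.70, 49.61, 50.35, 50.39],
    "5-way 5-shot | ResNet12": [69.36, 70.03, 70.50, 70.02, 73.01],
    "5-way 1-shot | ResNet12": [53.91, 55.25, None, 55.02, 56.39],
}


def best_method(scores):
    """Return the name of the method with the highest reported score."""
    reported = [(s, m) for s, m in zip(scores, methods) if s is not None]
    return max(reported)[1]


def gain_over_maml(scores, method):
    """Difference between `method` and MAML, rounded to two decimals."""
    return round(scores[methods.index(method)] - scores[0], 2)


for setting, scores in results.items():
    print(setting, best_method(scores), gain_over_maml(scores, "sparse-MAML"))
```

In this table, sparse-MAML is ahead of plain MAML in every row, and sparse-ReLU-MAML has the highest score in three of the four settings.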

This codebase builds heavily on torchmeta.

Owner

Johannes Oswald
