Learning recognition/segmentation models without end-to-end training. 40%-60% less GPU memory footprint. Same training time. Better performance.

Last update: Dec 27, 2022

InfoPro-Pytorch

The Information Propagation algorithm for training deep networks with local supervision.

(ICLR 2021) Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

Update on 2021/01/25: Released pre-trained models on ImageNet and Cityscapes.

Update on 2021/01/24: Released code for image classification on CIFAR/SVHN/STL10/ImageNet and semantic segmentation on Cityscapes.

Introduction

We propose Information Propagation (InfoPro), a locally supervised deep learning algorithm derived from an information-theoretic perspective. By splitting a deep network into multiple local modules and training each of them with a local InfoPro loss, we reduce the GPU memory footprint by 40-60% without notable extra computational cost or training time, while moderately improving performance.
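To make the idea of gradient-isolated local modules concrete, here is a minimal PyTorch sketch. It is not the code in this repo: `LocalModule`, `aux_head` and `local_train_step` are hypothetical names, and a plain cross-entropy loss on an auxiliary classifier stands in for the local InfoPro loss. It only illustrates how detaching the features between modules keeps each backward pass inside one module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalModule(nn.Module):
    """One gradient-isolated block plus a hypothetical auxiliary classifier head."""

    def __init__(self, in_ch, out_ch, num_classes):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )
        self.aux_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(out_ch, num_classes)
        )

    def forward(self, x):
        h = self.body(x)
        return h, self.aux_head(h)


def local_train_step(modules, optimizers, x, y):
    """Update every module from its own local loss; return the per-module losses."""
    losses = []
    h = x
    for module, opt in zip(modules, optimizers):
        h, logits = module(h)
        # Assumption: cross-entropy is a stand-in, NOT the InfoPro loss itself.
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()  # the graph only spans this module, so it is freed here
        opt.step()
        losses.append(loss.item())
        h = h.detach()  # cut the graph: the next module cannot backprop into this one
    return losses


torch.manual_seed(0)
modules = [LocalModule(3, 8, 10), LocalModule(8, 16, 10)]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.1) for m in modules]
x, y = torch.randn(4, 3, 16, 16), torch.randint(0, 10, (4,))
losses = local_train_step(modules, optimizers, x, y)
```

Because each module's activations can be released as soon as its own backward pass finishes, only one module's graph has to be held in memory at a time, which is where the memory saving of local supervision comes from.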

Citation

If you find this work valuable or use our code in your own research, please consider citing us with the following BibTeX entry:

@inproceedings{wang2021revisiting,
        title = {Revisiting Locally Supervised Learning: an Alternative to End-to-end Training},
       author = {Yulin Wang and Zanlin Ni and Shiji Song and Le Yang and Gao Huang},
    booktitle = {International Conference on Learning Representations (ICLR)},
         year = {2021},
          url = {https://openreview.net/forum?id=fAbkE6ant2}
}

Get Started

Please see the folders Experiments on CIFAR-SVHN-STL10, Experiments on ImageNet, and Semantic segmentation for task-specific documentation.

Results

CIFAR & STL-10

ImageNet

Semantic Segmentation

GPU Memory Cost

In the paper, we report the minimum GPU memory required to run the InfoPro* algorithm with torch.backends.cudnn.benchmark=True (for practical acceleration). Note that this figure can differ, sometimes substantially, from the memory usage printed by nvidia-smi.
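One reason the two numbers differ is that nvidia-smi shows everything the process holds on the device, including the CUDA context and blocks cached by PyTorch's allocator, whereas PyTorch's own counters track only tensor allocations. The sketch below shows one possible way to read the peak allocated memory for a single training step. It is not necessarily how the figures in the paper were obtained, and `peak_allocated_mib` is a hypothetical helper name.

```python
import torch


def peak_allocated_mib(step_fn):
    """Run step_fn once; return peak tensor memory in MiB, or None without CUDA."""
    if not torch.cuda.is_available():
        return None
    torch.backends.cudnn.benchmark = True  # same setting as mentioned above
    torch.cuda.reset_peak_memory_stats()   # forget peaks from earlier work
    step_fn()
    torch.cuda.synchronize()               # make sure all kernels have finished
    return torch.cuda.max_memory_allocated() / 2**20


def dummy_step():
    """Hypothetical stand-in for one forward/backward pass of a real model."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(256, 256).to(device)
    out = model(torch.randn(32, 256, device=device))
    out.sum().backward()


peak = peak_allocated_mib(dummy_step)
```

On a machine without a GPU the helper returns None; with a GPU, the value it returns will usually be lower than what nvidia-smi reports for the same process.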

Contact

This repo is a re-implementation of our original code. If you have any questions, please feel free to contact the authors. Yulin Wang: [email protected].

Acknowledgments

Our semantic segmentation code comes from MMSegmentation. We highly appreciate their awesome work!
