Viewmaker Networks: Learning Views for Unsupervised Representation Learning

Overview

Viewmaker Networks: Learning Views for Unsupervised Representation Learning

Alex Tamkin, Mike Wu, and Noah Goodman

Paper link: https://arxiv.org/abs/2010.07432

0) Background

Viewmaker networks are a new, more general method for self-supervised learning that enables pretraining with the same algorithm on a diverse range of different modalities—including images, speech, and sensor data.

Viewmaker networks learn a family of data transformations with a generative model, as opposed to prior approaches which use data transformations developed by domain experts through trial and error.

Viewmakers are trained adversarially with respect to the pretraining loss—this means they are compatible with many different pretraining objectives. We present results for SimCLR and InstDisc, but viewmakers are compatible with any view-based objective, including MoCo, BYOL, SimSiam, and SwAV.

Some example distortions learned for images (each frame is generated with a different random noise input to the viewmaker)

Image

1) Install Dependencies

We used the following PyTorch libraries for CUDA 10.1; you may have to adapt for your own CUDA version:

pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Install other dependencies:

pip install -r requirements.txt

2) Running experiments

Start by running

source init_env.sh

Now, you can run experiments for the different modalities as follows:

scripts/run_sensor.py config/sensor/pretrain_viewmaker_pamap2_simclr.json --gpu-device 0

This command runs viewmaker pretraining on the Pamap2 wearable sensor dataset using GPU #0. (If you have a multi-GPU node, you can specify other GPUs.)

The scripts directory holds:

  • run_image.py: for pretraining and running linear evaluation on CIFAR-10
  • run_meta_transfer.py: for running linear evaluation on a range of transfer datasets, including many from MetaDataset
  • run_audio.py: for pretraining on LibriSpeech and running linear evaluation on a range of transfer datasets
  • run_sensor.py: for pretraining on Pamap2 and running transfer, supervised, and semi-supervised learning on different splits of Pamap2
  • eval_cifar10_c.py: for evaluating a linear evaluation model on the CIFAR-10-C dataset for assessing robustness to common corruptions

The config directory holds configuration files for the different experiments, specifying the hyperparameters from each experiment. The first field in every config file is exp_base which specifies the base directory to save experiment outputs, which you should change for your own setup.

You are responsible for downloading the datasets. Update the paths in src/datasets/root_paths.py.

Training curves and other metrics are logged using wandb.ai

Owner
Alex Tamkin
PhD at @stanfordnlp
Alex Tamkin
Text-to-Image generation

Generate vivid Images for Any (Chinese) text CogView is a pretrained (4B-param) transformer for text-to-image generation in general domain. Read our p

THUDM 1.3k Dec 29, 2022
[ECCV2020] Content-Consistent Matching for Domain Adaptive Semantic Segmentation

[ECCV20] Content-Consistent Matching for Domain Adaptive Semantic Segmentation This is a PyTorch implementation of CCM. News: GTA-4K list is available

Guangrui Li 88 Aug 25, 2022
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

Non-Metric Space Library (NMSLIB) Important Notes NMSLIB is generic but fast, see the results of ANN benchmarks. A standalone implementation of our fa

2.9k Jan 04, 2023
Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)"

Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)" which introduces a new class of deep generative models that gene

Guan-Horng Liu 43 Jan 03, 2023
Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Multi-modal Interaction Graph Convolutioal Network for Temporal Language Localization in Videos Official implementation for Multi-Modal Interaction Gr

Zongmeng Zhang 15 Oct 18, 2022
Deep motion transfer

animation-with-keypoint-mask Paper The right most square is the final result. Softmax mask (circles): \ Heatmap mask: \ conda env create -f environmen

9 Nov 01, 2022
Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Text-AutoAugment (TAA) This repository contains the code for our paper Text AutoAugment: Learning Compositional Augmentation Policy for Text Classific

LancoPKU 105 Jan 03, 2023
A collection of resources and papers on Diffusion Models, a darkhorse in the field of Generative Models

This repository contains a collection of resources and papers on Diffusion Models and Score-based Models. If there are any missing valuable resources

5.1k Jan 08, 2023
Implementation of ViViT: A Video Vision Transformer

ViViT: A Video Vision Transformer Unofficial implementation of ViViT: A Video Vision Transformer. Notes: This is in WIP. Model 2 is implemented, Model

Rishikesh (ऋषिकेश) 297 Jan 06, 2023
Weakly Supervised End-to-End Learning (NeurIPS 2021)

WeaSEL: Weakly Supervised End-to-end Learning This is a PyTorch-Lightning-based framework, based on our End-to-End Weak Supervision paper (NeurIPS 202

Auton Lab, Carnegie Mellon University 131 Jan 06, 2023
Fast (simple) spectral synthesis and emission-line fitting of DESI spectra.

FastSpecFit Introduction This repository contains code and documentation to perform fast, simple spectral synthesis and emission-line fitting of DESI

5 Aug 02, 2022
PyTorch implementation of PNASNet-5 on ImageNet

PNASNet.pytorch PyTorch implementation of PNASNet-5. Specifically, PyTorch code from this repository is adapted to completely match both my implemetat

Chenxi Liu 314 Nov 25, 2022
PyTorch implementation of Off-policy Learning in Two-stage Recommender Systems

Off-Policy-2-Stage This repo provides a PyTorch implementation of the MovieLens experiments for the following paper: Off-policy Learning in Two-stage

Jiaqi Ma 25 Dec 12, 2022
Depth image based mouse cursor visual haptic

Depth image based mouse cursor visual haptic How to run it. Install pyqt5. Install python modules pip install Pillow pip install numpy For illustrati

Xiong Jie 17 Dec 20, 2022
Implementation for Shape from Polarization for Complex Scenes in the Wild

sfp-wild Implementation for Shape from Polarization for Complex Scenes in the Wild project website | paper Code and dataset will be released soon. Int

Chenyang LEI 41 Dec 23, 2022
Clockwork Variational Autoencoder

Clockwork Variational Autoencoders (CW-VAE) Vaibhav Saxena, Jimmy Ba, Danijar Hafner If you find this code useful, please reference in your paper: @ar

Vaibhav Saxena 35 Nov 06, 2022
Display, filter and search log messages in your terminal

Textualog Display, filter and search logging messages in the terminal. This project is powered by rich and textual. Some of the ideas and code in this

Rik Huygen 24 Dec 10, 2022
Caffe implementation for Hu et al. Segmentation for Natural Language Expressions

Segmentation from Natural Language Expressions This repository contains the Caffe reimplementation of the following paper: R. Hu, M. Rohrbach, T. Darr

10 Jul 27, 2021
[Nature Machine Intelligence' 21] "Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence"

[UCADI] COVID-19 Diagnosis With Federated Learning Intro We developed a Federated Learning (FL) Framework for global researchers to collaboratively tr

HUST EIC AI-LAB 30 Dec 12, 2022
Exploring Versatile Prior for Human Motion via Motion Frequency Guidance (3DV2021)

Exploring Versatile Prior for Human Motion via Motion Frequency Guidance This is the codebase for video-based human motion reconstruction in human-mot

Jiachen Xu 5 Jul 14, 2022