Code for generating a single image pretraining dataset

Overview

Single Image Pretraining of Visual Representations

As shown in the paper

A critical analysis of self-supervision, or what we can learn from a single image, Asano et al. ICLR 2020

Example images from our dataset

Why?

Self-supervised representation learning has made enormous strides in recent years. In this paper we show that a large part why self-supervised learning works are the augmentations. We show this by pretraining various SSL methods on a dataset generated solely from augmenting a single source image and find that various methods still pretrain quite well and even yield representations as strong as using the whole dataset for the early layers of networks.

Abstract

We look critically at popular self-supervision techniques for learning deep convolutional neural networks without manual labels. We show that three different and representative methods, BiGAN, RotNet and DeepCluster, can learn the first few layers of a convolutional network from a single image as well as using millions of images and manual labels, provided that strong data augmentation is used. However, for deeper layers the gap with manual supervision cannot be closed even if millions of unlabelled images are used for training. We conclude that: (1) the weights of the early layers of deep networks contain limited information about the statistics of natural images, that (2) such low-level statistics can be learned through self-supervision just as well as through strong supervision, and that (3) the low-level statistics can be captured via synthetic transformations instead of using a large image dataset.

Usage

Here we provide the code for generating a dataset from using just a single source image. Since the publication, I have slightly modified the dataset generation script to make it easier to use. Dependencies: torch, torchvision, joblib, PIL, numpy, any recent version should do.

Run like this:

python make_dataset_single.py --imgpath images/ameyoko.jpg --targetpath ./out/ameyoko_dataset

Here is the full description of the usage:

usage: make_dataset_single.py [-h] [--img_size IMG_SIZE]
                              [--batch_size BATCH_SIZE] [--num_imgs NUM_IMGS]
                              [--threads THREADS] [--vflip] [--deg DEG]
                              [--shear SHEAR] [--cropfirst]
                              [--initcrop INITCROP] [--scale SCALE SCALE]
                              [--randinterp] [--imgpath IMGPATH] [--debug]
                              [--targetpath TARGETPATH]

Single Image Pretraining, Asano et al. 2020

optional arguments:
  -h, --help            show this help message and exit
  --img_size IMG_SIZE
  --batch_size BATCH_SIZE
  --num_imgs NUM_IMGS   number of images to be generated
  --threads THREADS     how many CPU threads to use for generation
  --vflip               use vflip?
  --deg DEG             max rot angle
  --shear SHEAR         max shear angle
  --cropfirst           usage of initial crop to not focus too much on center
  --initcrop INITCROP   initial crop size relative to image
  --scale SCALE SCALE   data augmentation inverse scale
  --randinterp          For RR crops: use random interpolation method or just bicubic?
  --imgpath IMGPATH
  --debug
  --targetpath TARGETPATH

Reference

If you find this code/idea useful, please consider citing our paper:

@inproceedings{asano2020a,
title={A critical analysis of self-supervision, or what we can learn from a single image},
author={Asano, Yuki M. and Rupprecht, Christian and Vedaldi, Andrea},
booktitle={International Conference on Learning Representations (ICLR)},
year={2020},
}
Owner
Yuki M. Asano
I'm a PhD student in the Visual Geometry Group at the University of Oxford. I work with @chrirupp and @vedaldi.
Yuki M. Asano
PyTorch implementation of "Continual Learning with Deep Generative Replay", NIPS 2017

pytorch-deep-generative-replay PyTorch implementation of Continual Learning with Deep Generative Replay, NIPS 2017 Results Continual Learning on Permu

Junsoo Ha 127 Dec 14, 2022
Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.

Meet Fidesops: Privacy as Code for DSAR Orchestration A part of the greater Fides ecosystem. ⚡ Overview Fidesops (fee-dez-äps, combination of the Lati

Ethyca 44 Dec 06, 2022
HomeAssitant custom integration for dyson

HomeAssistant Custom Integration for Dyson This custom integration is still under development. This is a HA custom integration for dyson. There are se

Xiaonan Shen 232 Dec 31, 2022
Code to reproduce results from the paper "AmbientGAN: Generative models from lossy measurements"

AmbientGAN: Generative models from lossy measurements This repository provides code to reproduce results from the paper AmbientGAN: Generative models

Ashish Bora 87 Oct 19, 2022
code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

Facebook Research 94 Oct 26, 2022
University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN

Music-Sentiment-Transfer University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN Poster: Music Sentiment Transfer

Miles Sigel 2 Jan 24, 2022
Official code repository of the paper Learning Associative Inference Using Fast Weight Memory by Schlag et al.

Learning Associative Inference Using Fast Weight Memory This repository contains the offical code for the paper Learning Associative Inference Using F

Imanol Schlag 18 Oct 12, 2022
EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

EncT5 (Unofficial) Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks About Finetune T5 model for classification & r

Jangwon Park 34 Jan 01, 2023
A 10000+ hours dataset for Chinese speech recognition

WenetSpeech Official website | Paper A 10000+ Hours Multi-domain Chinese Corpus for Speech Recognition Download Please visit the official website, rea

310 Jan 03, 2023
An implementation of paper `Real-time Convolutional Neural Networks for Emotion and Gender Classification` with PaddlePaddle.

简介 通过PaddlePaddle框架复现了论文 Real-time Convolutional Neural Networks for Emotion and Gender Classification 中提出的两个模型,分别是SimpleCNN和MiniXception。利用 imdb_crop

8 Mar 11, 2022
An implementation of MobileFormer

MobileFormer An implementation of MobileFormer proposed by Yinpeng Chen, Xiyang Dai et al. Including [1] Mobile-Former proposed in:

slwang9353 62 Dec 28, 2022
Static Features Classifier - A static features classifier for Point-Could clusters using an Attention-RNN model

Static Features Classifier This is a static features classifier for Point-Could

ABDALKARIM MOHTASIB 1 Jan 25, 2022
This repository contains the DendroMap implementation for scalable and interactive exploration of image datasets in machine learning.

DendroMap DendroMap is an interactive tool to explore large-scale image datasets used for machine learning. A deep understanding of your data can be v

DIV Lab 33 Dec 30, 2022
TensorFlow implementation of Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction)

Barlow-Twins-TF This repository implements Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction) in TensorFlow and demonstrat

Sayak Paul 36 Sep 14, 2022
This repository contains all source code, pre-trained models related to the paper "An Empirical Study on GANs with Margin Cosine Loss and Relativistic Discriminator"

An Empirical Study on GANs with Margin Cosine Loss and Relativistic Discriminator This is a Pytorch implementation for the paper "An Empirical Study o

Cuong Nguyen 3 Nov 15, 2021
Official codebase used to develop Vision Transformer, MLP-Mixer, LiT and more.

Big Vision This codebase is designed for training large-scale vision models on Cloud TPU VMs. It is based on Jax/Flax libraries, and uses tf.data and

Google Research 701 Jan 03, 2023
Implementation of Wasserstein adversarial attacks.

Stronger and Faster Wasserstein Adversarial Attacks Code for Stronger and Faster Wasserstein Adversarial Attacks, appeared in ICML 2020. This reposito

21 Oct 06, 2022
EsViT: Efficient self-supervised Vision Transformers

Efficient Self-Supervised Vision Transformers (EsViT) PyTorch implementation for EsViT, built with two techniques: A multi-stage Transformer architect

Microsoft 352 Dec 25, 2022
K Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching (To appear in RA-L 2022)

KCP The official implementation of KCP: k Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching, accepted for p

Yu-Kai Lin 109 Dec 14, 2022
Anchor-free Oriented Proposal Generator for Object Detection

Anchor-free Oriented Proposal Generator for Object Detection Gong Cheng, Jiabao Wang, Ke Li, Xingxing Xie, Chunbo Lang, Yanqing Yao, Junwei Han, Intro

jbwang1997 56 Nov 15, 2022