Code for generating a single image pretraining dataset

Overview

Single Image Pretraining of Visual Representations

As shown in the paper

A critical analysis of self-supervision, or what we can learn from a single image, Asano et al. ICLR 2020

Example images from our dataset

Why?

Self-supervised representation learning has made enormous strides in recent years. In this paper we show that a large part why self-supervised learning works are the augmentations. We show this by pretraining various SSL methods on a dataset generated solely from augmenting a single source image and find that various methods still pretrain quite well and even yield representations as strong as using the whole dataset for the early layers of networks.

Abstract

We look critically at popular self-supervision techniques for learning deep convolutional neural networks without manual labels. We show that three different and representative methods, BiGAN, RotNet and DeepCluster, can learn the first few layers of a convolutional network from a single image as well as using millions of images and manual labels, provided that strong data augmentation is used. However, for deeper layers the gap with manual supervision cannot be closed even if millions of unlabelled images are used for training. We conclude that: (1) the weights of the early layers of deep networks contain limited information about the statistics of natural images, that (2) such low-level statistics can be learned through self-supervision just as well as through strong supervision, and that (3) the low-level statistics can be captured via synthetic transformations instead of using a large image dataset.

Usage

Here we provide the code for generating a dataset from using just a single source image. Since the publication, I have slightly modified the dataset generation script to make it easier to use. Dependencies: torch, torchvision, joblib, PIL, numpy, any recent version should do.

Run like this:

python make_dataset_single.py --imgpath images/ameyoko.jpg --targetpath ./out/ameyoko_dataset

Here is the full description of the usage:

usage: make_dataset_single.py [-h] [--img_size IMG_SIZE]
                              [--batch_size BATCH_SIZE] [--num_imgs NUM_IMGS]
                              [--threads THREADS] [--vflip] [--deg DEG]
                              [--shear SHEAR] [--cropfirst]
                              [--initcrop INITCROP] [--scale SCALE SCALE]
                              [--randinterp] [--imgpath IMGPATH] [--debug]
                              [--targetpath TARGETPATH]

Single Image Pretraining, Asano et al. 2020

optional arguments:
  -h, --help            show this help message and exit
  --img_size IMG_SIZE
  --batch_size BATCH_SIZE
  --num_imgs NUM_IMGS   number of images to be generated
  --threads THREADS     how many CPU threads to use for generation
  --vflip               use vflip?
  --deg DEG             max rot angle
  --shear SHEAR         max shear angle
  --cropfirst           usage of initial crop to not focus too much on center
  --initcrop INITCROP   initial crop size relative to image
  --scale SCALE SCALE   data augmentation inverse scale
  --randinterp          For RR crops: use random interpolation method or just bicubic?
  --imgpath IMGPATH
  --debug
  --targetpath TARGETPATH

Reference

If you find this code/idea useful, please consider citing our paper:

@inproceedings{asano2020a,
title={A critical analysis of self-supervision, or what we can learn from a single image},
author={Asano, Yuki M. and Rupprecht, Christian and Vedaldi, Andrea},
booktitle={International Conference on Learning Representations (ICLR)},
year={2020},
}
Owner
Yuki M. Asano
I'm a PhD student in the Visual Geometry Group at the University of Oxford. I work with @chrirupp and @vedaldi.
Yuki M. Asano
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Self-Supervised Vision Transformers with DINO PyTorch implementation and pretrained models for DINO. For details, see Emerging Properties in Self-Supe

Facebook Research 4.2k Jan 03, 2023
ESGD-M - A stochastic non-convex second order optimizer, suitable for training deep learning models, for PyTorch

ESGD-M - A stochastic non-convex second order optimizer, suitable for training deep learning models, for PyTorch

Katherine Crowson 53 Dec 29, 2022
DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral

Generative Image Inpainting An open source framework for generative image inpainting task, with the support of Contextual Attention (CVPR 2018) and Ga

2.9k Dec 16, 2022
Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel order of RGB and BGR. Simple Channel Converter for ONNX.

scc4onnx Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel

Katsuya Hyodo 16 Dec 22, 2022
Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

Deep Image Search - AI-Based Image Search Engine Deep Image Search is an AI-based image search engine that includes deep transfer learning features Ex

139 Jan 01, 2023
Cockpit is a visual and statistical debugger specifically designed for deep learning.

Cockpit: A Practical Debugging Tool for Training Deep Neural Networks

Felix Dangel 421 Dec 29, 2022
Hierarchical probabilistic 3D U-Net, with attention mechanisms (β€”π˜ˆπ˜΅π˜΅π˜¦π˜―π˜΅π˜ͺ𝘰𝘯 𝘜-π˜•π˜¦π˜΅, π˜šπ˜Œπ˜™π˜¦π˜΄π˜•π˜¦π˜΅) and a nested decoder structure with deep supervision (β€”π˜œπ˜•π˜¦π˜΅++).

Hierarchical probabilistic 3D U-Net, with attention mechanisms (β€”π˜ˆπ˜΅π˜΅π˜¦π˜―π˜΅π˜ͺ𝘰𝘯 𝘜-π˜•π˜¦π˜΅, π˜šπ˜Œπ˜™π˜¦π˜΄π˜•π˜¦π˜΅) and a nested decoder structure with deep supervision (β€”π˜œπ˜•π˜¦π˜΅++). Built in TensorFlow 2.5. Configured for vox

Diagnostic Image Analysis Group 32 Dec 08, 2022
Code for ACL'2021 paper WARP πŸŒ€ Word-level Adversarial ReProgramming

Code for ACL'2021 paper WARP πŸŒ€ Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification.

YerevaNN 75 Nov 06, 2022
A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch

A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch The official pytorch implementation of the paper "Towards Faster and Stabilize

Bingchen Liu 455 Jan 08, 2023
Indices Matter: Learning to Index for Deep Image Matting

IndexNet Matting This repository includes the official implementation of IndexNet Matting for deep image matting, presented in our paper: Indices Matt

Hao Lu 357 Nov 26, 2022
The implementation for "Comprehensive Knowledge Distillation with Causal Intervention".

Comprehensive Knowledge Distillation with Causal Intervention This repository is a PyTorch implementation of "Comprehensive Knowledge Distillation wit

Xiang Deng 10 Nov 03, 2022
[ICCV 2021] Released code for Causal Attention for Unbiased Visual Recognition

CaaM This repo contains the codes of training our CaaM on NICO/ImageNet9 dataset. Due to my recent limited bandwidth, this codebase is still messy, wh

Wang Tan 66 Dec 31, 2022
Meta Self-learning for Multi-Source Domain Adaptation: A Benchmark

Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods

CVSM Group - email: <a href=[email protected]"> 188 Dec 12, 2022
Clean Machine Learning, a Coding Kata

Kata: Clean Machine Learning From Dirty Code First, open the Kata in Google Colab (or else download it) You can clone this project and launch jupyter-

Neuraxio 13 Nov 03, 2022
Simple tools for logging and visualizing, loading and training

TNT TNT is a library providing powerful dataloading, logging and visualization utilities for Python. It is closely integrated with PyTorch and is desi

1.5k Jan 02, 2023
A DCGAN to generate anime faces using custom mined dataset

Anime-Face-GAN-Keras A DCGAN to generate anime faces using custom dataset in Keras. Dataset The dataset is created by crawling anime database websites

Pavitrakumar P 190 Jan 03, 2023
CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

CvT2DistilGPT2 Improving Chest X-Ray Report Generation by Leveraging Warm-Starting This repository houses the implementation of CvT2DistilGPT2 from [1

The Australian e-Health Research Centre 21 Dec 28, 2022
Code for Environment Inference for Invariant Learning (ICML 2020 UDL Workshop Paper)

Environment Inference for Invariant Learning This code accompanies the paper Environment Inference for Invariant Learning, which appears at ICML 2021.

Elliot Creager 40 Dec 09, 2022
Website which uses Deep Learning to generate horror stories.

Creepypasta - Text Generator Website which uses Deep Learning to generate horror stories. View Demo Β· View Website Repo Β· Report Bug Β· Request Feature

Dhairya Sharma 5 Oct 14, 2022
KSAI Lite is a deep learning inference framework of kingsoft, based on tensorflow lite

KSAI Lite is a deep learning inference framework of kingsoft, based on tensorflow lite

80 Dec 27, 2022