Code to generate datasets used in "How Useful is Self-Supervised Pretraining for Visual Tasks?"

Overview

Synthetic dataset rendering

Framework for producing the synthetic datasets used in:

How Useful is Self-Supervised Pretraining for Visual Tasks?
Alejandro Newell and Jia Deng. CVPR, 2020. arXiv:2003.14323

Experiment code can be found here.

This is a general purpose synthetic setting supporting single-object or multi-object images providing annotations for object classification, object pose estimation, segmentation, and depth estimation.

Setup

Download and set up Blender 2.80 (this code has not been tested on more recent Blender versions).

Blender uses its own Python, to which we need to add an extra package. In the Blender installation, find the python directory and run:

cd path/to/blender/2.80/python/bin
./python3.7m -m ensure pip
./pip3 install gin_config

For distributed rendering and additional dataset prep, use your own Python installation (not the Blender version). Everything was tested with Python 3.7 and the following extra packages:

sudo apt install libopenexr-dev
pip install ray ray[tune] h5py openexr scikit-image

External data

Download ShapeNetCore.v2 and DTD.

By default, it is assumed external datasets will be placed in syn_benchmark/datasets (e.g. syn_benchmark/datasets/ShapeNetCore.v2). If this is not the case, change any paths as necessary in paths.py.

Dataset Generation

Try a test run with:

blender --background --python render.py -- -d test_dataset

The argument -d, --dataset_name specifies the output directory which will be placed in the directory defined by pahs.DATA_DIR. Dataset settings can be modified either by selecting a gin config file (-g) or by modifying parameters (-p), for example:

blender --background --python render.py -- -g render_multi
blender --background --python render.py -- -p "material.use_texture = False" "object.random_viewpoint = 0"
blender --background --python render.py -- -g render_multi -p "batch.num_samples = 100"

Manual arguments passed in through -p will override those in the provided gin file. Please check out config/render_single.gin to see what options can be modified.

Distributed rendering

To scale up dataset creation, rendering is split into smaller jobs that can be sent out to individual workers for parallelization on a single machine or on a cluster. The library Ray is used to manage workers automatically. This allows large-scale distributed, parallel processes which are easy to restart in case anything crashes.

Calling python distributed_render.py will by default produce small versions of the 12 single-object datasets used in the paper. Arguments are available to control the overall dataset size and to interface with Ray. The script can be modified as needed to produce individual datasets or to modify dataset properties (e.g. texture, lighting, etc).

To produce multi-object images with depth and segmentation ground truth, add the argument --is_multi.

Further processing

After running the rendering script, you will be left with a large number of individual files containing rendered images and metadata pertaining to class labels and other scene information. Before running the main experiment code it is important that this data is preprocessed.

There are two key steps:

  • consolidation of raw data to HDF5 datasets: python preprocess_data.py -d test_dataset -f
  • image resizing and preprocessing: python preprocess_data.py -d test_dataset -p

If working with EXR images produced for segmentation/depth data make sure to add the argument -e.

-f, --to_hdf5: The first step will move all image files and metadata into HDF5 dataset files.

An important step that occurs here is conversion of EXR data to PNG data. The EXR output from Blender contains both the rendered image and corresponding depth, instance segmentation, and semantic segmentation data. After running this script, the rendered image is stored as one PNG and the depth and segmentation channels are concatenated into another PNG image.

After this step, I recommend removing the original small files if disk space is a concern, all raw data is fully preserved in the img_XX.h5 files. Note, the data is stored as an encoded PNG, if you want to read the image into Python you can do the following:

f = h5py.File('path/to/your/dataset/imgs_00.h5', 'r')
img_idx = 0
png_data = f['png_images'][img_idx]

img = imageio.imread(io.BytesIO(png_data))
# or alternatively
img = util.img_read_and_resize(png_data)

-p, --preprocess: Once the raw data has been moved into HDF5 files, it can be quickly processed for use in experiments. This preprocessing simply takes care of steps that would otherwise be performed over and over again during training such as image resizing and normalization. One of the more expensive steps that is taken care of here is conversion to LAB color space.

This preprocessing step prepares a single HDF5 file which ready to be used with the experiment code. Unlike the files created in the previous step, this data has been processed and some information may be lost from the original images especially if they have been resized to a lower resolution.

Owner
Princeton Vision & Learning Lab
Princeton Vision & Learning Lab
Wav2Vec for speech recognition, classification, and audio classification

Soxan در زبان پارسی به نام سخن This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your

Mehrdad Farahani 140 Dec 15, 2022
PyTorch implementation of InstaGAN: Instance-aware Image-to-Image Translation

InstaGAN: Instance-aware Image-to-Image Translation Warning: This repo contains a model which has potential ethical concerns. Remark that the task of

Sangwoo Mo 827 Dec 29, 2022
MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition;

MoViNet-pytorch Pytorch unofficial implementation of MoViNets: Mobile Video Networks for Efficient Video Recognition. Authors: Dan Kondratyuk, Liangzh

189 Dec 20, 2022
A resource for learning about ML, DL, PyTorch and TensorFlow. Feedback always appreciated :)

A resource for learning about ML, DL, PyTorch and TensorFlow. Feedback always appreciated :)

Aladdin Persson 4.7k Jan 08, 2023
PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transfromers.

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
PyContinual (An Easy and Extendible Framework for Continual Learning)

PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read

Zixuan Ke 176 Jan 05, 2023
Relative Uncertainty Learning for Facial Expression Recognition

Relative Uncertainty Learning for Facial Expression Recognition The official implementation of the following paper at NeurIPS2021: Title: Relative Unc

35 Dec 28, 2022
The implementation of DeBERTa

DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis

Microsoft 1.2k Jan 06, 2023
This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Non-autoregressive Deep Learning-Based TTS Template This is a template for the Non-autoregressive TTS model. It contains Data Preprocessing Pipeline D

Keon Lee 13 Dec 05, 2022
Node Dependent Local Smoothing for Scalable Graph Learning

Node Dependent Local Smoothing for Scalable Graph Learning Requirements Environments: Xeon Gold 5120 (CPU), 384GB(RAM), TITAN RTX (GPU), Ubuntu 16.04

Wentao Zhang 15 Nov 28, 2022
HybVIO visual-inertial odometry and SLAM system

HybVIO A visual-inertial odometry system with an optional SLAM module. This is a research-oriented codebase, which has been published for the purposes

Spectacular AI 320 Jan 03, 2023
PyTorch implementation of HDN(Homography Decomposition Networks) for planar object tracking

Homography Decomposition Networks for Planar Object Tracking This project is the offical PyTorch implementation of HDN(Homography Decomposition Networ

CaptainHook 48 Dec 15, 2022
Hierarchical Few-Shot Generative Models

Hierarchical Few-Shot Generative Models Giorgio Giannone, Ole Winther This repo contains code and experiments for the paper Hierarchical Few-Shot Gene

Giorgio Giannone 6 Dec 12, 2022
Custom studies about block sparse attention.

Block Sparse Attention 研究总结 本人近半年来对Block Sparse Attention(块稀疏注意力)的研究总结(持续更新中)。按时间顺序,主要分为如下三部分: PyTorch 自定义 CUDA 算子——以矩阵乘法为例 基于 Triton 的 Block Sparse A

Chen Kai 2 Jan 09, 2022
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li, Ali Zare, Nima Mesgarani We pres

Aaron (Yinghao) Li 282 Jan 01, 2023
Self-supervised learning (SSL) is a method of machine learning

Self-supervised learning (SSL) is a method of machine learning. It learns from unlabeled sample data. It can be regarded as an intermediate form between supervised and unsupervised learning.

Ashish Patel 4 May 26, 2022
Time Delayed NN implemented in pytorch

Pytorch Time Delayed NN Time Delayed NN implemented in PyTorch. Usage kernels = [(1, 25), (2, 50), (3, 75), (4, 100), (5, 125), (6, 150)] tdnn = TDNN

Daniil Gavrilov 79 Aug 04, 2022
Dark Finix: All in one hacking framework with almost 100 tools

Dark Finix - Hacking Framework. Dark Finix is a all in one hacking framework wit

Md. Nur habib 2 Feb 18, 2022
Artificial Intelligence playing minesweeper 🤖

AI playing Minesweeper ✨ Minesweeper is a single-player puzzle video game. The objective of the game is to clear a rectangular board containing hidden

Vaibhaw 8 Oct 17, 2022
Activity image-based video retrieval

Cross-modal-retrieval Our approach is focus on Activity Image-to-Video Retrieval (AIVR) task. The compared methods are state-of-the-art single modalit

BCMI 75 Oct 21, 2021