Aligning Latent and Image Spaces to Connect the Unconnectable

Last update: Jan 03, 2023

About

This repo contains the official implementation of the paper "Aligning Latent and Image Spaces to Connect the Unconnectable". It is a GAN model that can generate infinite images of diverse and complex scenes.

[Project page] [Paper]

Installation

To install, run the following commands:

conda env create --file environment.yml --prefix ./env
conda activate ./env

Note: the tensorboard requirement is crucial, because otherwise upfirdn2d will not compile for some magical reason.

Training

To train the model, navigate to the project directory and run:

python infra/launch_local.py hydra.run.dir=. +experiment_name=my_experiment_name +dataset=dataset_name num_gpus=4

where dataset_name is the name of the dataset inside the data/ directory, without the .zip extension (you can override the paths in configs/main.yml). Make sure that data/dataset_name.zip exists and is a zip archive of a plain directory of images. See the StyleGAN2-ADA repo for additional data format details. This training command creates an experiment inside the experiments/ directory and copies the project files into it. This isolates the code that produces the model.
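Before launching a multi-GPU run, it can help to confirm that the dataset archive is where the training command expects it. The following is a minimal sketch, not part of this repo: the helper name check_dataset, the default data directory and the list of image extensions are all assumptions made for illustration.

```python
import zipfile
from pathlib import Path

# Assumption: these extensions cover the images in your dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def check_dataset(dataset_name, data_dir="data"):
    """Return the number of image entries in <data_dir>/<dataset_name>.zip.

    Hypothetical helper: raises FileNotFoundError if the archive is missing
    and ValueError if it contains no files with a known image extension.
    """
    zip_path = Path(data_dir) / f"{dataset_name}.zip"
    if not zip_path.is_file():
        raise FileNotFoundError(f"Expected dataset archive at {zip_path}")

    with zipfile.ZipFile(zip_path) as archive:
        image_names = [
            name
            for name in archive.namelist()
            if Path(name).suffix.lower() in IMAGE_EXTENSIONS
        ]

    if not image_names:
        raise ValueError(f"No image files found inside {zip_path}")
    return len(image_names)
```

For example, calling check_dataset("dataset_name") from the project directory returns the image count if data/dataset_name.zip is in place, and raises an error otherwise.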

Inference

The inference example can be found in notebooks/generate.ipynb.

Data format

We use the same data format as the original StyleGAN2-ADA repo: it is a zip of images. It is assumed that all data is located in a single directory, specified in configs/main.yml. Put your datasets as zip archives into the data/ directory.
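If your images currently sit in an ordinary folder, one way to produce such an archive is with Python's standard zipfile module. This is only a sketch under stated assumptions: the function name pack_images and the extension list are hypothetical, and it assumes that a flat, uncompressed archive of image files is acceptable. The StyleGAN2-ADA repo remains the reference for the exact format.

```python
import zipfile
from pathlib import Path

# Assumption: these extensions cover the images in your dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def pack_images(image_dir, zip_path):
    """Pack the image files directly under image_dir into a flat zip archive.

    Hypothetical helper: returns the number of images written and raises
    ValueError if image_dir contains no files with a known image extension.
    """
    image_dir = Path(image_dir)
    zip_path = Path(zip_path)

    images = sorted(
        path
        for path in image_dir.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
    if not images:
        raise ValueError(f"No image files found in {image_dir}")

    zip_path.parent.mkdir(parents=True, exist_ok=True)
    # ZIP_STORED skips compression; PNG and JPEG files are already compressed.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for image in images:
            # arcname keeps only the file name, so the archive stays flat.
            archive.write(image, arcname=image.name)
    return len(images)
```

For example, pack_images("my_images", "data/dataset_name.zip") would write the archive that the training command looks for when run with +dataset=dataset_name, where my_images is a placeholder for your own folder.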

Pretrained checkpoints

We provide checkpoints for the following datasets:

LHQ 1024x1024 with FID = 7.8. Note: this checkpoint has a patch size of 1024x512, i.e. the image is generated in just 2 halves.

License

The project is based on the StyleGAN2-ADA repo developed by NVIDIA. I am not a lawyer, but I suppose that the NVIDIA license applies to this project as well.

Owner

Ivan Skorokhodov