AsymmetricGAN - Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Overview

License CC BY-NC-SA 4.0 Python 3.6 Packagist Last Commit Maintenance Contributing Ask Me Anything !

AsymmetricGAN for Image-to-Image Translation

AsymmetricGAN Framework for Multi-Domain Image-to-Image Translation

UN_Framework

AsymmetricGAN Framework for Hand Gesture-to-Gesture Translation

SU_Framework

Conference paper | Extended paper | Project page | Slides | Poster

Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.
Hao Tang1, Dan Xu2, Wei Wang3, Yan Yan4 and Nicu Sebe1.
1University of Trento, Italy, 2University of Oxford, UK, 3EPFL, Switzerland, 4Texas State University, USA.
In ACCV 2018 (Oral).
The repository offers the official implementation of our paper in PyTorch.

License

Copyright (C) 2019 University of Trento, Italy.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use, please contact [email protected].

Installation

Clone this repo.

git clone https://github.com/Ha0Tang/AsymmetricGAN
cd AsymmetricGAN/

This code requires PyTorch 0.4.1 and python 3.6+. Please install dependencies by

pip install -r requirements.txt (for pip users)

or

./scripts/conda_deps.sh (for Conda users)

To reproduce the results reported in the paper, you would need two NVIDIA GeForce GTX 1080 Ti GPUs or two NVIDIA TITAN Xp GPUs.

Dataset Preparation

For hand gesture-to-gesture translation task, we use NTU Hand Digit and Creative Senz3D datasets. Both datasets must be downloaded beforehand. Please download them on the respective webpages. In addition, follow GestureGAN to prepare both datasets. Please cite their papers if you use the data.

Preparing NTU Hand Digit Dataset. The dataset can be downloaded in this paper. After downloading it we adopt OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for Creative Senz3D dataset can be downloaded from here.

Preparing Creative Senz3D Dataset. The dataset can be downloaded here. After downloading it we adopt OpenPose to generate hand skeletons and use them as training data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for Creative Senz3D dataset can be downloaded from here.

Preparing Your Own Datasets. Each training sample in the dataset will contain {Ix,Iy,Cx,Cy}, where Ix=image x, Iy=image y, Cx=Controllable structure of image x, and Cy=Controllable structure of image y. Of course, you can use AsymmetricGAN for your own datasets and tasks.

Generating Images Using Pretrained Model

Once the dataset is ready. The result images can be generated using pretrained models.

  1. You can download a pretrained model (e.g. ntu_asymmetricgan) with the following script:
bash ./scripts/download_asymmetricgan_model.sh ntu_asymmetricgan

The pretrained model is saved at ./checkpoints/[type]_pretrained. Check here for all the available AsymmetricGAN models.

  1. Generate images using the pretrained model.

For NTU Dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D Dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

If you are running on CPU mode, change --gpu_ids 0 to --gpu_ids -1. Note that testing requires a lot of time and large amount of disk storage space. If you don't have enough space, append --saveDisk on the command line.

  1. The outputs images are stored at ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in the directory.

Training New Models

New models can be trained with the following commands.

  1. Prepare dataset.

  2. Train.

For NTU dataset:

export CUDA_VISIBLE_DEVICES=3,4;
python train.py --dataroot ./datasets/ntu \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

For Senz3D dataset:

export CUDA_VISIBLE_DEVICES=5,7;
python train.py --dataroot ./datasets/senz3d \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

There are many options you can specify. Please use python train.py --help. The specified options are printed to the console. To specify the number of GPUs to utilize, use export CUDA_VISIBLE_DEVICES=[GPU_ID].

To view training results and loss plots on local computers, set --display_id to a non-zero value and run python -m visdom.server on a new terminal and click the URL http://localhost:8097. On a remote server, replace localhost with your server's name, such as http://server.trento.cs.edu:8097.

Can I continue/resume my training?

To fine-tune a pre-trained model, or resume the previous training, use the --continue_train --which_epoch --epoch_count flag. The program will then load the model based on epoch you set in --which_epoch . Set --epoch_count to specify a different starting epoch count.

Testing

Testing is similar to testing pretrained models.

For NTU dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

Use --how_many to specify the maximum number of images to generate. By default, it loads the latest checkpoint. It can be changed using --which_epoch.

Code Structure

  • train.py, test.py: the entry point for training and testing.
  • models/asymmetricgan_model.py: creates the networks, and compute the losses.
  • models/networks/: defines the architecture of all models for GestureGAN.
  • options/: creates option lists using argparse package.
  • data/: defines the class for loading images and controllable structures.

Evaluation Code

We use several metrics to evaluate the quality of the generated images:

To Do List

  • Upload supervised AsymmetricGAN code for hand gesture-to-gesture translation
  • Upload unsupervised AsymmetricGAN code for multi-domain image-to-image translation: code

Citation

If you use this code for your research, please cite our papers.

@article{tang2019asymmetric,
  title={Asymmetric Generative Adversarial Networks for Image-to-Image Translation},
  author={Hao Tang and Dan Xu and Hong Liu and Nicu Sebe},
  journal={arXiv preprint arXiv:1912.06931},
  year={2019}
}

@inproceedings{tang2018dual,
  title={Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation},
  author={Tang, Hao and Xu, Dan and Wang, Wei and Yan, Yan and Sebe, Nicu},
  booktitle={ACCV},
  year={2018}
}

Acknowledgments

This source code is inspired by Pix2pix and GestureGAN.

Related Projects

Contributions

If you have any questions/comments/bug reports, feel free to open a github issue or pull a request or e-mail to the author Hao Tang ([email protected]).

Owner
Hao Tang
To develop a complete mind: Study the science of art; Study the art of science. Learn how to see. Realize that everything connects to everything else.
Hao Tang
A selection of State Of The Art research papers (and code) on human locomotion (pose + trajectory) prediction (forecasting)

A selection of State Of The Art research papers (and code) on human trajectory prediction (forecasting). Papers marked with [W] are workshop papers.

Karttikeya Manglam 40 Nov 18, 2022
Code for How To Create A Fully Automated AI Based Trading System With Python

AI Based Trading System This code works as a boilerplate for an AI based trading system with yfinance as data source and RobinHood or Alpaca as broker

Rubén 196 Jan 05, 2023
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper

Emanuel Metzenthin 3 Dec 16, 2022
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

Akshita Gupta 54 Nov 21, 2022
A set of tools for creating and testing machine learning features, with a scikit-learn compatible API

Feature Forge This library provides a set of tools that can be useful in many machine learning applications (classification, clustering, regression, e

Machinalis 380 Nov 05, 2022
Robot Hacking Manual (RHM). From robotics to cybersecurity. Papers, notes and writeups from a journey into robot cybersecurity.

RHM: Robot Hacking Manual Download in PDF RHM v0.4 ┃ Read online The Robot Hacking Manual (RHM) is an introductory series about cybersecurity for robo

Víctor Mayoral Vilches 233 Dec 30, 2022
some classic model used to segment the medical images like CT、X-ray and so on

github_project This is a project for medical image segmentation. This project includes common medical image segmentation models such as U-net, FCN, De

2 Mar 30, 2022
Memory Efficient Attention (O(sqrt(n)) for Jax and PyTorch

Memory Efficient Attention This is unofficial implementation of Self-attention Does Not Need O(n^2) Memory for Jax and PyTorch. Implementation is almo

Amin Rezaei 126 Dec 27, 2022
PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Progressive Growing of GANs inference in PyTorch with CelebA training snapshot Description This is an inference sample written in PyTorch of the origi

320 Nov 21, 2022
Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [pdf] The official repository for Self-Supervised Pre-Training for Transfo

Hao Luo 116 Jan 04, 2023
Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore

[AI6122] Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course

HT. Li 5 Sep 12, 2022
PyTorch implementation of Deformable Convolution

Deformable Convolutional Networks in PyTorch This repo is an implementation of Deformable Convolution. Ported from author's MXNet implementation. Buil

411 Dec 16, 2022
The hippynn python package - a modular library for atomistic machine learning with pytorch.

The hippynn python package - a modular library for atomistic machine learning with pytorch. We aim to provide a powerful library for the training of a

Los Alamos National Laboratory 37 Dec 29, 2022
A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run.

Minimal Hand A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run. This project provides the

Yuxiao Zhou 824 Jan 07, 2023
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Self-Supervised Document Similarity Ranking (SDR) via Contextualized Language Models and Hierarchical Inference This repo is the implementation for SD

Microsoft 36 Nov 28, 2022
Plenoxels: Radiance Fields without Neural Networks, Code release WIP

Plenoxels: Radiance Fields without Neural Networks Alex Yu*, Sara Fridovich-Keil*, Matthew Tancik, Qinhong Chen, Benjamin Recht, Angjoo Kanazawa UC Be

Alex Yu 2.3k Dec 30, 2022
LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image.

This project is based on ultralytics/yolov3. LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image. The related paper is avai

26 Dec 13, 2022
vit for few-shot classification

Few-Shot ViT Requirements PyTorch (= 1.9) TorchVision timm (latest) einops tqdm numpy scikit-learn scipy argparse tensorboardx Pretrained Checkpoints

Martin Dong 26 Nov 30, 2022
🇰🇷 Text to Image in Korean

KoDALLE Utilizing pretrained language model’s token embedding layer and position embedding layer as DALLE’s text encoder. Background Training DALLE mo

HappyFace 74 Sep 22, 2022
A basic reminder tool written in Python.

A simple Python Reminder Here's a basic reminder tool written in Python that speaks to the user and sends a notification. Run pip3 install pyttsx3 w

Sachit Yadav 4 Feb 05, 2022