SAVI2I: Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors

Last update: Dec 30, 2022

[Paper] [Project Website]

PyTorch implementation of SAVI2I. We propose a simple yet effective signed attribute vector (SAV) that facilitates continuous translation on diverse mapping paths across multiple domains.
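To make the idea of continuous translation concrete at the vector level, the sketch below linearly interpolates between a source and a target attribute vector. It is only an illustration under stated assumptions: the function name `interpolate_attributes` and the toy 4-dimensional vectors are hypothetical and not taken from this repository, and plain linear interpolation stands in for whatever the model does internally. In the real pipeline the vectors would come from a learned encoder and each intermediate vector would be decoded by the generator.

```python
from typing import List, Sequence


def interpolate_attributes(
    z_src: Sequence[float], z_tgt: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` vectors moving linearly from z_src to z_tgt (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_src) != len(z_tgt):
        raise ValueError("vectors must have the same length")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight in [0, 1]
        path.append([(1.0 - t) * s + t * g for s, g in zip(z_src, z_tgt)])
    return path


# Toy vectors with hypothetical values (not produced by the model).
z_source = [-1.0, -0.5, -2.0, -0.25]
z_target = [1.0, 0.5, 2.0, 0.25]
path = interpolate_attributes(z_source, z_target, steps=5)
```

Each entry of `path` is one intermediate attribute vector; the first equals `z_source` and the last equals `z_target`.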
For more video results, please see our webpage.
Contact: Qi Mao ([email protected])

Paper

Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors
Qi Mao, Hsin-Ying Lee, Hung-Yu Tseng, Jia-Bin Huang, Siwei Ma, and Ming-Hsuan Yang
In arXiv 2020

Citation

If you find this work useful for your research, please cite our paper:

    @article{mao2020continuous,
      author       = "Mao, Qi and Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Ma, Siwei and Yang, Ming-Hsuan",
      title        = "Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors",
      journal    = "arXiv preprint 2011.01215",
      year         = "2020"
    }

Quick Start

Prerequisites

Linux or Windows
Python 3+
We suggest using two P100 16GB GPUs or one V100 32GB GPU.

Install

Clone this repo:

git clone https://github.com/HelenMao/SAVI2I.git
cd SAVI2I

This code requires PyTorch 0.4.0+ and Python 3+. Install the dependencies with:

conda create -n SAVI2I python=3.6
source activate SAVI2I
pip install -r requirements.txt

Training Datasets

Download the datasets for each task into the dataset folder:

./datasets

Style translation: Yosemite (summer <-> winter) and Photo2Artwork (Photo, Monet, Van Gogh and Ukiyo-e)

You can follow the CycleGAN dataset instructions to download the Yosemite and Photo2Artwork datasets.

Shape-variation translation: CelebA-HQ (Male <-> Female) and AFHQ (Cat, Dog and WildLife)

We split CelebA-HQ into male and female domains according to the annotated label and fine-tune the images manually.

You can follow the StarGAN-v2 dataset instructions to download the CelebA-HQ and AFHQ datasets.
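The label-based CelebA-HQ split described above could be sketched roughly as follows. This is a hedged illustration, not the script the authors used: the function name `split_male_female` is hypothetical, and the annotation layout (an image count on the first line, attribute names on the second, then one row per image with values 1 or -1) is an assumption based on the CelebA-style attribute files. The manual clean-up step mentioned above is not covered.

```python
from typing import Dict, List


def split_male_female(annotation_text: str, attribute: str = "Male") -> Dict[str, List[str]]:
    """Group image file names by the value of one attribute column.

    Assumed layout: line 1 is the image count, line 2 lists the attribute
    names, and each remaining line is `<file> <v1> <v2> ...` where every
    value is 1 (attribute present) or -1 (attribute absent).
    """
    lines = [ln for ln in annotation_text.splitlines() if ln.strip()]
    names = lines[1].split()
    col = names.index(attribute)  # column of the chosen attribute
    domains: Dict[str, List[str]] = {"male": [], "female": []}
    for row in lines[2:]:
        parts = row.split()
        filename, values = parts[0], parts[1:]
        key = "male" if int(values[col]) == 1 else "female"
        domains[key].append(filename)
    return domains


# Tiny made-up annotation snippet, for demonstration only.
sample = """3
Eyeglasses Male Smiling
0.jpg -1 1 1
1.jpg -1 -1 1
2.jpg 1 1 -1
"""
split = split_male_female(sample)
```

The two resulting lists can then be used to copy or link the images into one folder per domain.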

Training

Notes

For low-level style translation tasks, we suggest setting --type=1 to use the corresponding network architecture.
For shape-variation translation tasks, we suggest setting --type=0 to use the corresponding network architecture.

Yosemite

python train.py --dataroot ./datasets/Yosemite/ --phase train --type 1 --name Yosemite --n_ep 700 --n_ep_decay 500 --lambda_r1 10 --lambda_mmd 1 --num_domains 2

Photo2artwork

python train.py --dataroot ./datasets/Photo2artwork/ --phase train --type 1 --name Photo2artwork --n_ep 100 --n_ep_decay 0 --lambda_r1 10 --lambda_mmd 1 --num_domains 4

CelebAHQ

python train.py --dataroot ./datasets/CelebAHQ/ --phase train --type 0 --name CelebAHQ --n_ep 30 --n_ep_decay 0 --lambda_r1 1 --lambda_mmd 1 --num_domains 2

AFHQ

python train.py --dataroot ./datasets/AFHQ/ --phase train --type 0 --name AFHQ --n_ep 100 --n_ep_decay 0 --lambda_r1 1 --lambda_mmd 10 --num_domains 3
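For launching several of these runs from a script, the four training commands can be collected in one table and assembled programmatically. The sketch below is a convenience wrapper, not part of the repository: `TRAIN_CONFIGS` and `build_train_command` are hypothetical names, while every flag and value is copied from the commands above.

```python
from typing import Dict, List

# Hyperparameters copied from the training commands in this README.
TRAIN_CONFIGS: Dict[str, Dict[str, int]] = {
    "Yosemite": {"type": 1, "n_ep": 700, "n_ep_decay": 500,
                 "lambda_r1": 10, "lambda_mmd": 1, "num_domains": 2},
    "Photo2artwork": {"type": 1, "n_ep": 100, "n_ep_decay": 0,
                      "lambda_r1": 10, "lambda_mmd": 1, "num_domains": 4},
    "CelebAHQ": {"type": 0, "n_ep": 30, "n_ep_decay": 0,
                 "lambda_r1": 1, "lambda_mmd": 1, "num_domains": 2},
    "AFHQ": {"type": 0, "n_ep": 100, "n_ep_decay": 0,
             "lambda_r1": 1, "lambda_mmd": 10, "num_domains": 3},
}


def build_train_command(name: str) -> List[str]:
    """Assemble the train.py argument list for one dataset in TRAIN_CONFIGS."""
    cfg = TRAIN_CONFIGS[name]
    cmd = ["python", "train.py", "--dataroot", f"./datasets/{name}/",
           "--phase", "train", "--type", str(cfg["type"]), "--name", name]
    for flag in ("n_ep", "n_ep_decay", "lambda_r1", "lambda_mmd", "num_domains"):
        cmd += [f"--{flag}", str(cfg[flag])]
    return cmd


yosemite_cmd = " ".join(build_train_command("Yosemite"))
afhq_cmd = " ".join(build_train_command("AFHQ"))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root; joining it with spaces reproduces the command lines shown above.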

Pre-trained Models

Download the pre-trained models and save them into

./models

or download the pre-trained models with the following script.

bash ./download_models.sh

Testing

Reference-guided

python test_reference_save.py --dataroot ./datasets/CelebAHQ --resume ./models/CelebAHQ/00029.pth --phase test --type 0 --num_domains 2 --index_s A --index_t B --num 5 --name CelebAHQ_ref

Latent-guided

python test_latent_rdm_save.py --dataroot ./datasets/CelebAHQ --resume ./models/CelebAHQ/00029.pth --phase test --type 0 --num_domains 2 --index_s A --index_t B --num 5 --name CelebAHQ_rdm
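The two testing commands differ only in the script name and the output name, so a small helper can switch between them. This is a sketch under stated assumptions: `TEST_SCRIPTS` and `build_test_command` are hypothetical names introduced here, and the flags, paths, and default values are taken from the CelebAHQ commands above rather than from any documented API.

```python
from typing import List

# Script names as they appear in the testing commands of this README.
TEST_SCRIPTS = {
    "reference": "test_reference_save.py",
    "latent": "test_latent_rdm_save.py",
}


def build_test_command(mode: str, dataset: str, checkpoint: str, run_name: str,
                       num_domains: int, net_type: int = 0,
                       index_s: str = "A", index_t: str = "B",
                       num: int = 5) -> List[str]:
    """Assemble the argument list for reference-guided or latent-guided testing."""
    return [
        "python", TEST_SCRIPTS[mode],
        "--dataroot", f"./datasets/{dataset}",
        "--resume", f"./models/{dataset}/{checkpoint}",
        "--phase", "test",
        "--type", str(net_type),
        "--num_domains", str(num_domains),
        "--index_s", index_s,   # source domain
        "--index_t", index_t,   # target domain
        "--num", str(num),
        "--name", run_name,
    ]


ref_cmd = " ".join(build_test_command(
    "reference", "CelebAHQ", "00029.pth", "CelebAHQ_ref", num_domains=2))
latent_cmd = " ".join(build_test_command(
    "latent", "CelebAHQ", "00029.pth", "CelebAHQ_rdm", num_domains=2))
```

Swapping `index_s` and `index_t` would request the opposite translation direction, assuming the scripts treat the two flags symmetrically.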

License

All rights reserved.
Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).
The code is for academic research use only. For commercial use, please contact [email protected].

Acknowledgements

Code and network architectures are inspired by:
