StarGAN2 for practice

This version of StarGAN2 (dubbed 'Post-modern Style Transfer') is intended mostly for fellow artists, who rarely look at scientific metrics and need a working creative tool instead. At least, this is what I use nearly daily myself.
Here are a few pieces made with it: Terminal Blink, Occurro, etc.
Tested on PyTorch 1.4-1.8. Sequence-to-video conversions require FFmpeg. For more details, refer to the original implementation.

Features

  • streamlined workflow, focused on practical tasks [TBA]
  • cleaned up and simplified code for better readability
  • stricter memory management to fit bigger batches on consumer GPUs
  • model mixing (SWA) for better stability

NB: For now, this repository contains only the training code and some basic inference (processing). More methods and use cases may be added later.

Presumed file structure

stargan2 root
├  _in input data for processing
├  _out generation output (sequences & videos)
├  data datasets for training
│  └  afhq [example] some dataset
│     ├  cats [example] images for training
│     │  └  test [example] images for validation
│     ├  dogs [example] images for training
│     │  └  test [example] images for validation
│     └  ⋯
├  models trained models for inference/processing
│  └  afhq-256-5-100.pkl [example] trained model file
├  src source code
└  train training folders
   └  afhq.. [example] auto-created training folder
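
The training layout above can be sanity-checked before a long run. The following is a minimal sketch, not part of this repository: the helper names (`count_images`, `check_dataset`), the accepted extensions and the "at least two domains" rule are assumptions made for illustration only.

```python
from pathlib import Path

# Assumed set of accepted image types; the training code may accept others.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def count_images(folder):
    """Count image files directly inside `folder` (not recursive)."""
    return sum(1 for p in Path(folder).iterdir()
               if p.is_file() and p.suffix.lower() in IMAGE_EXTS)

def check_dataset(data_dir):
    """Return a list of human-readable problems found in a dataset folder."""
    root = Path(data_dir)
    if not root.is_dir():
        return [f"{root}: dataset directory not found"]
    problems = []
    # Every subfolder of the dataset root is treated as one domain.
    domains = sorted(d for d in root.iterdir() if d.is_dir())
    if len(domains) < 2:  # assumption: a multi-domain set needs two or more
        problems.append(f"{root}: expected at least two domain folders")
    for domain in domains:
        if count_images(domain) == 0:
            problems.append(f"{domain.name}: no training images")
        test_dir = domain / "test"
        if not test_dir.is_dir():
            problems.append(f"{domain.name}: missing 'test' subfolder")
        elif count_images(test_dir) == 0:
            problems.append(f"{domain.name}/test: no validation images")
    return problems

if __name__ == "__main__":
    for line in check_dataset("data/afhq") or ["dataset layout looks fine"]:
        print(line)
```

The check looks only at folder structure; it does not open the images, so the minimum image size still has to be verified separately.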

Training

  • Prepare your multi-domain dataset as shown above. The main directory should contain one folder of images per domain (e.g. cats, dogs, ..); every such folder must contain a test subfolder with the validation subset. This structure makes it easy to recombine data for experiments. The images may be of any size (they'll be randomly cropped during training), but not smaller than the img_size specified for training (default is 256).

  • Train StarGAN2 on the prepared dataset (e.g. afhq):

 python src/train.py --data_dir data/afhq --model_dir train/afhq --img_size 256 --batch 8

This will run the training process according to the settings in src/train.py (check and explore those!). Models are saved under train/afhq and named as dataset-size-domaincount-kimgs, e.g. afhq-256-5-100.ckpt (required for resuming).

  • Resume training on the same dataset from iteration 50 (in thousands), presuming there's a corresponding complete set of 3 models (with nets and optims) in train/afhq:
 python src/train.py --data_dir data/afhq --model_dir train/afhq --img_size 256 --batch 8 --resume 50
  • Make an averaged model (for generation only) from a directory of models, e.g. train/select:
 python src/swa.py -i train/select 
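
When many checkpoints pile up, the dataset-size-domaincount-kimgs naming scheme described above can be parsed to sort or filter them. This is a small sketch under that naming assumption; `ModelName`, `parse_model_name` and `latest` are hypothetical helpers, not functions from this repository.

```python
from pathlib import Path
from typing import NamedTuple

class ModelName(NamedTuple):
    dataset: str   # e.g. "afhq"
    img_size: int  # training resolution, e.g. 256
    domains: int   # number of domains in the model
    kimg: int      # iteration count in thousands

def parse_model_name(path):
    """Split 'dataset-size-domaincount-kimgs.ext' into its four fields."""
    stem = Path(path).stem
    # Split from the right, so dataset names may themselves contain hyphens.
    dataset, size, domains, kimg = stem.rsplit("-", 3)
    return ModelName(dataset, int(size), int(domains), int(kimg))

def latest(paths):
    """Pick the checkpoint with the highest iteration count."""
    return max(paths, key=lambda p: parse_model_name(p).kimg)

if __name__ == "__main__":
    print(parse_model_name("train/afhq/afhq-256-5-100.ckpt"))
```

A file name that does not follow the scheme raises `ValueError`, which is usually the desired behaviour when scanning a folder with mixed content.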

A few personal findings

  1. Batch size is crucial for this network! The official setting is batch=8 for size 256, which needs a lot of GPU RAM. An 11 GB GPU fits a batch of 3 or 4; those results are interesting, but less impressive. Batches of 2 or 1 are for the brave only. Image size is better kept at 256: the network scales its layer count automatically, but I didn't manage to get comparable results at size 512 with batches of up to 7 (the maximum for 32 GB).
  2. Model weights may oscillate seriously during training, especially with small batches (typical for Cycle- or Star-GANs), so it's better to save models frequently (there may be jewels among them). The best selected models can be mixed together with the swa.py script for better stability. By default, the Generator network is saved every 1000 iterations, and the full set every 5000 iterations. 100k iterations (a few days on a single GPU) may be enough; 200-250k would give a pretty nice overfit.
  3. The lambda coefficients lambda_ds (diversity), lambda_cyc (reconstruction) and lambda_sty (style) may be increased for smaller batches, especially if the goal is stylization rather than photo-realistic transformation. The videos above, for instance, were made with these lambdas equal to 3. Reference-based generation is nearly lost with such settings, but latent-based generation can make nice art.
  4. The order of domains in the training set matters a lot! I usually put photos first (as they will be the main source imagery), and the domain closest to photoreal second; but other approaches may work well too (your mileage may vary).
  5. I particularly love this network for its failures. Even the flawed results (when the batches are small, the lambdas are wrong, etc.) are usually highly expressive and "inventive": just the kind of "AI's own art" that is so much talked about. Experimenting with such aesthetics is great fun.
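
The idea behind mixing several saved models (finding 2) can be illustrated with plain equal-weight averaging of parameters. This is only a sketch of the general technique: the actual swa.py script may work differently, and it operates on PyTorch checkpoints, whereas the hypothetical `average_weights` below uses lists of floats as stand-ins for tensors.

```python
def average_weights(state_dicts):
    """Element-wise mean of several models' parameters.

    Each model is a dict mapping a parameter name to a flat list of floats
    (a stand-in for a tensor; an assumption made for this sketch).
    """
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("models have different parameter names")
    n = len(state_dicts)
    # For every parameter, average the values position by position.
    return {
        k: [sum(vals) / n for vals in zip(*(sd[k] for sd in state_dicts))]
        for k in keys
    }

if __name__ == "__main__":
    a = {"w": [1.0, 2.0], "b": [0.0]}
    b = {"w": [3.0, 6.0], "b": [1.0]}
    print(average_weights([a, b]))
```

Averaging smooths out the oscillations between individual checkpoints, which is why it helps most when the selected models come from the same training run.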

Generation

  • Transform the image test.jpg with the AFHQ model (can be downloaded here):
python src/test.py --source test.jpg --model models/100000_nets_ema.ckpt

This will produce 3 images (one per trained domain in the model) in the _out directory.
If source is a directory, every image in it will be processed accordingly.

  • Generate output for the domain(s), referenced by number(s):
python src/test.py --source test.jpg --model models/100000_nets_ema.ckpt --ref 2
  • Generate output with a reference image for domain 1 (the reference filename must start with that number):
python src/test.py --source test.jpg --model models/100000_nets_ema.ckpt --ref 1-ref.jpg
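
The two forms of the --ref value shown above (a domain number, or a reference image whose file name starts with the domain number) can be told apart as in the sketch below. `parse_ref` is a hypothetical helper written for illustration; how src/test.py actually parses the argument, including several numbers at once, is not covered here.

```python
import re
from pathlib import Path

def parse_ref(ref):
    """Split a --ref value into (domain_index, reference_image_or_None)."""
    if ref.isdigit():
        # A bare number selects a domain; no reference image is involved.
        return int(ref), None
    # Otherwise treat the value as a file whose name starts with the domain.
    match = re.match(r"(\d+)", Path(ref).name)
    if match is None:
        raise ValueError(
            f"reference file name must start with a domain number: {ref}")
    return int(match.group(1)), ref

if __name__ == "__main__":
    print(parse_ref("2"))
    print(parse_ref("1-ref.jpg"))
```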

To be continued..

Credits

StarGAN2
Copyright © 2020, NAVER Corp. All rights reserved.
Made available under Creative Commons BY-NC 4.0 license.
Original paper: https://arxiv.org/abs/1912.01865

Owner
vadim epstein