StyleGAN2-ada for practice

This version of the newest PyTorch-based StyleGAN2-ada is intended mostly for fellow artists, who rarely look at scientific metrics, but rather need a working creative tool. Tested on Python 3.7 + PyTorch 1.7.1; requires FFmpeg for sequence-to-video conversions. For more details, refer to the original implementations.

A previous TensorFlow-based version also exists; it produces compatible models (but not vice versa).
I still prefer it for few-shot training (~100 images) and for model surgery tricks (not ported here yet).

Features

  • inference (image generation) in arbitrary resolution (finally with proper padding on both TF and Torch)
  • multi-latent inference with split-frame or masked blending
  • non-square aspect ratio support (auto-picked from dataset; resolution must be divisible by 2**n, such as 512x256, 1280x768, etc.)
  • transparency (alpha channel) support (auto-picked from dataset)
  • using plain image subfolders as conditional datasets
  • funky "digression" inference technique, ported from Aydao
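
The divisibility requirement for non-square resolutions can be checked before preparing a dataset. Below is a minimal sketch; the text does not say which n applies to a given model, so n is left as a parameter, and both function names are hypothetical (not part of this repo).

```python
def largest_pow2_divisor(value):
    """Largest power of two that divides a positive integer."""
    return value & -value  # isolates the lowest set bit


def fits_resolution(width, height, n):
    """True if both sides are divisible by 2**n (n is an assumption, see above)."""
    block = 2 ** n
    return width % block == 0 and height % block == 0


for w, h in [(512, 256), (1280, 768)]:
    print(w, h, largest_pow2_divisor(w), largest_pow2_divisor(h))
```

Both example sizes from the list above are divisible by 256, i.e. they pass the check for any n up to 8.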

A few operation formats:

  • Windows batch files, described below (if you're on Windows with a powerful GPU)
  • local Jupyter notebook (for non-Windows platforms)
  • Colab notebook (maximum ease of use; requires Google Drive)

Just in case, the charms of the original StyleGAN2-ada:

  • claimed to be up to 30% faster than original StyleGAN2
  • has greatly improved training (requires 10+ times fewer samples)
  • has lots of adjustable internal training settings
  • works with plain image folders or zip archives (instead of custom datasets)
  • should be easier to tweak/debug

Training

  • Put your images in the data directory as a subfolder or zip archive. Ensure they all have the same color channels (monochrome, RGB or RGBA).
    If needed, first crop square fragments from a source video or a directory of images (a feasible method if you work with patterns or shapes rather than compositions):
 multicrop.bat source 512 256 

This will cut every source image (or video frame) into 512x512 px fragments, overlapping with a 256 px shift along X and Y. The result will be in the source-sub directory; rename it as you wish. If you edit the images yourself (e.g. for non-square aspect ratios), ensure they have the correct size. For a conditional model, split the data into subfolders (mydata/1, mydata/2, ..).
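
To get a feel for how many fragments such cropping yields, here is a rough sketch of the window arithmetic. It is not the code behind multicrop.bat: the function name is hypothetical, and dropping partial tiles at the right and bottom edges is an assumption.

```python
def crop_boxes(width, height, size=512, shift=256):
    """Return (left, top, right, bottom) boxes for overlapping square crops.

    Assumption: windows that would run past the image edge are skipped.
    """
    boxes = []
    for top in range(0, height - size + 1, shift):
        for left in range(0, width - size + 1, shift):
            boxes.append((left, top, left + size, top + size))
    return boxes


boxes = crop_boxes(1024, 768)
print(len(boxes), boxes[0], boxes[-1])
```

With these assumptions, a 1024x768 image gives 3 x 2 = 6 fragments, while an image smaller than 512 px on either side gives none.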

  • Train StyleGAN2-ada on the prepared dataset (image folder or zip archive):
 train.bat mydata

This will run the training process according to the settings in src/train.py (check and explore those!). Results (models and samples) are saved under the train directory, similar to the original Nvidia approach. For a conditional model, add the --cond option.

Please note: we save both compact models (containing only Gs network for inference) as -...pkl (e.g. mydata-512-0360.pkl), and full models (containing G/D/Gs networks for further training) as snapshot-...pkl. The naming is for convenience only.

The length of training is defined by the --lod_kimg X argument (training duration per layer/LOD). A network with base resolution 1024px will be trained for 20 such steps, one with 512px for 18 steps, et cetera. A reasonable lod_kimg value for full training from scratch is 300-600, while for finetuning 20-40 is sufficient. One can override this approach by setting the total duration directly with --kimg X.
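
The two figures above (20 steps for 1024px, 18 for 512px) are consistent with 2 * log2(resolution) steps. That formula is an extrapolation from those two data points, not something taken from src/train.py, so treat this estimate of total training length as a sketch only (function names are hypothetical):

```python
def lod_steps(resolution):
    """Number of lod_kimg steps, ASSUMING steps = 2 * log2(resolution)."""
    log2_res = resolution.bit_length() - 1  # exact integer log2 for powers of two
    return 2 * log2_res


def total_kimg(resolution, lod_kimg):
    """Estimated total training duration in thousands of images."""
    return lod_steps(resolution) * lod_kimg


print(total_kimg(1024, 300))  # full training from scratch, low end
print(total_kimg(512, 30))    # finetuning
```

Under this assumption, lod_kimg 300 on a 1024px model corresponds to 6000 kimg in total, and lod_kimg 30 on a 512px model to 540 kimg.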

If you have trouble with the custom CUDA ops, try removing their cached version (C:\Users\eps\AppData\Local\torch_extensions on Windows).

  • Resume training on the mydata dataset from the last saved model in the train/000-mydata-512-.. directory:
 train_resume.bat mydata 000-mydata-512-..
  • Uptrain (finetune) a well-trained model ffhq-512.pkl on new data:
 train_resume.bat newdata ffhq-512.pkl

No need to count exact steps in this case; just stop when you're OK with the results (it's better to set a low lod_kimg to follow the progress).

Generation

Generated results are saved as sequences and videos (by default, under the _out directory).

  • Test the model in its native resolution:
 gen.bat ffhq-1024.pkl
  • Generate custom animation between random latent points (in z space):
 gen.bat ffhq-1024 1920-1080 100-20

This will load ffhq-1024.pkl from the models directory and make a looped 1920x1080 px video of 100 frames, with an interpolation step of 20 frames between keypoints. Please note: omitting the .pkl extension loads the custom network, which enables arbitrary resolution, multi-latent blending, etc. Using the filename with the extension loads the original network from the PKL (useful for testing foreign downloaded models). There are --cubic and --gauss options for animation smoothing, and a few --scale_type choices. Add the --save_lat option to save all traversed dlatent w points as a Numpy array in a *.npy file (useful for further curating).
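
The "100-20" setting can be read as 100 / 20 = 5 random keypoints, with 20 interpolated frames per segment and the last segment returning to the first keypoint. A minimal pure-Python sketch of that idea follows; it uses plain linear interpolation (the --cubic and --gauss smoothing is not reproduced), and the function names and the latent size of 512 are assumptions, not the repo's actual code.

```python
import random


def lerp(a, b, t):
    """Linear blend between two latent vectors given as lists of floats."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def looped_frames(keypoints, step):
    """Interpolate keypoint -> next keypoint, wrapping around to close the loop."""
    count = len(keypoints)
    frames = []
    for k in range(count):
        start, end = keypoints[k], keypoints[(k + 1) % count]
        for f in range(step):
            frames.append(lerp(start, end, f / step))
    return frames


random.seed(0)
total, step, dim = 100, 20, 512  # dim = 512 is assumed here
keys = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(total // step)]
frames = looped_frames(keys, step)
print(len(frames))
```

Each frame is one z vector to feed to the generator; because the last segment heads back to the first keypoint, the resulting video loops without a jump.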

  • Generate more varied imagery:
 gen.bat ffhq-1024 3072-1024 100-20 -n 3-1

This will produce an animated composition of 3 independent frames, blended together horizontally (similar to the image in the repo header). The --splitfine X argument controls boundary fineness (0 = smoothest).

Instead of simple frame splitting, one can load external mask(s) from a b/w image file (or a folder with a file sequence):

 gen.bat ffhq-1024 1024-1024 100-20 --latmask _in/mask.jpg

The --digress X argument adds some funky animated displacements of strength X (by tweaking the initial const layer params). The --trunc X argument controls the truncation psi parameter, as usual.
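
For reference, the usual truncation trick pulls a dlatent w toward the average dlatent: w' = w_avg + psi * (w - w_avg). The sketch below shows only that formula on plain lists; how this repo wires --trunc into the network is not shown here, and the function name is hypothetical.

```python
def truncate(w, w_avg, psi):
    """Standard truncation: psi = 1 keeps w, psi = 0 collapses to the average."""
    return [avg + psi * (x - avg) for x, avg in zip(w, w_avg)]


w = [2.0, -1.0]
w_avg = [0.0, 1.0]
print(truncate(w, w_avg, 0.5))
```

Lower psi values trade variety for more "average" and usually cleaner images.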

NB: Windows batch files support only 9 command arguments; if you need more options, you have to edit the batch file itself.

  • Project external images onto StyleGAN2 model dlatent points (in w space):
 project.bat ffhq-1024.pkl photo

The result (found dlatent points as Numpy arrays in *.npy files, and video/still previews) will be saved to the _out/proj directory.

  • Generate smooth animation between saved dlatent points (in w space):
 play_dlatents.bat ffhq-1024 dlats 25 1920-1080

This will load saved dlatent points from _in/dlats and produce a smooth looped animation between them (with resolution 1920x1080 and an interpolation step of 25 frames). dlats may be a file or a directory with *.npy or *.npz files. To select only a few frames from a sequence somename.npy, create a text file with comma-delimited frame numbers and save it as somename.txt in the same directory (check the examples for the FFHQ model). You can also "style" the result: setting --style_dlat blonde458.npy will load the dlatent from blonde458.npy and apply it to the higher layers, producing some visual similarity. --cubic smoothing and --digress X displacements are also applicable here.
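
The frame-selection file described above is just a comma-delimited list of numbers. A small sketch of how such a file could be parsed and applied is given below; the function names are hypothetical, and treating the numbers as 0-based indices is an assumption (check the FFHQ examples for the actual convention).

```python
def parse_frame_list(text):
    """Parse comma-delimited frame numbers, ignoring spaces and empty tokens."""
    return [int(token) for token in text.split(",") if token.strip()]


def select_frames(sequence, text):
    """Pick the listed frames (assumed 0-based) from a loaded sequence."""
    return [sequence[i] for i in parse_frame_list(text)]


print(parse_frame_list("0, 12,45\n"))
print(select_frames(["a", "b", "c", "d", "e"], "1,3"))
```

The same indexing works on a Numpy array loaded from somename.npy, where each selected entry would be one dlatent point.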

  • Generate an animation from a saved point and feature directions (say, aging/smiling/etc. for the FFHQ model) in dlatent w space:
 play_vectors.bat ffhq-1024.pkl blonde458.npy vectors_ffhq

This will load the base dlatent point from _in/blonde458.npy and move it along the direction vectors from _in/vectors_ffhq, one by one. The result is saved as a looped video.
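
Moving a point along direction vectors amounts to computing base + t * direction for a changing t. The sketch below walks each direction out and back so that the sequence returns to the base point; that out-and-back schedule, the step count and the function name are all assumptions made for illustration, not the behaviour of play_vectors.bat.

```python
def walk_directions(base, directions, steps=4, strength=1.0):
    """For each direction, move from base to base + strength * direction and back."""
    frames = []
    for direction in directions:
        out = [strength * i / steps for i in range(steps)]
        back = [strength * (steps - i) / steps for i in range(steps)]
        for t in out + back:
            frames.append([b + t * d for b, d in zip(base, direction)])
    return frames


frames = walk_directions([0.0, 0.0], [[1.0, 0.0], [0.0, 2.0]], steps=2)
print(len(frames), frames[2], frames[6])
```

Each resulting vector stands for one dlatent w point to render; with real data, base and directions would come from the *.npy files mentioned above.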

Credits

StyleGAN2: Copyright © 2021, NVIDIA Corporation. All rights reserved.
Made available under the Nvidia Source Code License-NC
Original paper: https://arxiv.org/abs/2006.06676

Owner
vadim epstein