Variational autoencoder for anime face reconstruction

Overview

VAE animeface

Variational autoencoder for anime face reconstruction

Introduction

This repository is an exploratory example to train a variational autoencoder to extract meaningful feature representations of anime girl face images.

The code architecture is mostly borrowed and modified from Yann Dubois's disentangling-vae repository. It has nice summarization and comparison of the different VAE model proposed recently.

Dataset

Anime Face Dataset contains 63,632 anime faces. (all rescaled to 64x64 in training)

https://raw.githubusercontent.com/Mckinsey666/Anime-Face-Dataset/master/test.jpg

Model

The model used is the one proposed in the paper Understanding disentangling in β-VAE, which is summarized below:

https://github.com/YannDubs/disentangling-vae/raw/master/doc/imgs/architecture.png

I used laplace as the target distribution to calculate the reconstruction loss. From Yann's code, it suggests that bernoulli would generally a better choice, but it looks it converge slowly in my case. (I didn't do a fair comparison to be conclusive)

Loss function used is β-VAEH from β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.

Result

Latent feature number is set to 20 (10 gaussian mean, 10 log gaussian variance). VAE model is trained for 100 epochs. All data is used for training, no validation and testing applied.

Face reconstruction

results/laplace_betaH_loss/test1_recons.png

results/laplace_betaH_loss/test2_recons.png

results/laplace_betaH_loss/test3_recons.png

Prior space traversal

Based on the face reconstruction result while traversing across the latent space, we may speculate the generative property of each latent as following:

  1. Hair shade
  2. Hair length
  3. Face orientation
  4. Hair color
  5. Face rotation
  6. Bangs, face color
  7. Hair glossiness
  8. Unclear
  9. Eye size & color
  10. Bangs

results/laplace_betaH_loss/test_prior_traversals.png

Original faces clustering

Original anime faces are clustered based on latent features (selected feature is either below 1% (left 5) or above 99% (right 5) among all data points, while the rest latent features are closeto each other). Visulization of the original images mostly confirms the speculation above.

results/laplace_betaH_loss/test_original_traversals.png

Latent feature diagnosis

Learned latent features are all close to standard normal distribution, and show minimum correlation.

results/laplace_betaH_loss/latent_diagnosis.png

Owner
Minzhe Zhang
Graduate student in UT Southwestern Medical Center. Bioinformatician. Computational biologist.
Minzhe Zhang
PyTorch implementation of NeurIPS 2021 paper: "CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration"

CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration (NeurIPS 2021) PyTorch implementation of the paper: CoFiNet: Reli

76 Jan 03, 2023
A MatConvNet-based implementation of the Fully-Convolutional Networks for image segmentation

MatConvNet implementation of the FCN models for semantic segmentation This package contains an implementation of the FCN models (training and evaluati

VLFeat.org 175 Feb 18, 2022
[ICLR 2022] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

DAB-DETR This is the official pytorch implementation of our ICLR 2022 paper DAB-DETR. Authors: Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi

336 Dec 25, 2022
Machine learning framework for both deep learning and traditional algorithms

NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models. This framework is used by ABBYY engineers for

NeoML 704 Dec 27, 2022
Python implementation of "Single Image Haze Removal Using Dark Channel Prior"

##Dependencies pillow(~2.6.0) Numpy(~1.9.0) If the scripts throw AttributeError: __float__, make sure your pillow has jpeg support e.g. try: $ sudo ap

Joyee Cheung 73 Dec 20, 2022
PointCNN: Convolution On X-Transformed Points (NeurIPS 2018)

PointCNN: Convolution On X-Transformed Points Created by Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Xinhan Di, and Baoquan Chen. Introduction PointCNN

Yangyan Li 1.3k Dec 21, 2022
Use tensorflow to implement a Deep Neural Network for real time lane detection

LaneNet-Lane-Detection Use tensorflow to implement a Deep Neural Network for real time lane detection mainly based on the IEEE IV conference paper "To

MaybeShewill-CV 1.9k Jan 08, 2023
Normal Learning in Videos with Attention Prototype Network

Codes_APN Official codes of CVPR21 paper: Normal Learning in Videos with Attention Prototype Network (https://arxiv.org/abs/2108.11055) Overview of ou

11 Dec 13, 2022
Implementation of the paper titled "Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees"

Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees Implementation of the paper titled "Using Sampling to

MIDAS, IIIT Delhi 2 Aug 29, 2022
PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.

PyAF (Python Automatic Forecasting) PyAF is an Open Source Python library for Automatic Forecasting built on top of popular data science python module

CARME Antoine 405 Jan 02, 2023
Learning hierarchical attention for weakly-supervised chest X-ray abnormality localization and diagnosis

Hierarchical Attention Mining (HAM) for weakly-supervised abnormality localization This is the official PyTorch implementation for the HAM method. Pap

Xi Ouyang 22 Jan 02, 2023
Codes for the ICCV'21 paper "FREE: Feature Refinement for Generalized Zero-Shot Learning"

FREE This repository contains the reference code for the paper "FREE: Feature Refinement for Generalized Zero-Shot Learning". [arXiv][Paper] 1. Prepar

Shiming Chen 28 Jul 29, 2022
MANO hand model porting for the GraspIt simulator

Learning Joint Reconstruction of Hands and Manipulated Objects - ManoGrasp Porting the MANO hand model to GraspIt! simulator Yana Hasson, Gül Varol, D

Lucas Wohlhart 10 Feb 08, 2022
PyTorch implementation for our AAAI 2022 Paper "Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning"

deepGCFX PyTorch implementation for our AAAI 2022 Paper "Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning" Pr

Thilini Cooray 4 Aug 11, 2022
Python module providing a framework to trace individual edges in an image using Gaussian process regression.

Edge Tracing using Gaussian Process Regression Repository storing python module which implements a framework to trace individual edges in an image usi

Jamie Burke 7 Dec 27, 2022
Code release for ICCV 2021 paper "Anticipative Video Transformer"

Anticipative Video Transformer Ranked first in the Action Anticipation task of the CVPR 2021 EPIC-Kitchens Challenge! (entry: AVT-FB-UT) [project page

Facebook Research 123 Dec 13, 2022
A package for "Procedural Content Generation via Reinforcement Learning" OpenAI Gym interface.

Readme: Illuminating Diverse Neural Cellular Automata for Level Generation This is the codebase used to generate the results presented in the paper av

Sam Earle 27 Jan 05, 2023
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

CSAW-M This repository contains code for CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer. Source code for tr

Yue Liu 7 Oct 11, 2022
Code and results accompanying our paper titled Mixture Proportion Estimation and PU Learning: A Modern Approach at Neurips 2021 (Spotlight)

Mixture Proportion Estimation and PU Learning: A Modern Approach This repository is the official implementation of Mixture Proportion Estimation and P

Approximately Correct Machine Intelligence (ACMI) Lab 23 Dec 28, 2022
A Dying Light 2 (DL2) PAKFile Utility for Modders and Mod Makers.

Dying Light 2 PAKFile Utility A Dying Light 2 (DL2) PAKFile Utility for Modders and Mod Makers. This tool aims to make PAKFile (.pak files) modding a

RHQ Online 12 Aug 26, 2022