TensorFlow implementation of "Learning from Simulated and Unsupervised Images through Adversarial Training"

Last update: Dec 29, 2022

Overview

Simulated+Unsupervised (S+U) Learning in TensorFlow

TensorFlow implementation of Learning from Simulated and Unsupervised Images through Adversarial Training.

Requirements

Python 2.7
TensorFlow 0.12.1
SciPy
pillow
tqdm

Usage

To generate the synthetic dataset:

Run UnityEyes with the resolution set to 640x480 and the camera parameters set to [0, 0, 20, 40].
Move the generated images and JSON files into data/gaze/UnityEyes.

The data directory should look like this:

data
├── gaze
│   ├── MPIIGaze
│   │   └── Data
│   │       └── Normalized
│   │           ├── p00
│   │           ├── p01
│   │           └── ...
│   └── UnityEyes # contains images of UnityEyes
│       ├── 1.jpg
│       ├── 1.json
│       ├── 2.jpg
│       ├── 2.json
│       └── ...
├── __init__.py
├── gaze_data.py
├── hand_data.py
└── utils.py
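
Before training, it can help to confirm that every UnityEyes image has its matching JSON file, as the layout above expects. The following is a minimal sketch, not part of this repository: the function name find_unpaired is hypothetical, and it assumes only the N.jpg / N.json naming shown in the tree.

```python
import os


def find_unpaired(data_dir):
    """Return (images_without_json, jsons_without_image) as sorted lists of file stems.

    Hypothetical helper: it only compares file names, it does not open the files.
    """
    jpg_stems, json_stems = set(), set()
    for name in os.listdir(data_dir):
        stem, ext = os.path.splitext(name)
        if ext.lower() == ".jpg":
            jpg_stems.add(stem)
        elif ext.lower() == ".json":
            json_stems.add(stem)
    # Stems present on one side only indicate an incomplete copy.
    return sorted(jpg_stems - json_stems), sorted(json_stems - jpg_stems)
```

Calling find_unpaired(os.path.join("data", "gaze", "UnityEyes")) should return two empty lists when the directory is complete.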

To train a model (samples will be generated in samples directory):

$ python main.py
$ tensorboard --logdir=logs --host=0.0.0.0

To refine all synthetic images with a pretrained model:

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/"

Training results

Differences with the paper

Used the Adam and Stochastic Gradient Descent optimizers.
Only used 83K (about 7% of the 1.2M used by the paper) synthetic images from UnityEyes.
Manually chose the hyperparameters B and lambda because they are not specified in the paper.
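
To make the roles of the two hyperparameters concrete: in the paper, lambda weights an L1 self-regularization term that keeps a refined image close to its synthetic input, and B is the size of a buffer of previously refined images that are mixed into each discriminator batch. The NumPy sketch below illustrates both ideas under stated assumptions. It is not the code in this repository; the names self_regularization_loss, ImageHistoryBuffer and discriminator_batch are hypothetical, and averaging the L1 term over the batch is an assumption.

```python
import numpy as np


def self_regularization_loss(refined, synthetic, reg_scale):
    """reg_scale (lambda) times the per-image L1 distance, averaged over the batch."""
    diff = np.abs(np.asarray(refined, dtype=float) - np.asarray(synthetic, dtype=float))
    per_image = diff.reshape(len(diff), -1).sum(axis=1)
    return reg_scale * per_image.mean()


class ImageHistoryBuffer(object):
    """Keeps at most `capacity` (B) previously refined images."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.images = []
        self.rng = np.random.RandomState(seed)

    def sample(self, count):
        """Return up to `count` stored images, chosen without replacement."""
        count = min(count, len(self.images))
        if count == 0:
            return []
        idx = self.rng.choice(len(self.images), size=count, replace=False)
        return [self.images[i] for i in idx]

    def add(self, new_images):
        """Append while there is room, then overwrite randomly chosen slots."""
        for img in new_images:
            if len(self.images) < self.capacity:
                self.images.append(img)
            else:
                self.images[self.rng.randint(self.capacity)] = img


def discriminator_batch(buffer, fresh_refined):
    """Mix up to half a batch of history images with freshly refined ones."""
    fresh = list(fresh_refined)
    history = buffer.sample(len(fresh) // 2)
    batch = fresh[:len(fresh) - len(history)] + history
    # Afterwards, store half of the fresh images for later iterations.
    buffer.add(fresh[:len(fresh) // 2])
    return batch
```

With an empty buffer the batch consists of fresh images only; once the buffer fills, half of each discriminator batch comes from earlier refiner outputs.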

Experiments #1

For these synthetic images,

Result of lambda=1.0 with optimizer=sgd after 8,000 steps.

$ python main.py --reg_scale=1.0 --optimizer=sgd

Result of lambda=0.5 with optimizer=sgd after 8,000 steps.

$ python main.py --reg_scale=0.5 --optimizer=sgd

Training loss of discriminator and refiner when lambda is 1.0 (green) and 0.5 (yellow).

Experiments #2

For these synthetic images,

Result of lambda=1.0 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=1.0 --optimizer=adam

Result of lambda=0.5 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=0.5 --optimizer=adam

Result of lambda=0.1 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=0.1 --optimizer=adam

Training loss of discriminator and refiner when lambda is 1.0 (blue), 0.5 (purple) and 0.1 (green).
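
The runs in both experiments differ only in the --reg_scale and --optimizer flags, so the command lines can be generated rather than typed by hand. A small sketch, assuming only the two flags shown above; the helper name build_sweep_commands is hypothetical and not part of this repository.

```python
def build_sweep_commands(reg_scales, optimizer):
    """Return one argument list per lambda value, using the flags shown above."""
    return [
        ["python", "main.py",
         "--reg_scale={}".format(scale),
         "--optimizer={}".format(optimizer)]
        for scale in reg_scales
    ]


# Example: the three Adam runs of Experiments #2.
commands = build_sweep_commands([1.0, 0.5, 0.1], "adam")
# Each list could then be passed to subprocess.check_call(cmd) to run it.
```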

Author

Taehoon Kim / @carpedm20
