Generative Adversarial Text-to-Image Synthesis

Last update: Dec 31, 2022

Related tags

Deep Learning icml2016

Overview

###Generative Adversarial Text-to-Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee

This is the code for our ICML 2016 paper on text-to-image synthesis using conditional GANs. You can use it to train and sample from text-to-image models. The code is adapted from the excellent dcgan.torch.

####Setup Instructions

You will need to install Torch, CuDNN, and the display package.

####How to train a text to image model:

Download the birds and flowers and COCO caption data in Torch format.
Download the birds and flowers and COCO image data.
Download the text encoders for birds and flowers and COCO descriptions.
Modify the CONFIG file to point to your data and text encoder paths.
Run one of the training scripts, e.g. ./scripts/train_cub.sh

####How to generate samples:

For flowers: ./scripts/demo_flowers.sh. Add text descriptions to scripts/flowers_queries.txt.
For birds: ./scripts/demo_cub.sh.
For COCO (more general images): ./scripts/demo_coco.sh.
An html file will be generated with the results:

####Pretrained models:

####How to train a text encoder from scratch:

You may want to do this if you have your own new dataset of text descriptions.
For flowers and birds: follow the instructions here.
For MS-COCO: ./scripts/train_coco_txt.sh.

####Citation

If you find this useful, please cite our work as follows:

@inproceedings{reed2016generative,
  title={Generative Adversarial Text-to-Image Synthesis},
  author={Scott Reed and Zeynep Akata and Xinchen Yan and Lajanugen Logeswaran and Bernt Schiele and Honglak Lee},
  booktitle={Proceedings of The 33rd International Conference on Machine Learning},
  year={2016}
}

Generative Adversarial Text-to-Image Synthesis

Related tags

Overview

Owner

Scott Ellison Reed

This is the source code for the experiments related to the paper Unsupervised Audio Source Separation Using Differentiable Parametric Source Models

Python implementation of a live deep learning based age/gender/expression recognizer

mPose3D, a mmWave-based 3D human pose estimation model.

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Tweesent-back - Tweesent backend uses fastAPI as the web framework

Pytorch implementation of "Geometrically Adaptive Dictionary Attack on Face Recognition" (WACV 2022)

ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation

This is the official Pytorch implementation of the paper "Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model"

Red Team tool for exfiltrating files from a target's Google Drive that you have access to, via Google's API.

ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

Tensorflow Tutorials using Jupyter Notebook

Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python

Towards Implicit Text-Guided 3D Shape Generation (CVPR2022)

Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021

Learning to See by Looking at Noise

Face Library is an open source package for accurate and real-time face detection and recognition

QilingLab challenge writeup

RRxIO - Robust Radar Visual/Thermal Inertial Odometry: Robust and accurate state estimation even in challenging visual conditions.

Time series annotation library.