Generative Adversarial Text to Image Synthesis

Overview

Text To Image Synthesis

This is a tensorflow implementation of synthesizing images. The images are synthesized using the GAN-CLS Algorithm from the paper Generative Adversarial Text-to-Image Synthesis. This implementation is built on top of the excellent DCGAN in Tensorflow.

Plese star https://github.com/tensorlayer/tensorlayer

Model architecture

Image Source : Generative Adversarial Text-to-Image Synthesis Paper

Requirements

Datasets

  • The model is currently trained on the flowers dataset. Download the images from here and save them in 102flowers/102flowers/*.jpg. Also download the captions from this link. Extract the archive, copy the text_c10 folder and paste it in 102flowers/text_c10/class_*.

N.B You can downloads all data files needed manually or simply run the downloads.py and put the correct files to the right directories.

python downloads.py

Codes

  • downloads.py download Oxford-102 flower dataset and caption files(run this first).
  • data_loader.py load data for further processing.
  • train_txt2im.py train a text to image model.
  • utils.py helper functions.
  • model.py models.

References

Results

  • the flower shown has yellow anther red pistil and bright red petals.
  • this flower has petals that are yellow, white and purple and has dark lines
  • the petals on this flower are white with a yellow center
  • this flower has a lot of small round pink petals.
  • this flower is orange in color, and has petals that are ruffled and rounded.
  • the flower has yellow petals and the center of it is brown
  • this flower has petals that are blue and white.
  • these white flowers have petals that start off white in color and end in a white towards the tips.

License

Apache 2.0

Comments
  • ValueError: Object arrays cannot be loaded when allow_pickle=False

    ValueError: Object arrays cannot be loaded when allow_pickle=False

    File "train_txt2im.py", line 458, in main_train() File "train_txt2im.py", line 133, in main_train load_and_assign_npz(sess=sess, name=net_rnn_name, model=net_rnn) File "train_txt2im.py", line 458, in main_train() File "train_txt2im.py", line 133, in main_train load_and_assign_npz(sess=sess, name=net_rnn_name, model=net_rnn) File "/home/siddanath/importantforprojects/text-to-image/utils.py", line 20, in load_and_assign_npz params = tl.files.load_npz(name=name) File "/home/siddanath/importantforprojects/text-to-image/tensorlayer/files.py", line 600, in load_npz return d['params'] File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 262, in getitem pickle_kwargs=self.pickle_kwargs) File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py", line 722, in read_array raise ValueError("Object arrays cannot be loaded when " ValueError: Object arrays cannot be loaded when allow_pickle=False

    opened by Siddanth-pai 2
  • Attempt to have a second RNNCell use the weights of a variable scope that already has weights

    Attempt to have a second RNNCell use the weights of a variable scope that already has weights

    I got a problem, how can I solve it?

    Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnnftxt/rnn/dynamic/rnn/basic_lstm_cell'; and the cell was not constructed as BasicLSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.

    opened by flsd201983 1
  • Next step after download.py

    Next step after download.py

    What is the next step to do after download.py? I tried python data_loader.py, but it has FileNotFoundError: FileNotFoundError: [Errno 2] No such file or directory: '/home/ly/src/lib/text-to-image/102flowers/text_c10'

    opened by arisliang 0
  • ValueError: invalid literal for int() with base 10: 'e' - when making inference

    ValueError: invalid literal for int() with base 10: 'e' - when making inference

    code -

    sample_sentence = ["a"] * int(sample_size/ni) + ["e"] * int(sample_size/ni) + ["i"] * int(sample_size/ni) + ["o"] * int(sample_size/ni) + ["u"] * int(sample_size/ni)

    for i, sentence in enumerate(sample_sentence): print("seed: %s" % sentence) sentence = preprocess_caption(sentence) sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize( sentence)] + [vocab.end_id] # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')
    
    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={
        t_real_caption: sample_sentence,
        t_z: sample_seed})
    
    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')
    
    opened by Akinleyejoshua 0
  • Excuse me, why is the flower dataset I test the result is very different from result.png

    Excuse me, why is the flower dataset I test the result is very different from result.png

    import tensorflow as tf import tensorlayer as tl from tensorlayer.layers import * from tensorlayer.prepro import * from tensorlayer.cost import * import numpy as np import scipy from scipy.io import loadmat import time, os, re, nltk

    from utils import * from model import * import model import pickle

    ###======================== PREPARE DATA ====================================### print("Loading data from pickle ...") import pickle with open("_vocab.pickle", 'rb') as f: vocab = pickle.load(f) with open("_image_train.pickle", 'rb') as f: _, images_train = pickle.load(f) with open("_image_test.pickle", 'rb') as f: _, images_test = pickle.load(f) with open("_n.pickle", 'rb') as f: n_captions_train, n_captions_test, n_captions_per_image, n_images_train, n_images_test = pickle.load(f) with open("_caption.pickle", 'rb') as f: captions_ids_train, captions_ids_test = pickle.load(f)

    images_train_256 = np.array(images_train_256)

    images_test_256 = np.array(images_test_256)

    images_train = np.array(images_train) images_test = np.array(images_test)

    ni = int(np.ceil(np.sqrt(batch_size))) save_dir = "checkpoint"

    t_real_image = tf.placeholder('float32', [batch_size, image_size, image_size, 3], name = 'real_image')

    t_real_caption = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name='real_caption_input')

    t_z = tf.placeholder(tf.float32, [batch_size, z_dim], name='z_noise') generator_txt2img = model.generator_txt2img_resnet

    net_rnn = rnn_embed(t_real_caption, is_train=False, reuse=False) net_g, _ = generator_txt2img(t_z, net_rnn.outputs, is_train=False, reuse=False, batch_size=batch_size)

    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) tl.layers.initialize_global_variables(sess)

    net_rnn_name = os.path.join(save_dir, 'net_rnn.npz400.npz') net_cnn_name = os.path.join(save_dir, 'net_cnn.npz400.npz') net_g_name = os.path.join(save_dir, 'net_g.npz400.npz') net_d_name = os.path.join(save_dir, 'net_d.npz400.npz')

    net_rnn_res = tl.files.load_and_assign_npz(sess=sess, name=net_rnn_name, network=net_rnn)

    net_g_res = tl.files.load_and_assign_npz(sess=sess, name=net_g_name, network=net_g)

    sample_size = batch_size sample_seed = np.random.normal(loc=0.0, scale=1.0, size=(sample_size, z_dim)).astype(np.float32)

    n = int(sample_size / ni) sample_sentence = ["the flower shown has yellow anther red pistil and bright red petals."] * n +
    ["this flower has petals that are yellow, white and purple and has dark lines"] * n +
    ["the petals on this flower are white with a yellow center"] * n +
    ["this flower has a lot of small round pink petals."] * n +
    ["this flower is orange in color, and has petals that are ruffled and rounded."] * n +
    ["the flower has yellow petals and the center of it is brown."] * n +
    ["this flower has petals that are blue and white."] * n +
    ["these white flowers have petals that start off white in color and end in a white towards the tips."] * n

    for i, sentence in enumerate(sample_sentence): print("seed: %s" % sentence) sentence = preprocess_caption(sentence) sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize(sentence)] + [vocab.end_id] # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')

    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={ t_real_caption : sample_sentence, t_z : sample_seed})

    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')

    opened by keqkeq 0
  • Tensorflow 2.1, Tensorlayer 2.2 update

    Tensorflow 2.1, Tensorlayer 2.2 update

    Hello,

    are there any plans in the near future to update this git to the latest Tensorflow and Tensorlayer versions? I've been trying making the code run with backwards compat (compat.tf1. ...) but I've keep bumping on errors which are a bit too big of mouth full for me.

    Fyi: I've succesfully run the DCGAN Tensorlayer implementation with Tensorlayer 2.2 and a self build Tensorflow 2.1 (with 3.0 compute compatibility) from source in Python 3.7.

    So, an update would be greatly appreciated!

    opened by SadRebel1000 0
Releases(0.2)
Owner
Hao
Assistant Professor @ Peking University
Hao
Machine Learning Framework for Operating Systems - Brings ML to Linux kernel

KML: A Machine Learning Framework for Operating Systems & Storage Systems Storage systems and their OS components are designed to accommodate a wide v

File systems and Storage Lab (FSL) 186 Nov 24, 2022
Ganilla - Official Pytorch implementation of GANILLA

GANILLA We provide PyTorch implementation for: GANILLA: Generative Adversarial Networks for Image to Illustration Translation. Paper Arxiv Updates (Fe

Samet Hi 462 Dec 05, 2022
Source code for "Progressive Transformers for End-to-End Sign Language Production" (ECCV 2020)

Progressive Transformers for End-to-End Sign Language Production Source code for "Progressive Transformers for End-to-End Sign Language Production" (B

58 Dec 21, 2022
Pydantic models for pywttr and aiopywttr.

Pydantic models for pywttr and aiopywttr.

Almaz 2 Dec 08, 2022
Collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning.

Collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning Installation

Pytorch Lightning 1.6k Jan 08, 2023
PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution' (CVPRW 2017)

About PyTorch 1.2.0 Now the master branch supports PyTorch 1.2.0 by default. Due to the serious version problem (especially torch.utils.data.dataloade

Sanghyun Son 2.1k Jan 01, 2023
Unofficial implementation of Alias-Free Generative Adversarial Networks. (https://arxiv.org/abs/2106.12423) in PyTorch

alias-free-gan-pytorch Unofficial implementation of Alias-Free Generative Adversarial Networks. (https://arxiv.org/abs/2106.12423) This implementation

Kim Seonghyeon 502 Jan 03, 2023
Contextual Attention Network: Transformer Meets U-Net

Contextual Attention Network: Transformer Meets U-Net Contexual attention network for medical image segmentation with state of the art results on skin

Reza Azad 67 Nov 28, 2022
OBBDetection: an oriented object detection toolbox modified from MMdetection

OBBDetection note: If you have questions or good suggestions, feel free to propose issues and contact me. introduction OBBDetection is an oriented obj

MIXIAOXIN_HO 3 Nov 11, 2022
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features

CleanRL (Clean Implementation of RL Algorithms) CleanRL is a Deep Reinforcement Learning library that provides high-quality single-file implementation

Costa Huang 1.8k Jan 01, 2023
Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Population-Based Bandits (PB2) Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimiza

Jack Parker-Holder 22 Nov 16, 2022
A script helps the user to update Linux and Mac systems through the terminal

Description This script helps the user to update Linux and Mac systems through the terminal. All the user has to install some requirements and then ru

Roxcoder 2 Jan 23, 2022
PRIME: A Few Primitives Can Boost Robustness to Common Corruptions

PRIME: A Few Primitives Can Boost Robustness to Common Corruptions This is the official repository of PRIME, the data agumentation method introduced i

Apostolos Modas 34 Oct 30, 2022
This repo is customed for VisDrone.

Object Detection for VisDrone(无人机航拍图像目标检测) My environment 1、Windows10 (Linux available) 2、tensorflow = 1.12.0 3、python3.6 (anaconda) 4、cv2 5、ensemble

53 Jul 17, 2022
Jax/Flax implementation of Variational-DiffWave.

jax-variational-diffwave Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.) DiffWave with

YoungJoong Kim 37 Dec 16, 2022
Referring Video Object Segmentation

Awesome-Referring-Video-Object-Segmentation Welcome to starts ⭐ & comments 💹 & sharing 😀 !! - 2021.12.12: Recent papers (from 2021) - welcome to ad

Explorer 57 Dec 11, 2022
​ This is the Pytorch implementation of Progressive Attentional Manifold Alignment.

PAMA This is the Pytorch implementation of Progressive Attentional Manifold Alignment. Requirements python 3.6 pytorch 1.2.0+ PIL, numpy, matplotlib C

98 Nov 15, 2022
Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op

The Apache Software Foundation 34.7k Jan 04, 2023
Neural Tangent Generalization Attacks (NTGA)

Neural Tangent Generalization Attacks (NTGA) ICML 2021 Video | Paper | Quickstart | Results | Unlearnable Datasets | Competitions | Citation Overview

Chia-Hung Yuan 34 Nov 25, 2022
Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

marge This repository releases the code for Generating Query Focused Summaries from Query-Free Resources. Please cite the following paper [bib] if you

Yumo Xu 28 Nov 10, 2022