FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

Last update: Dec 31, 2022

FuseDream

This repo contains code for our paper (arXiv:2112.01573):

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

by Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su and Qiang Liu from UCSD and UT Austin.

Introduction

FuseDream combines a pre-trained GAN (BigGAN-256 and BigGAN-512 are currently supported) with CLIP to achieve high-fidelity text-to-image generation.

Requirements

Use pip or conda to install the following packages: PyTorch==1.7.1, torchvision==0.8.2, and lpips==0.1.4, plus the requirements of BigGAN.
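As a quick sanity check after installation, the pinned versions can be compared against what is actually installed. The following is a minimal sketch, not part of the repo: the helper name check_pins is hypothetical, and the distribution names "torch", "torchvision" and "lpips" are assumed to be the usual PyPI names. The BigGAN requirements are not covered.

```python
from importlib import metadata

# Pins taken from the Requirements paragraph. The distribution names are the
# usual PyPI names and are an assumption, not something the repo states.
PINNED = {"torch": "1.7.1", "torchvision": "0.8.2", "lpips": "0.1.4"}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (wanted, found)} for each missing or mismatched package."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu110" when comparing versions.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINNED).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

An empty result means all three pins are satisfied. Other versions may also work, but the README only lists the ones above.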

Getting Started

We converted the pre-trained BigGAN weights from TFHub to PyTorch. To save time, you can download the converted BigGAN checkpoints from:

https://drive.google.com/drive/folders/1nJ3HmgYgeA9NZr-oU-enqbYeO7zBaANs?usp=sharing

Put the checkpoints into ./BigGAN_utils/weights/
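Before running the generator, it can help to confirm that the checkpoints landed in the right place. This sketch only lists whatever files are present in the weights directory. The helper name list_checkpoints is hypothetical, and no checkpoint file names are assumed, since they are not given here.

```python
from pathlib import Path

# Directory named in the instructions above.
WEIGHTS_DIR = Path("./BigGAN_utils/weights")


def list_checkpoints(weights_dir):
    """Return the sorted names of regular files in weights_dir (empty if missing)."""
    weights_dir = Path(weights_dir)
    if not weights_dir.is_dir():
        return []
    return sorted(p.name for p in weights_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    found = list_checkpoints(WEIGHTS_DIR)
    if found:
        print("Found:", ", ".join(found))
    else:
        print(f"No files in {WEIGHTS_DIR}; download the checkpoints first.")
```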

Run the following command to generate an image from a text query:

python fusedream_generator.py --text 'YOUR TEXT' --seed YOUR_SEED

For example, to get an image of a blue dog:

python fusedream_generator.py --text 'A photo of a blue dog.' --seed 1234

The generated image will be stored in ./samples
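Because the result depends on the seed, it can be useful to try several seeds for the same text. The sketch below wraps the command shown above and uses only the --text and --seed flags from this README. The helper names build_command and generate_for_seeds are hypothetical. Whether successive runs overwrite each other in ./samples depends on how the script names its output, which is not stated here.

```python
import subprocess
import sys


def build_command(text, seed):
    """Build the argv list for one run of fusedream_generator.py."""
    return [sys.executable, "fusedream_generator.py",
            "--text", text, "--seed", str(seed)]


def generate_for_seeds(text, seeds, run=subprocess.run):
    """Run the generator once per seed, stopping at the first failing run."""
    for seed in seeds:
        # check=True raises CalledProcessError if the script exits non-zero.
        run(build_command(text, seed), check=True)
```

From the repository root, calling generate_for_seeds('A photo of a blue dog.', [1234, 1235, 1236]) would run the generator three times, once per seed.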

Colab Notebook

For a quick test of FuseDream, we provide a Colab notebook for FuseDream (Single Image); a notebook for FuseDream-Composition is still TODO. Have fun!

Citations

If you use the code, please cite:

@inproceedings{
brock2018large,
title={Large Scale {GAN} Training for High Fidelity Natural Image Synthesis},
author={Andrew Brock and Jeff Donahue and Karen Simonyan},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=B1xsqj09Fm},
}

and

@misc{
liu2021fusedream,
title={FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization}, 
author={Xingchao Liu and Chengyue Gong and Lemeng Wu and Shujian Zhang and Hao Su and Qiang Liu},
year={2021},
eprint={2112.01573},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

