Toward a Visual Concept Vocabulary for GAN Latent Space, ICCV 2021

Last update: Dec 23, 2022

Toward a Visual Concept Vocabulary for GAN Latent Space
Code and data from the ICCV 2021 paper

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba
Paper | Website | arXiv

This repository contains code for finding layer-selective directions, distilling them, and loading the vocabulary of visual concepts in BigGAN used in the original paper.

Notice: This repository is under active development! Expect instability until at least October 25th, 2021.

Installation

The provided code has been tested with Python 3.8 on macOS and Ubuntu 20.04. It may work in other environments, but we make no guarantees.

To run the code yourself, start by cloning the repository:

git clone https://github.com/schwettmann/visual-vocab
cd visual-vocab

(Optional) You will probably want to install the dependencies into a conda environment or virtual environment rather than globally. For example, to create and activate a new virtual environment, run:

python3 -m venv env
source env/bin/activate

Finally, install the Python dependencies using pip:

pip3 install -r requirements.txt

Usage

Notice: This section is under construction and will be updated as functionality gets added.

To download any of the annotated directions from the paper, use the load function in the datasets submodule. It downloads and parses the annotated directions. Example usage:

from visualvocab import datasets

# Download layer-selective directions and annotations used for distilling single-word directions:
dataset = datasets.load('lsd_all')

# Download distilled directions for all BigGAN-Places365 categories:
dataset = datasets.load('distilled_all')

# Download distilled directions for a specific BigGAN-Places365 category:
dataset = datasets.load('distilled_cottage')

See the datasets module for a full list of available annotated directions.
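The examples above suggest a naming pattern for the distilled directions: 'distilled_all' for every category and 'distilled_<category>' for a single one. The sketch below wraps that pattern in a small helper. The names distilled_dataset_name and load_distilled are hypothetical and not part of the repository, and the whitespace and lowercase normalization is an assumption; only datasets.load and the three dataset names shown above come from this README.

```python
def distilled_dataset_name(category=None):
    """Build a dataset name following the 'distilled_<category>' pattern.

    Hypothetical helper: the pattern is inferred from the README examples
    ('distilled_all', 'distilled_cottage'); check the datasets module for
    the names that actually exist.
    """
    if category is None:
        return 'distilled_all'
    # Assumption: category names are lowercase with no surrounding whitespace.
    slug = category.strip().lower()
    if not slug:
        raise ValueError('category must be a non-empty string')
    return f'distilled_{slug}'


def load_distilled(category=None):
    """Load distilled directions for one category, or all if none is given."""
    # Imported lazily so the name helper above works without the package.
    from visualvocab import datasets
    return datasets.load(distilled_dataset_name(category))
```

With the package installed, load_distilled('cottage') would make the same call as datasets.load('distilled_cottage') above. Whether a given category name is available still depends on the list in the datasets module.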

Citation

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba. Toward a Visual Concept Vocabulary for GAN Latent Space, Proceedings of the International Conference on Computer Vision (ICCV), 2021.

Bibtex

@InProceedings{Schwettmann_2021_ICCV,
    author    = {Schwettmann, Sarah and Hernandez, Evan and Bau, David and Klein, Samuel and Andreas, Jacob and Torralba, Antonio},
    title     = {Toward a Visual Concept Vocabulary for GAN Latent Space},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {6804-6812}
}
