[NeurIPS 2021 Spotlight] Code for Learning to Compose Visual Relations

Overview

Learning to Compose Visual Relations

This is the PyTorch codebase for the NeurIPS 2021 Spotlight paper Learning to Compose Visual Relations.

Demo

Image Generation Demo

Please use the following command to generate images on the CLEVR dataset. Use --num_rels to control the number of input relational descriptions.

python demo.py --checkpoint_folder ./checkpoint --model_name clevr --output_folder ./ --dataset clevr \
--resume_iter best --batch_size 25 --num_steps 80 --num_rels 1 --data_folder ./data --mode generation
(Figure: generation process GIF and the final generated image.)

Image Editing Demo

Please use the following command to edit images on the CLEVR dataset. Use --num_rels to control the number of input relational descriptions.

python demo.py --checkpoint_folder ./checkpoint --model_name clevr --output_folder ./ --dataset clevr \
--resume_iter best --batch_size 25 --num_steps 80 --num_rels 1 --data_folder ./data --mode editing
(Figure: input image, editing process GIF, and the final edited image.)

Training

Data Preparation

Please use the following data link to download the CLEVR data used in our experiments, then place all data files under the ./data folder. Downloads for additional datasets and precomputed feature files will be posted soon. Feel free to raise an issue if there is a particular dataset you would like to download.
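
For example, assuming the download provides the clevr_training_data.npz file referenced by the training command below (the exact file names and the ~/Downloads/ location are placeholders), the data can be placed like this:

# place the downloaded CLEVR data under ./data
# (~/Downloads/ is a hypothetical download location; adjust to where you saved the files)
mkdir -p ./data
mv ~/Downloads/clevr_training_data.npz ./data/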

Model Training

To train your own model, please run the following command. Use --dataset to train on different datasets, e.g. --dataset clevr.

python -u train.py --cond --dataset=${dataset} --exp=${dataset} --batch_size=10 --step_lr=300 \
--num_steps=60 --kl --gpus=1 --nodes=1 --filter_dim=128 --im_size=128 --self_attn \
--multiscale --norm --spec_norm --slurm --lr=1e-4 --cuda --replay_batch \
--numpy_data_path ./data/clevr_training_data.npz
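
For example, to train on CLEVR (currently the only dataset with released data), set the shell variable used in the command above before running it:

# ${dataset} is expanded in the train.py command above;
# the same convention is reused by the evaluation commands below
dataset=clevr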

Evaluation

To evaluate our model, you can use your own trained models or download the pretrained model model_best.pth from the ${dataset}_model folder at link and put it under ./checkpoints/${dataset} in the project folder. Only clevr_model is currently available; more pretrained models will be posted soon.
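
For the CLEVR model, the expected layout would look roughly as follows (the ~/Downloads/ location is a placeholder; only the final ./checkpoints/clevr/model_best.pth path matters):

# put the pretrained CLEVR checkpoint where the evaluation scripts expect it
mkdir -p ./checkpoints/clevr
mv ~/Downloads/model_best.pth ./checkpoints/clevr/model_best.pth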

Evaluate Image Generation Results Using the Pretrained Classifiers

First, use the following command to generate images on the test set. Use --dataset and --num_rels to select the dataset and the number of input relational descriptions. Note that 1 <= num_rels <= 3.

python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
--output_folder ./${dataset}_gen_images --dataset ${dataset} --resume_iter best \
--batch_size 32 --num_steps 80 --num_rels ${num_rels} --data_folder ./data --mode generation
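
For example, to generate CLEVR test images for every supported number of relations in one go, the placeholders above could be filled in like this (a sketch; adjust the values to your setup):

# generate CLEVR test images for 1, 2, and 3 input relations
dataset=clevr
for num_rels in 1 2 3; do
  python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
    --output_folder ./${dataset}_gen_images --dataset ${dataset} --resume_iter best \
    --batch_size 32 --num_steps 80 --num_rels ${num_rels} --data_folder ./data --mode generation
done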

To evaluate the binary classification scores of the generated images, you can either train a binary classifier yourself or download a pretrained one from the binary_classifier folder at link.

To train your own binary classifier, please use the following command:

python train_classifier.py --train --spec_norm --norm \
--dataset ${dataset} --lr 3e-4 --checkpoint_dir ./binary_classifier

Use the following command to evaluate the generated images conditioned on a selected number of relations. Use --num_rels to specify the number of relations.

python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
--data_folder ./data --generated_img_folder ./${dataset}_gen_images/num_rels_${num_rels} \
--mode generation --num_rels ${num_rels}
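
Mirroring the generation sketch above, the scores for all relation counts can be computed in one loop (assuming the generated images live under ./${dataset}_gen_images/num_rels_${num_rels}, as produced by the inference command):

# evaluate the generated CLEVR images for 1, 2, and 3 input relations
dataset=clevr
for num_rels in 1 2 3; do
  python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
    --data_folder ./data --generated_img_folder ./${dataset}_gen_images/num_rels_${num_rels} \
    --mode generation --num_rels ${num_rels}
done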

Evaluate Image Editing Results Using the Pretrained Classifiers

First, use the following command to edit images on the test set. Use --dataset and --num_rels to select the dataset and the number of input relational descriptions.

python inference.py --checkpoint_folder ./checkpoints --model_name ${dataset} \
--output_folder ./${dataset}_edit_images --dataset ${dataset} --resume_iter best \
--batch_size 32 --num_steps 80 --num_rels 1 --data_folder ./data --mode editing

To evaluate the classification scores of the image editing results, change --mode to editing.

python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
--data_folder ./data --generated_img_folder ./${dataset}_edit_images/num_rels_${num_rels} \
--mode editing --num_rels ${num_rels}
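
For example, since the editing command above generates images with --num_rels 1, a matching evaluation call on CLEVR would look like this (a sketch with concrete placeholder values):

# evaluate CLEVR editing results; num_rels matches the --num_rels 1 used for editing
dataset=clevr
num_rels=1
python classification_scores.py --dataset ${dataset} --checkpoint_dir ./binary_classifier/ \
  --data_folder ./data --generated_img_folder ./${dataset}_edit_images/num_rels_${num_rels} \
  --mode editing --num_rels ${num_rels}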

Acknowledgements

The code for training EBMs is from https://github.com/yilundu/improved_contrastive_divergence.


Citation

Please consider citing our paper if you use this code in your research:

@article{liu2021learning,
  title={Learning to Compose Visual Relations},
  author={Liu, Nan and Li, Shuang and Du, Yilun and Tenenbaum, Josh and Torralba, Antonio},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}