🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

Overview

SGLKT-VisDial

Pytorch Implementation for the paper:

Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
Gi-Cheon Kang, Junseok Park, Hwaran Lee, Byoung-Tak Zhang*, and Jin-Hwa Kim* (* corresponding authors)
In EMNLP 2021 Findings

Setup and Dependencies

This code is implemented using PyTorch v1.0+, and provides out of the box support with CUDA 9+ and CuDNN 7+. Anaconda/Miniconda is the recommended to set up this codebase:

  1. Install Anaconda or Miniconda distribution based on Python3+ from their downloads' site.
  2. Clone this repository and create an environment:
git clone https://www.github.com/gicheonkang/sglkt-visdial
conda create -n visdial-ch python=3.6

# activate the environment and install all dependencies
conda activate sglkt
cd sglkt-visdial/
pip install -r requirements.txt

# install this codebase as a package in development version
python setup.py develop

Download Data

  1. We used the Faster-RCNN pre-trained with Visual Genome as image features. Download the image features below, and put each feature under $PROJECT_ROOT/data/{SPLIT_NAME}_feature directory. We need image_id to RCNN bounding box index file ({SPLIT_NAME}_imgid2idx.pkl) because the number of bounding box per image is not fixed (ranging from 10 to 100).
  • train_btmup_f.hdf5: Bottom-up features of 10 to 100 proposals from images of train split (32GB).
  • val_btmup_f.hdf5: Bottom-up features of 10 to 100 proposals from images of validation split (0.5GB).
  • test_btmup_f.hdf5: Bottom-up features of 10 to 100 proposals from images of test split (2GB).
  1. Download the pre-trained, pre-processed word vectors from here (glove840b_init_300d.npy), and keep them under $PROJECT_ROOT/data/ directory. You can manually extract the vectors by executing data/init_glove.py.

  2. Download visual dialog dataset from here (visdial_1.0_train.json, visdial_1.0_val.json, visdial_1.0_test.json, and visdial_1.0_val_dense_annotations.json) under $PROJECT_ROOT/data/ directory.

  3. Download the additional data for Sparse Graph Learning and Knowledge Transfer under $PROJECT_ROOT/data/ directory.

Training

Train the model provided in this repository as:

python train.py --gpu-ids 0 1 # provide more ids for multi-GPU execution other args...

Saving model checkpoints

This script will save model checkpoints at every epoch as per path specified by --save-dirpath. Default path is $PROJECT_ROOT/checkpoints.

Evaluation

Evaluation of a trained model checkpoint can be done as follows:

python evaluate.py --load-pthpath /path/to/checkpoint.pth --split val --gpu-ids 0 1

Validation scores can be checked in offline setting. But if you want to check the test split score, you have to submit a json file to EvalAI online evaluation server. You can make json format with --save_ranks True option.

Pre-trained model & Results

We provide the pre-trained models for SGL+KT and SGL.
To reproduce the results reported in the paper, please run the command below.

python evaluate.py --load-pthpath SGL+KT.pth --split test --gpu-ids 0 1 --save-ranks True

Performance on v1.0 test-std (trained on v1.0 train):

Model Overall NDCG MRR [email protected] [email protected] [email protected] Mean
SGL+KT 65.31 72.60 58.01 46.20 71.01 83.20 5.85

Citation

If you use this code in your published research, please consider citing:

@article{kang2021reasoning,
  title={Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer},
  author={Kang, Gi-Cheon and Park, Junseok and Lee, Hwaran and Zhang, Byoung-Tak and Kim, Jin-Hwa},
  journal={arXiv preprint arXiv:2004.06698},
  year={2021}
}

License

MIT License

Acknowledgements

We use Visual Dialog Challenge Starter Code and MCAN-VQA as reference code.

Owner
Gi-Cheon Kang
Grounded language learning, visual dialog
Gi-Cheon Kang
Making self-supervised learning work on molecules by using their 3D geometry to pre-train GNNs. Implemented in DGL and Pytorch Geometric.

3D Infomax improves GNNs for Molecular Property Prediction Video | Paper We pre-train GNNs to understand the geometry of molecules given only their 2D

Hannes Stärk 95 Dec 30, 2022
The implementation of ICASSP 2020 paper "Pixel-level self-paced learning for super-resolution"

Pixel-level Self-Paced Learning for Super-Resolution This is an official implementaion of the paper Pixel-level Self-Paced Learning for Super-Resoluti

Elon Lin 41 Dec 15, 2022
This repo in the implementation of EMNLP'21 paper "SPARQLing Database Queries from Intermediate Question Decompositions" by Irina Saparina, Anton Osokin

SPARQLing Database Queries from Intermediate Question Decompositions This repo is the implementation of the following paper: SPARQLing Database Querie

Yandex Research 20 Dec 19, 2022
Source code and data in paper "MDFEND: Multi-domain Fake News Detection (CIKM'21)"

MDFEND: Multi-domain Fake News Detection This is an official implementation for MDFEND: Multi-domain Fake News Detection which has been accepted by CI

Rich 40 Dec 18, 2022
TimeSHAP explains Recurrent Neural Network predictions.

TimeSHAP TimeSHAP is a model-agnostic, recurrent explainer that builds upon KernelSHAP and extends it to the sequential domain. TimeSHAP computes even

Feedzai 90 Dec 18, 2022
PyTorch implementation of Algorithm 1 of "On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models"

Code for On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models This repository will reproduce the main results from our pape

Mitch Hill 32 Nov 25, 2022
A python script to lookup Passport Index Dataset

visa-cli A python script to lookup Passport Index Dataset Installation pip install visa-cli Usage usage: visa-cli [-h] [-d DESTINATION_COUNTRY] [-f]

rand-net 16 Oct 18, 2022
This game was designed to encourage young people not to gamble on lotteries, as the probablity of correctly guessing the number is infinitesimal!

Lottery Simulator 2022 for Web Launch Application Developed by John Seong in Ontario. This game was designed to encourage young people not to gamble o

John Seong 2 Sep 02, 2022
PyTorch implementation for View-Guided Point Cloud Completion

PyTorch implementation for View-Guided Point Cloud Completion

22 Jan 04, 2023
A PyTorch implementation of QANet.

QANet-pytorch NOTICE I'm very busy these months. I'll return to this repo in about 10 days. Introduction An implementation of QANet with PyTorch. Any

H. Z. 343 Nov 03, 2022
Code for the paper "A Study of Face Obfuscation in ImageNet"

A Study of Face Obfuscation in ImageNet Code for the paper: A Study of Face Obfuscation in ImageNet Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng,

35 Oct 04, 2022
This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.

BiPointNet: Binary Neural Network for Point Clouds Created by Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Li

Haotong Qin 59 Dec 17, 2022
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework

Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework

Google Cloud Platform 792 Dec 28, 2022
Code for the paper: Adversarial Training Against Location-Optimized Adversarial Patches. ECCV-W 2020.

Adversarial Training Against Location-Optimized Adversarial Patches arXiv | Paper | Code | Video | Slides Code for the paper: Sukrut Rao, David Stutz,

Sukrut Rao 32 Dec 13, 2022
[内测中]前向式Python环境快捷封装工具,快速将Python打包为EXE并添加CUDA、NoAVX等支持。

QPT - Quick packaging tool 快捷封装工具 GitHub主页 | Gitee主页 QPT是一款可以“模拟”开发环境的多功能封装工具,最短只需一行命令即可将普通的Python脚本打包成EXE可执行程序,并选择性添加CUDA和NoAVX的支持,尽可能兼容更多的用户环境。 感觉还可

QPT Family 545 Dec 28, 2022
SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summary.

FIW Data Development Kit Table of Contents Introduction Families In the Wild Database Publications Organization To Do License Getting Involved Introdu

Joseph P. Robinson 12 Jun 04, 2022
A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

jedibobo 3 Dec 28, 2022
Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization

Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization This repository contains the code for the BBI optimizer, introduced in the p

G. Bruno De Luca 5 Sep 06, 2022
GAN example for Keras. Cuz MNIST is too small and there should be something more realistic.

Keras-GAN-Animeface-Character GAN example for Keras. Cuz MNIST is too small and there should an example on something more realistic. Some results Trai

160 Sep 20, 2022
Implement Decoupled Neural Interfaces using Synthetic Gradients in Pytorch

disclaimer: this code is modified from pytorch-tutorial Image classification with synthetic gradient in Pytorch I implement the Decoupled Neural Inter

Andrew 114 Dec 22, 2022