A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.

IconQA

License: CC BY-NC-SA 4.0

About

IconQA is a new diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and comprehensive cognitive reasoning in real-world problems.

iconqa examples

There are three different sub-tasks in IconQA:

  • 57,672 multi-image-choice questions
  • 31,578 multi-text-choice questions
  • 18,189 filling-in-the-blank questions
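
| Sub-Tasks            | Train  | Validation | Test   | Total  |
|----------------------|--------|------------|--------|--------|
| Multi-image-choice   | 34,603 | 11,535     | 11,535 | 57,672 |
| Multi-text-choice    | 18,946 | 6,316      | 6,316  | 31,578 |
| Filling-in-the-blank | 10,913 | 3,638      | 3,638  | 18,189 |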
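
As a quick sanity check on the figures above, the following sketch sums the per-task totals and prints each sub-task's share of the dataset. The dictionary and variable names are illustrative only and are not part of the released code.

```python
# Per-task question counts copied from the sub-task list above.
task_totals = {
    "multi-image-choice": 57_672,
    "multi-text-choice": 31_578,
    "filling-in-the-blank": 18_189,
}

# Overall number of questions across the three sub-tasks.
total_questions = sum(task_totals.values())
print(f"total questions: {total_questions:,}")

# Fraction of the whole dataset contributed by each sub-task.
task_shares = {name: count / total_questions for name, count in task_totals.items()}
for name, share in task_shares.items():
    print(f"{name:<22} {share:6.1%}")
```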

In addition to IconQA, we also present Icon645, a large-scale dataset of icons that cover a wide range of objects:

  • 645,687 colored icons
  • 377 different icon classes

icon_examples

For more details, see our website and our paper.

Download

Our dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please read the license before you use, change, or share our dataset.

You can download IconQA here, or run the following commands:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/iconqa.zip
unzip iconqa.zip

You can download Icon645 here, or run the following commands:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/icon645.zip
unzip icon645.zip
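
If wget or unzip is not available, the same two steps can be scripted with the Python standard library. This is a minimal sketch, not part of the released code: the helper name `download_and_unzip` and its `data_dir` parameter are assumptions made for illustration, and only the two URLs come from the commands above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URLs taken from the wget commands above.
ICONQA_URL = "https://iconqa2021.s3.us-west-1.amazonaws.com/iconqa.zip"
ICON645_URL = "https://iconqa2021.s3.us-west-1.amazonaws.com/icon645.zip"


def download_and_unzip(url: str, data_dir: str = "data") -> Path:
    """Download the archive at `url` into `data_dir` and extract it there.

    Hypothetical helper: mirrors `cd data && wget URL && unzip FILE`.
    """
    target_dir = Path(data_dir)
    target_dir.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing

    # Name the local file after the last path component of the URL.
    archive_path = target_dir / url.rsplit("/", 1)[-1]
    if not archive_path.exists():  # reuse an archive that is already on disk
        urllib.request.urlretrieve(url, str(archive_path))

    # Extract next to the archive, as `unzip` does by default.
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    return archive_path
```

Calling `download_and_unzip(ICONQA_URL)` and then `download_and_unzip(ICON645_URL)` would fetch both archives into `data/`. The existence check only avoids repeating a download; it does not verify that an earlier download completed.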

File structure of the IconQA dataset:

IconQA
|   LICENSE.md
|   metadata.json
|   pid2skills.json
|   pid_splits.json
|   problems.json
|   skills.json
└───test
|   |
|   └───choose_img
|   |   |
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   |   |   choice_0.png
|   |   |   |   choice_1.png
|   |   |   |   ...
|   |   |
|   |   └───question_id
|   |   |   ...
|   |
|   └───choose_txt
|   |   |
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   |
|   |   └───question_id
|   |   |   ...
|   |
|   └───fill_in_blank
|       |
|       └───question_id
|       |   |   image.png
|       |   |   data.json
|       |
|       └───question_id
|       |   ...
|
└───train
|   |   same as test
|
└───val
    |   same as test
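
The layout above can be traversed with a few lines of Python. The sketch below is an illustration under stated assumptions, not the authors' loader: `iter_problems` is a hypothetical helper, the root path is whatever folder the archive was extracted to, and because the fields inside `data.json` are not described here, each file is returned as parsed JSON without interpreting it.

```python
import json
from pathlib import Path
from typing import Any, Iterator, Tuple

# Folder names as they appear in the file structure above.
SPLITS = ("train", "val", "test")
SUB_TASKS = ("choose_img", "choose_txt", "fill_in_blank")


def iter_problems(root: str) -> Iterator[Tuple[str, str, str, Any]]:
    """Yield (split, sub_task, question_id, data) for every question folder."""
    root_dir = Path(root)
    for split in SPLITS:
        for sub_task in SUB_TASKS:
            task_dir = root_dir / split / sub_task
            if not task_dir.is_dir():
                continue  # tolerate a partially extracted copy
            for question_dir in sorted(task_dir.iterdir()):
                data_file = question_dir / "data.json"
                if not data_file.is_file():
                    continue  # skip anything that is not a question folder
                with data_file.open(encoding="utf-8") as handle:
                    data = json.load(handle)  # contents left uninterpreted
                yield split, sub_task, question_dir.name, data
```

For example, `collections.Counter((split, task) for split, task, _, _ in iter_problems(root))` would tally questions per split and sub-task, which can be compared with the table of sub-task sizes.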

File structure of the Icon645 dataset:

Icon645
|   LICENCE.md
|   metadata.json
└───colored_icons_final
    |
    └───acorn
    |   |   image_id1.png
    |   |   image_id2.png
    |   |   ...
    |   
    └───airplane
    |   |   image_id3.png
    |   |   ...
    |      
    |   ...
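
Since each icon class is a folder under `colored_icons_final`, class labels can be recovered from the directory names. The following sketch counts the PNG files per class; `count_icons_per_class` is a hypothetical helper written for illustration, and the root path passed to it is an assumption about where the archive was extracted.

```python
from collections import Counter
from pathlib import Path


def count_icons_per_class(root: str) -> Counter:
    """Map each class folder under colored_icons_final to its number of PNG files."""
    icons_dir = Path(root) / "colored_icons_final"
    counts = Counter()
    for class_dir in sorted(icons_dir.iterdir()):
        if class_dir.is_dir():  # one folder per icon class, e.g. acorn, airplane
            counts[class_dir.name] = sum(1 for _ in class_dir.glob("*.png"))
    return counts
```

On a complete copy of Icon645, `len(counts)` and `sum(counts.values())` would be expected to match the 377 classes and 645,687 icons stated above.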

Citation

If the paper or the dataset inspires you, please cite us:

@inproceedings{lu2021iconqa,
  title = {IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning},
  author = {Lu, Pan and Qiu, Liang and Chen, Jiaqi and Xia, Tony and Zhao, Yizhou and Zhang, Wei and Yu, Zhou and Liang, Xiaodan and Zhu, Song-Chun},
  booktitle = {Submitted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks},
  year = {2021}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

CC BY-NC-SA 4.0
