A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.

Last update: Dec 30, 2022

IconQA

About

IconQA is a new, diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and comprehensive cognitive reasoning in real-world problems.

There are three different sub-tasks in IconQA:

57,672 multiple-choice (MC) questions with image choices
31,578 MC questions with text choices
18,189 fill-in-the-blank questions

Sub-Tasks	Train	Validation	Test	Total
Multi-image-choice	34,603	11,535	11,535	57,672
Multi-text-choice	18,946	6,316	6,316	31,578
Filling-in-the-blank	10,913	3,638	3,638	18,189

In addition to IconQA, we also present Icon645, a large-scale dataset of icons that cover a wide range of objects:

645,687 colored icons
377 different icon classes

For more details, you can find our website here and our paper here.

Download

Our dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please read the license before you use, change, or share our dataset.

You can download IconQA here, or run the following commands:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/iconqa.zip
unzip iconqa.zip

You can download Icon645 here, or run the following commands:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/icon645.zip
unzip icon645.zip
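
Where `wget` and `unzip` are not available, the same download-and-extract steps can be approximated in Python. This is a minimal sketch using only the standard library; the function name `download_and_extract` and the default `data` directory are illustrative choices, not part of the dataset tooling.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, data_dir="data"):
    """Download a zip archive into data_dir, check it, and extract it there."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last path component of the URL.
    zip_path = data_dir / url.rsplit("/", 1)[-1]
    if not zip_path.exists():  # skip the download if the archive is already present
        urllib.request.urlretrieve(url, zip_path)

    with zipfile.ZipFile(zip_path) as archive:
        bad_member = archive.testzip()  # None when every member passes its CRC check
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member in {zip_path.name}: {bad_member}")
        archive.extractall(data_dir)
    return zip_path
```

For example, `download_and_extract("https://iconqa2021.s3.us-west-1.amazonaws.com/iconqa.zip")` mirrors the first set of commands above, and the same call with the `icon645.zip` URL mirrors the second.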

File structure of the IconQA dataset:

IconQA
|   LICENSE.md
|   metadata.json
|   pid2skills.json
|   pid_splits.json
|   problems.json
|   skills.json
└───test
|   |
|   └───choose_img
|   |   |
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   |   |   choice_0.png
|   |   |   |   choice_1.png
|   |   |   |   ...
|   |   |
|   |   └───question_id
|   |   |   ...
|   |   
|   └───choose_txt
|   |   |  
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   | 
|   |   └───question_id
|   |   |   ...
|   |
|   └───fill_in_blank
|       |  
|       └───question_id
|       |   |   image.png
|       |   |   data.json
|       | 
|       └───question_id
|       |   ...
|   
└───train
|   |   same as test
|   
└───val
    |   same as test
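
The layout above can be walked programmatically. The following is a minimal sketch, assuming only the directory and file names shown in the tree; the helper names `iter_problems` and `count_by_sub_task` are hypothetical, and the contents of `data.json` are returned untouched because their fields are not described here.

```python
import json
from pathlib import Path

# Sub-task directory names taken from the file tree above.
SUB_TASKS = ("choose_img", "choose_txt", "fill_in_blank")


def iter_problems(root, split):
    """Yield (sub_task, question_id, record) for every problem folder in a split.

    `root` is the extracted IconQA directory; `split` is "train", "val" or "test".
    The parsed data.json is returned as-is; no field names are assumed.
    """
    split_dir = Path(root) / split
    for sub_task in SUB_TASKS:
        task_dir = split_dir / sub_task
        if not task_dir.is_dir():
            continue  # tolerate partially extracted datasets
        for problem_dir in sorted(task_dir.iterdir()):
            data_file = problem_dir / "data.json"
            if not data_file.is_file():
                continue
            with data_file.open(encoding="utf-8") as f:
                record = json.load(f)
            yield sub_task, problem_dir.name, record


def count_by_sub_task(root, split):
    """Count problem folders per sub-task for one split."""
    counts = {name: 0 for name in SUB_TASKS}
    for sub_task, _, _ in iter_problems(root, split):
        counts[sub_task] += 1
    return counts
```

Calling `count_by_sub_task(root, "test")`, with `root` pointing at wherever the archive was extracted, gives per-sub-task counts that can be compared against the Test column of the table in the About section.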

File structure of the Icon645 dataset:

Icon645
|   LICENCE.md
|   metadata.json
└───colored_icons_final
    |
    └───acorn
    |   |   image_id1.png
    |   |   image_id2.png
    |   |   ...
    |   
    └───airplane
    |   |   image_id3.png
    |   |   ...
    |      
    |   ...
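
Since each icon class is a folder of PNG files, per-class counts follow directly from the tree. A minimal sketch, assuming only the layout shown above; `count_icons_per_class` is a hypothetical helper name.

```python
from collections import Counter
from pathlib import Path


def count_icons_per_class(icon_root):
    """Return a Counter mapping each class folder name to its number of .png files.

    `icon_root` is the path to the `colored_icons_final` directory.
    """
    counts = Counter()
    for class_dir in sorted(Path(icon_root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for _ in class_dir.glob("*.png"))
    return counts
```

On a complete download, `len(counts)` and `sum(counts.values())` can be checked against the 377 classes and 645,687 icons stated in the About section.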

Citation

If the paper or the dataset inspires you, please cite us:

@inproceedings{lu2021iconqa,
  title = {IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning},
  author = {Lu, Pan and Qiu, Liang and Chen, Jiaqi and Xia, Tony and Zhao, Yizhou and Zhang, Wei and Yu, Zhou and Liang, Xiaodan and Zhu, Song-Chun},
  booktitle = {Submitted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks},
  year = {2021}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Owner

Pan Lu
