Verbs in COCO (V-COCO) Dataset

This repository hosts the Verbs in COCO (V-COCO) dataset and associated code to evaluate models for the Visual Semantic Role Labeling (VSRL) task as described in this technical report.

Citing

If you find this dataset or code base useful in your research, please consider citing the following papers:

@article{gupta2015visual,
  title={Visual Semantic Role Labeling},
  author={Gupta, Saurabh and Malik, Jitendra},
  journal={arXiv preprint arXiv:1505.04474},
  year={2015}
}

@incollection{lin2014microsoft,
  title={Microsoft COCO: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={Computer Vision--ECCV 2014},
  pages={740--755},
  year={2014},
  publisher={Springer}
}

Installation

  1. Clone the repository (recursively, so as to include the COCO API).

    git clone --recursive https://github.com/s-gupta/v-coco.git
  2. This dataset builds on MS COCO; please download the MS COCO images and annotations.

  3. The current V-COCO release uses only a subset of MS COCO images (image IDs listed in data/splits/vcoco_all.ids). Use the following script to pick out the relevant annotations from the COCO annotation files, which allows faster loading in V-COCO.

    # Assume you cloned the repository to `VCOCO_DIR'
    cd $VCOCO_DIR
    # If you downloaded coco annotations to coco-data/annotations
    python script_pick_annotations.py coco-data/annotations
  4. Build coco/PythonAPI/pycocotools/_mask.so and cython_bbox.so (an optional import check follows the commands below).

    # Assume you cloned the repository to `VCOCO_DIR'
    cd $VCOCO_DIR/coco/PythonAPI/ && make
    cd $VCOCO_DIR && make
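
As an optional sanity check (an extra step, not part of the original instructions), you can confirm the COCO API built correctly by importing it from the PythonAPI directory:

    # Optional sanity check (not part of the original instructions): importing the COCO API
    # from $VCOCO_DIR/coco/PythonAPI exercises the freshly built _mask extension.
    from pycocotools.coco import COCO
    print('pycocotools imported successfully')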

Using the dataset

  1. An IPython notebook illustrating how to use the annotations in the dataset is available in V-COCO.ipynb; a short loading sketch also follows this list.
  2. The current release of the dataset includes annotations as indicated in Table 1 of the paper. We are collecting role annotations for the 6 missing categories and will make them public shortly.
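
For a quick start, here is a minimal loading sketch based on the helpers used in V-COCO.ipynb (vsrl_utils); the function and field names below follow the notebook and should be checked against it.

    # Minimal loading sketch, assuming the vsrl_utils helpers used in V-COCO.ipynb;
    # run from $VCOCO_DIR after completing the installation steps above.
    import vsrl_utils as vu

    coco = vu.load_coco()                        # COCO annotations for the V-COCO images
    vcoco_train = vu.load_vcoco('vcoco_train')   # V-COCO action annotations, train split
    # Attach ground-truth boxes to each action's annotations.
    vcoco_train = [vu.attach_gt_boxes(x, coco) for x in vcoco_train]
    print([x['action_name'] for x in vcoco_train])  # list the annotated actions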

Evaluation

We provide evaluation code that computes agent AP and role AP, as explained in the paper.

In order to use the evaluation code, store your predictions as a pickle file (.pkl) in the following format:

[ {'image_id':        # the COCO image id,
   'person_box':      # [x1, y1, x2, y2], the box prediction for the person,
   '[action]_agent':  # the score for the action, for this person prediction,
   '[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for the role and
                      # the associated score for the action-role pair.
   } ]
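
For concreteness, a minimal sketch of writing a detections file in this format is shown below; the action name 'hold', role name 'obj', image id, boxes, and scores are hypothetical placeholders, and boxes are stored as numpy arrays here only for convenience.

    # Minimal sketch of writing detections in the expected format; the action 'hold',
    # role 'obj', image id, boxes, and scores are hypothetical placeholders.
    import pickle
    import numpy as np

    detections = [{
        'image_id': 1,                                        # COCO image id
        'person_box': np.array([50., 40., 200., 300.]),       # [x1, y1, x2, y2] for the person
        'hold_agent': 0.8,                                     # score for the 'hold' action
        'hold_obj': np.array([210., 120., 260., 180., 0.7]),   # [x1, y1, x2, y2, s] for the role
    }]

    with open('detections.pkl', 'wb') as f:
        pickle.dump(detections, f)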

Assuming your detections are stored in det_file=/path/to/detections/detections.pkl, do

from vsrl_eval import VCOCOeval
vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
  # e.g. vsrl_annot_file: data/vcoco/vcoco_val.json
  #      coco_file:       data/instances_vcoco_all_2014.json
  #      split_file:      data/splits/vcoco_val.ids
vcocoeval._do_eval(det_file, ovr_thresh=0.5)

We introduce two scenarios for role AP evaluation.

  1. [Scenario 1] For the test cases with missing role annotations, an agent-role prediction is correct if the action is correct, the overlap between the person boxes is >0.5, and the predicted role box is empty, e.g. [0,0,0,0] or [NaN,NaN,NaN,NaN]. This scenario suits roles that are missing due to occlusion.

  2. [Scenario 2] For the test cases with missing role annotations, an agent-role prediction is correct if the action is correct and the overlap between the person boxes is >0.5 (the predicted role box is ignored). This scenario suits roles that fall outside the COCO categories. In both scenarios, the box overlap is the usual intersection over union (IoU); a small reference sketch follows this list.
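
For reference, here is a minimal IoU sketch for [x1, y1, x2, y2] boxes; it is not the evaluator's own code, and pixel +1 conventions are omitted for simplicity.

    # Minimal IoU sketch for [x1, y1, x2, y2] boxes; not the evaluator's own code,
    # and pixel +1 conventions are omitted for simplicity.
    def box_iou(a, b):
        # Intersection rectangle
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0., ix2 - ix1) * max(0., iy2 - iy1)
        # Union = area(a) + area(b) - intersection
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    print(box_iou([0., 0., 10., 10.], [5., 5., 15., 15.]))  # ~0.143, below the 0.5 threshold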

Owner
Saurabh Gupta