Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics

Overview

[AAAI2022] Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics

Overall pipeline of OCN.

Paper Link: [arXiv] [AAAI official paper]

If you find our work or the codebase inspiring and useful to your research, please cite

@article{yuan2022OCN_HOI,
  title={Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics},
  author={Yuan, Hangjie and Wang, Mang and Ni, Dong and Xu, Liangpeng},
  journal={arXiv preprint arXiv:2202.00259},
  year={2022}
}

Dataset preparation

1. HICO-DET

HICO-DET dataset can be downloaded here. After finishing downloading, unpack the tarball (hico_20160224_det.tar.gz) to the data directory.

Instead of using the original annotations files, we use the annotation files provided by the PPDM authors. The annotation files can be downloaded from here. The downloaded annotation files have to be placed as follows.

qpic
 |─ data
 │   └─ hico_20160224_det
 |       |─ annotations
 |       |   |─ trainval_hico.json
 |       |   |─ test_hico.json
 |       |   └─ corre_hico.npy
 :       :

2. V-COCO

First clone the repository of V-COCO from here, and then follow the instruction to generate the file instances_vcoco_all_2014.json. Next, download the prior file prior.pickle from here. Place the files and make directories as follows.

qpic
 |─ data
 │   └─ v-coco
 |       |─ data
 |       |   |─ instances_vcoco_all_2014.json
 |       |   :
 |       |─ prior.pickle
 |       |─ images
 |       |   |─ train2014
 |       |   |   |─ COCO_train2014_000000000009.jpg
 |       |   |   :
 |       |   └─ val2014
 |       |       |─ COCO_val2014_000000000042.jpg
 |       |       :
 |       |─ annotations
 :       :

For our implementation, the annotation file have to be converted to the HOIA format. The conversion can be conducted as follows.

PYTHONPATH=data/v-coco \
        python convert_vcoco_annotations.py \
        --load_path data/v-coco/data \
        --prior_path data/v-coco/prior.pickle \
        --save_path data/v-coco/annotations

Note that only Python2 can be used for this conversion because vsrl_utils.py in the v-coco repository shows a error with Python3.

V-COCO annotations with the HOIA format, corre_vcoco.npy, test_vcoco.json, and trainval_vcoco.json will be generated to annotations directory.

Dependencies and Training

To simplify the steps, we combine the installation of externel dependencies and training into one '.sh' file. You can directly run the codes after rightly preparing the dataset.

# Training on HICO-DET
bash train_hico.sh
# Training on V-COCO
bash train_vcoco.sh

Note that you can refer to the publicly available codebase for the preparation of two datasets.

Pre-trained parameters

OCN uses COCO pretrained models for fair comparisons with previous methods. The pretrained models can be downloaded from DETR repository.

For HICO-DET, you can convert the pre-trained parameters with the following command.

python convert_parameters.py \
        --load_path /PATH/TO/PRETRAIN \
        --save_path /PATH/TO/SAVE

For V-COCO, you can convert the pre-trained parameters with the following command.

python convert_parameters.py \
        --load_path /PATH/TO/PRETRAIN \
        --save_path /PATH/TO/SAVE \
        --dataset vcoco \

Evaluation

The mAP on HICO-DET under the Full set, Rare set and Non-Rare Set will be reported during the training process. Or you can evaluate the performance using commands below:

python main.py \
    --pretrained /PATH/TO/PRETRAINED_MODEL \
    --output_dir /PATH/TO/OUTPUT \
    --hoi \
    --dataset_file hico \
    --hoi_path /PATH/TO/data/hico_20160224_det \
    --num_obj_classes 80 \
    --num_verb_classes 117 \
    --backbone resnet101 \
    --num_workers 4 \
    --batch_size 4 \
    --exponential_hyper 1 \
    --exponential_loss \
    --semantic_similar_coef 1 \
    --verb_loss_type focal \
    --semantic_similar \
    --OCN \
    --eval \

The results for the official evaluation of V-COCO must be obtained by the generated pickle file of detection results.

python generate_vcoco_official.py \
        --param_path /PATH/TO/CHECKPOINT \
        --save_path /PATH/TO/SAVE/vcoco.pickle \
        --hoi_path /PATH/TO/VCOCO/data/v-coco \
        --batch_size 4 \
        --OCN \

Then you should run following codes after modifying the path to get the final performance:

python datasets/vsrl_eval.py

Results

Below we present the results and links for downloading corresponding parameters and logs: (The checkpoints can produce higher results than what are reported in the paper.) We will soon update this table.

Owner
A Ph.D. candidate and a realistic idealist.
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
Notebook and code to synthesize complex and highly dimensional datasets using Gretel APIs.

Gretel Trainer This code is designed to help users successfully train synthetic models on complex datasets with high row and column counts. The code w

Gretel.ai 24 Nov 03, 2022
Pytorch implementation of Learning with Opponent-Learning Awareness

Pytorch implementation of Learning with Opponent-Learning Awareness using DiCE

Alexis David Jacq 82 Sep 15, 2022
Unofficial PyTorch Implementation of "Augmenting Convolutional networks with attention-based aggregation"

Pytorch Implementation of Augmenting Convolutional networks with attention-based aggregation This is the unofficial PyTorch Implementation of "Augment

DK 20 Sep 09, 2022
A library for building and serving multi-node distributed faiss indices.

About Distributed faiss index service. A lightweight library that lets you work with FAISS indexes which don't fit into a single server memory. It fol

Meta Research 170 Dec 30, 2022
An 16kHz implementation of HiFi-GAN for soft-vc.

HiFi-GAN An 16kHz implementation of HiFi-GAN for soft-vc. Relevant links: Official HiFi-GAN repo HiFi-GAN paper Soft-VC repo Soft-VC paper Example Usa

Benjamin van Niekerk 42 Dec 27, 2022
Tutorial page of the Climate Hack, the greatest hackathon ever

Tutorial page of the Climate Hack, the greatest hackathon ever

UCL Artificial Intelligence Society 12 Jul 02, 2022
Pytorch implementation of Bert and Pals: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning

PyTorch implementation of BERT and PALs Introduction Work by Asa Cooper Stickland and Iain Murray, University of Edinburgh. Code for BERT and PALs; mo

Asa Cooper Stickland 70 Dec 29, 2022
PyTorch version implementation of DORN

DORN_PyTorch This is a PyTorch version implementation of DORN Reference H. Fu, M. Gong, C. Wang, K. Batmanghelich and D. Tao: Deep Ordinal Regression

Zilin.Zhang 3 Apr 27, 2022
A facial recognition doorbell system using a Raspberry Pi

Facial Recognition Doorbell This project expands on the person-detecting doorbell system to allow it to identify faces, and announce names accordingly

rydercalmdown 22 Apr 15, 2022
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading

A tour through tensorflow with financial data I present several models ranging in complexity from simple regression to LSTM and policy networks. The s

195 Dec 07, 2022
Repository of continual learning papers

Continual learning paper repository This repository contains an incomplete (but dynamically updated) list of papers exploring continual learning in ma

29 Jan 05, 2023
Neural implicit reconstruction experiments for the Vector Neuron paper

Neural Implicit Reconstruction with Vector Neurons This repository contains code for the neural implicit reconstruction experiments in the paper Vecto

Congyue Deng 35 Jan 02, 2023
Code for "AutoMTL: A Programming Framework for Automated Multi-Task Learning"

AutoMTL: A Programming Framework for Automated Multi-Task Learning This is the website for our paper "AutoMTL: A Programming Framework for Automated M

Ivy Zhang 40 Dec 04, 2022
Code for "Multi-Compound Transformer for Accurate Biomedical Image Segmentation"

News The code of MCTrans has been released. if you are interested in contributing to the standardization of the medical image analysis community, plea

97 Jan 05, 2023
👨‍💻 run nanosaur in simulation with Gazebo/Ingnition

🦕 👨‍💻 nanosaur_gazebo nanosaur The smallest NVIDIA Jetson dinosaur robot, open-source, fully 3D printable, based on ROS2 & Isaac ROS. Designed & ma

nanosaur 9 Jul 19, 2022
Learning Calibrated-Guidance for Object Detection in Aerial Images

Learning Calibrated-Guidance for Object Detection in Aerial Images arxiv We propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance

51 Sep 22, 2022
Adds timm pretrained backbone to pytorch's FasterRcnn model

Operating Systems Lab (ETCS-352) Experiments for Operating Systems Lab (ETCS-352) performed by me in 2021 at uni. All codes are written by me except t

Mriganka Nath 12 Dec 03, 2022
An experimentation and research platform to investigate the interaction of automated agents in an abstract simulated network environments.

CyberBattleSim April 8th, 2021: See the announcement on the Microsoft Security Blog. CyberBattleSim is an experimentation research platform to investi

Microsoft 1.5k Dec 25, 2022
Bagua is a flexible and performant distributed training algorithm development framework.

Bagua is a flexible and performant distributed training algorithm development framework.

786 Dec 17, 2022