Tensorflow AffordanceNet and AffContext implementations

Overview

AffordanceNet and AffContext

This is tensorflow AffordanceNet and AffContext implementations. Both are implemented and tested with tensorflow 2.3.

The main objective of both architectures is to identify action affordances, so that they can be used in real robotic applications to understand the diverse objects present in the environment.

Both models have been trained on IIT-AFF and UMD datasets.

Detections on novel image

Novel image

Example of ground truth affordances compared with the affordance detection results by AffordanceNet and AffContext on the IIT-AFF dataset.

IIT results

IIT colours

Example of ground truth affordances compared with the affordance detection results by AffordanceNet and AffContext on the UMD dataset.

UMD results

UMD colours

AffordanceNet simultaneously detects multiple objects with their corresponding classes and affordances. This network mainly consists of two branches: an object detection branch to localise and classify the objects in the image, and an affordance detection branch to predict the most probable affordance label for each pixel in the object.

AffordanceNet

AffContext correctly predicts the pixel-wise affordances independently of the class of the object, which allows to infer the affordances for unseen objects. The structure of this network is similar to AffordanceNet, but the object detection branch only performs binary classification into foreground and background areas, and it includes two new blocks: an auxiliary task to infer the affordances in the region and a self-attention mechanism to capture rich contextual dependencies through the region.

AffContext

Results

The results of the tensorflow implementation are contrasted with the values provided in the papers from AffordanceNet and AffContext. However, since the procedure of how the results are processed to obtain the final metrics in both networks may be different, the results are also compared with the values obtained by running the original trained models, but processing the outputs and calculating the measures with the code from this repository. These results are denoted with * in the comparison tables.

Affordances AffordanceNet
(Caffe)
AffordanceNet* AffordanceNet
(tf)
contain 79.61 73.68 74.17
cut 75.68 64.71 66.97
display 77.81 82.81 81.84
engine 77.50 81.09 82.63
grasp 68.48 64.13 65.49
hit 70.75 82.13 83.25
pound 69.57 65.90 65.73
support 69.57 74.43 75.26
w-grasp 70.98 77.63 78.45
Average 73.35 74.06 74.87
Affordances AffContext
(Caffe)
AffContext* AffContext
(tf)
grasp 0.60 0.51 0.55
cut 0.37 0.31 0.26
scoop 0.60 0.52 0.52
contain 0.61 0.55 0.57
pound 0.80 0.68 0.64
support 0.88 0.69 0.21
w-grasp 0.94 0.88 0.85
Average 0.69 0.59 0.51

Setup guide

Requirements

  • Python 3
  • CUDA 10.1

Installation

  1. Clone the repository into your $AffordanceNet_ROOT folder.

  2. Install the required Python3 packages with: pip3 install -r requirements.txt

Testing

  1. Download the pretrained weights:

    • AffordanceNet weights trained on IIT-AFF dataset.
    • AffContext weights trained on UMD dataset.
  2. Extract the file into $AffordanceNet_ROOT/weights folder.

  3. Visualize results for AffordanceNet trained on IIT-AFF dataset:

python3 affordancenet_predictor.py --config_file config_iit_test
  1. Visualize results for AffContext trained on UMD dataset:
python3 affcontext_predictor.py --config_file config_umd_test

Training

  1. Download the IIT-AFF or UMD datasets in Pascal-VOC format following the instructions in AffordanceNet (IIT-AFF) and AffContext(UMD).

  2. Extract them into the $AffordanceNet_ROOT/data folder and make sure to have the following folder structure for IIT-AFF dataset:

    • cache/
    • VOCdevkit2012/

The same applies for UMD dataset, but folder names should be cache_UMD and VOCdevkit2012_UMD

  1. Run the command to train AffordanceNet on IIT-AFF dataset:
python3 affordancenet_trainer.py --config_file config_iit_train
  1. Run the command to train AffContext on UMD dataset:
python3 affcontext_trainer.py --config_file config_umd_train

Acknowledgements

This repo used source code from AffordanceNet and Faster-RCNN

Owner
Beatriz Pérez
MSc student in Computer Science at Universität Bonn, Germany. Computer Engineer from Universidad de Zaragoza, Spain.
Beatriz Pérez
CS583: Deep Learning

CS583: Deep Learning

Shusen Wang 2.6k Dec 30, 2022
UPSNet: A Unified Panoptic Segmentation Network

UPSNet: A Unified Panoptic Segmentation Network Introduction UPSNet is initially described in a CVPR 2019 oral paper. Disclaimer This repository is te

Uber Research 622 Dec 26, 2022
Official repository for Jia, Raghunathan, Göksel, and Liang, "Certified Robustness to Adversarial Word Substitutions" (EMNLP 2019)

Certified Robustness to Adversarial Word Substitutions This is the official GitHub repository for the following paper: Certified Robustness to Adversa

Robin Jia 38 Oct 16, 2022
Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker This is a full project of image segmentation using the model built with

Htin Aung Lu 1 Jan 04, 2022
classification task on dataset-CIFAR10,by using Tensorflow/keras

CIFAR10-Tensorflow classification task on dataset-CIFAR10,by using Tensorflow/keras 在这一个库中,我使用Tensorflow与keras框架搭建了几个卷积神经网络模型,针对CIFAR10数据集进行了训练与测试。分别使

3 Oct 17, 2021
A Framework for Encrypted Machine Learning in TensorFlow

TF Encrypted is a framework for encrypted machine learning in TensorFlow. It looks and feels like TensorFlow, taking advantage of the ease-of-use of t

TF Encrypted 0 Jul 06, 2022
🔊 Audio and fastai v2

Fastaudio An audio module for fastai v2. We want to help you build audio machine learning applications while minimizing the need for audio domain expe

152 Dec 28, 2022
Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

41 Jan 03, 2023
A (PyTorch) imbalanced dataset sampler for oversampling low frequent classes and undersampling high frequent ones.

Imbalanced Dataset Sampler Introduction In many machine learning applications, we often come across datasets where some types of data may be seen more

Ming 2k Jan 08, 2023
[ICML 2021] “ Self-Damaging Contrastive Learning”, Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Self-Damaging Contrastive Learning Introduction The recent breakthrough achieved by contrastive learning accelerates the pace for deploying unsupervis

VITA 51 Dec 29, 2022
This repo generates the training data and the model for Morpheus-Deblend

Morpheus-Deblend This repo generates the training data and the model for Morpheus-Deblend. This is the active development repo for the project and as

Ryan Hausen 2 Apr 18, 2022
yolov5 deepsort 行人 车辆 跟踪 检测 计数

yolov5 deepsort 行人 车辆 跟踪 检测 计数 实现了 出/入 分别计数。 默认是 南/北 方向检测,若要检测不同位置和方向,可在 main.py 文件第13行和21行,修改2个polygon的点。 默认检测类别:行人、自行车、小汽车、摩托车、公交车、卡车。 检测类别可在 detect

554 Dec 30, 2022
Justmagic - Use a function as a method with this mystic script, like in Nim

justmagic Use a function as a method with this mystic script, like in Nim. Just

witer33 8 Oct 08, 2022
Official PyTorch implementation of the paper "Graph-based Generative Face Anonymisation with Pose Preservation" in ICIAP 2021

Contents AnonyGAN Installation Dataset Preparation Generating Images Using Pretrained Model Train and Test New Models Evaluation Acknowledgments Citat

Nicola Dall'Asen 10 May 24, 2022
[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

MDCA Calibration 21 Dec 22, 2022
DISTIL: Deep dIverSified inTeractIve Learning.

DISTIL: Deep dIverSified inTeractIve Learning. An active/inter-active learning library built on py-torch for reducing labeling costs.

decile-team 110 Dec 06, 2022
Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'

Argument Extraction by Generation Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21' Dependencies pytorch=1.6 tr

Zoey Li 87 Dec 26, 2022
A simple Rock-Paper-Scissors game using CV in python

ML18_Rock-Paper-Scissors-using-CV A simple Rock-Paper-Scissors game using CV in python For IITISOC-21 Rules and procedure to play the interactive game

Anirudha Bhagwat 3 Aug 08, 2021
Official implementation of the paper "Steganographer Detection via a Similarity Accumulation Graph Convolutional Network"

SAGCN - Official PyTorch Implementation | Paper | Project Page This is the official implementation of the paper "Steganographer detection via a simila

ZHANG Zhi 1 Nov 26, 2021
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss This is official implement of "

程星 87 Dec 24, 2022