Framework for training options with different attention mechanisms and using them to solve downstream tasks.

Overview

Using Attention in HRL

Requirements

A GPU is required. Create the conda environment:

conda env create -f conda_env.yml

After the installation finishes, activate the environment and install the remaining dependencies (e.g. the sub-module gym_minigrid, which is a modified version of MiniGrid):

conda activate affenv
cd gym-minigrid
pip install -e .
cd ../
pip install -e .

Instructions

To train the options and IC_net, follow these steps:

1. Configure the desired environment (number of tasks and objects per task) in the file configs/op_ic_net.yaml. E.g.:
  env_args:
    task_size: 3
    num_tasks: 4

2. Configure the desired type of attention (one of "affordance", "interest", "nan") in the file configs/op_ic_net.yaml. E.g.:
main:
  attention: "affordance" 

3. Train by running the command:
liftoff train_main.py configs/op_ic_net.yaml
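
A misspelled attention type would typically only surface once training has started, so a small pre-flight check can be useful. The following is a minimal sketch and not part of the repository: the helper name `check_attention` is hypothetical, and the only facts taken from the steps above are the three allowed values.

```python
# Hypothetical helper, not part of the repository.
# The allowed values are the three listed in step 2.
ALLOWED_ATTENTION = ("affordance", "interest", "nan")


def check_attention(value):
    """Return the attention type unchanged if it is one of the documented options."""
    if value not in ALLOWED_ATTENTION:
        raise ValueError(
            f"attention must be one of {ALLOWED_ATTENTION}, got {value!r}"
        )
    return value


# This dict mirrors the `main:` fragment shown in step 2.
main_cfg = {"attention": "affordance"}
print(check_attention(main_cfg["attention"]))
```

The same check would apply to any other key that takes one of these three values.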

Once a pre-trained option checkpoint exists, an HRL-agent can be trained to solve the downstream task (on the same environment the options were trained on). To train an HRL-agent with different types of attention, follow these steps:

1. Configure the checkpoint (experiment config file and options_model_id) of the pre-trained options and IC_net in the file configs/hrl-agent.yaml. E.g.:

main:
  options_model_cfg: "results/op_aff_4x3/0000_multiobj/0/cfg.yaml"
  options_model_id: -1  # Last checkpoint will be used

2. Configure the type of attention used for training the HRL-agent (one of "affordance", "interest", "nan") in the file configs/hrl-agent.yaml. E.g.:
main:
  modulate_policy: affordance

3. Train the HRL-agent by running the command:
liftoff train_mtop_ppo.py configs/hrl-agent.yaml
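
Since the HRL-agent depends on an earlier pre-training run, it may help to confirm that the referenced experiment config exists before launching. The sketch below is an illustration under assumptions: `check_hrl_main` is a hypothetical helper, the dict mirrors the `main:` fragments from steps 1 and 2, and the path is assumed to be relative to the repository root.

```python
from pathlib import Path

# The three values listed in step 2.
ALLOWED_POLICIES = ("affordance", "interest", "nan")


def check_hrl_main(main_cfg, root="."):
    """Collect problems found in the `main:` section of the HRL-agent config.

    Hypothetical helper, not part of the repository. `root` is assumed to be
    the directory that `options_model_cfg` is relative to.
    """
    problems = []
    cfg_path = Path(root) / main_cfg["options_model_cfg"]
    if not cfg_path.is_file():
        problems.append(f"options_model_cfg not found: {cfg_path}")
    if main_cfg["modulate_policy"] not in ALLOWED_POLICIES:
        problems.append(f"modulate_policy must be one of {ALLOWED_POLICIES}")
    return problems


main_cfg = {
    "options_model_cfg": "results/op_aff_4x3/0000_multiobj/0/cfg.yaml",
    "options_model_id": -1,  # last checkpoint, as in step 1
    "modulate_policy": "affordance",
}
# An empty list means no problems were found.
print(check_hrl_main(main_cfg))
```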

Both training scripts write their outputs to the results folder, including train/eval logs and checkpoints. Live plotting is integrated through Wandb; to use it, enable plotting in the config file (main:plot) and either log in to Wandb or place your API key in the file .wandb_key.

Training progress is also printed to the console in the following form:

  • Option Pre-training e.g.:
U 11 | F 022528 | FPS 0024 | D 402 | rR:u, 0.03 | F:u, 41.77 | tL:u 0.00 | tPL:u 6.47 | tNL:u 0.00 | t 52 | aff_loss 0.0570 | aff 2.8628 | NOaff 0.0159 | ic 0.0312 | cnt_ic 1.0000 | oe 2.4464 | oic0 0.0000 | oic1 0.0000 | oic2 0.0000 | oic3 0.0000 | oPic0 0.0000 | oPic1 0.0000 | oPic2 0.0000 | oPic3 0.0000 | icB 0.0208 | PicB 0.1429 | icND 0.0192

Some of the log entries decode as follows:

F - number of frames (steps in the env)
tL - termination loss
aff_loss - IC_net loss
cnt_ic - Intent completion per training batch
oicN - Intent completion fraction for option N, out of all samples of option N
oPicN - Intent completion fraction for option N, out of the affordable samples
PicB - Intent completion averaged over all options, out of the affordable samples
  • HRL-agent training e.g.:
U 1 | F 4555192.0 | FPS 21767 | D 209 | rR:u, 0.00 | F:u, 8.11 | e:u, 2.48 | v:u 0.00 | pL:u 0.01 | vL:u 0.00 | g:u 0.01 | TrR:u, 0.00

Some of the log entries decode as follows:

F - number of frames (steps in the env, offset by the number of pre-training steps)
rR - Accumulated episode reward average
TrR - Average episode success rate
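
Both console formats above use the same `name value` fields separated by `|`, so they can be turned into dictionaries for offline analysis. The following is a minimal sketch, not part of the repository: `parse_log_line` is a hypothetical helper, and it assumes every field ends in a numeric value, as in the two example lines. Names are kept verbatim (including suffixes such as `:u`) so that `F` and `F:u` stay distinct.

```python
def parse_log_line(line):
    """Split one console log line into a {name: float} dictionary."""
    entries = {}
    for field in line.split("|"):
        field = field.strip()
        if not field:
            continue
        # The last whitespace-separated token is the value; the rest is the name.
        name, value = field.rsplit(None, 1)
        # Some names carry a trailing comma (e.g. "rR:u,"); drop it.
        entries[name.rstrip(",")] = float(value)
    return entries


line = "U 1 | F 4555192.0 | FPS 21767 | D 209 | rR:u, 0.00 | F:u, 8.11 | TrR:u, 0.00"
stats = parse_log_line(line)
print(stats["F"], stats["F:u"], stats["TrR:u"])
```

A field without a value would raise a ValueError here, which seems preferable to silently skipping malformed output.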

Framework structure

The code is organised as follows:

  • agents/ - implementation of agents (e.g. training options and IC_net: multistep_affordance.py; HRL-agent PPO: ppo_smdp.py)
  • configs/ - config files for training agents
  • gym-minigrid/ - sub-module - Minigrid envs
  • models/ - neural network modules (e.g. options with IC_net: aff_multistep.py; CNN backbone: extractor_cnn_v2.py)
  • utils/ - Scripts for e.g.: running envs in parallel, preprocessing observations, gym wrappers, data structures, logging modules
  • train_main.py - Train Options with IC_net
  • train_mtop_ppo.py - Train HRL-agent

Acknowledgements

We used PyTorch as a machine learning framework.

We used liftoff for experiment management.

We used wandb for plotting.

We used PPO adapted for training our agents.

We used MiniGrid to create our environment.
