PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

Last update: Dec 08, 2022

Related tags

Deep Learning idaac

Overview

IDAAC: Invariant Decoupled Advantage Actor-Critic

This is a PyTorch implementation of the methods proposed in

Decoupling Value and Policy for Generalization in Reinforcement Learning by

Roberta Raileanu and Rob Fergus.

Citation

If you use this code in your own work, please cite our paper:

@article{Raileanu2021DecouplingVA,
  title={Decoupling Value and Policy for Generalization in Reinforcement Learning},
  author={Roberta Raileanu and R. Fergus},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.10330}
}

Requirements

To install all the required dependencies:

conda create -n idaac python=3.7
conda activate idaac

cd idaac
pip install -r requirements.txt

pip install procgen

git clone https://github.com/openai/baselines.git
cd baselines 
python setup.py install

Instructions

This repo provides instructions for training IDAAC, DAAC, and PPO on the Procgen benchmark.

Train IDAAC on CoinRun

python train.py --env_name coinrun --algo idaac

Train DAAC on CoinRun

python train.py --env_name coinrun --algo daac

Train PPO on CoinRun

python train.py --env_name coinrun --algo ppo --ppo_epoch 3

Note: The default code uses the same set of hyperparameters (HPs) for all environments, which are the best ones overall. In our studies, we've found some of the games can further benefit from slightly different HPs, so we provide those as well. To use the best hyperparameters for each environment, use the flag --use_best_hps.

Overview of DAAC and IDAAC

Procgen Results

IDAAC achieves state-of-the-art performance on the Procgen benchmark (easy mode), significantly improving the agent's generalization ability over standard RL methods such as PPO.

Test Results on Procgen

Acknowledgements

This code was based on an open sourced PyTorch implementation of PPO.

PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

Related tags

Overview

IDAAC: Invariant Decoupled Advantage Actor-Critic

Citation

Requirements

Instructions

Train IDAAC on CoinRun

Train DAAC on CoinRun

Train PPO on CoinRun

Overview of DAAC and IDAAC

Procgen Results

Acknowledgements

Owner

CS_Final_Metal_surface_detection - This is a final project for CoderSchool Machine Learning bootcamp on 29/12/2021.

Pytorch implementation of Depth-conditioned Dynamic Message Propagation forMonocular 3D Object Detection

A treasure chest for visual recognition powered by PaddlePaddle

Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

A Pytorch implementation of CVPR 2021 paper "RSG: A Simple but Effective Module for Learning Imbalanced Datasets"

Apollo optimizer in tensorflow

A curated list of awesome deep long-tailed learning resources.

Head and Neck Tumour Segmentation and Prediction of Patient Survival Project

Official page of Patchwork (RA-L'21 w/ IROS'21)

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 2021)

A framework for joint super-resolution and image synthesis, without requiring real training data

FedJAX is a library for developing custom Federated Learning (FL) algorithms in JAX.

Discriminative Condition-Aware PLDA

[ICSE2020] MemLock: Memory Usage Guided Fuzzing

The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

learning and feeling SLAM together with hands-on-experiments

This is the reference implementation for "Coresets via Bilevel Optimization for Continual Learning and Streaming"

PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation.

Asymmetric Bilateral Motion Estimation for Video Frame Interpolation, ICCV2021