Learning Off-Policy with Online Planning, CoRL 2021

Last update: Nov 22, 2022


Overview

LOOP: Learning Off-Policy with Online Planning

Accepted at the Conference on Robot Learning (CoRL) 2021.

Harshit Sikchi, Wenxuan Zhou, David Held

Paper

Install

PyTorch 1.5
OpenAI Gym
MuJoCo
tqdm
D4RL dataset
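
As a quick sanity check before running anything, the sketch below reports which of the listed dependencies cannot be found by the Python import system. The import names (in particular `mujoco_py` for MuJoCo and `d4rl` for the D4RL dataset) are assumptions, since the list above names packages rather than modules.

```python
import importlib.util

# Import names are assumptions: the install list names packages, not modules.
ASSUMED_MODULES = {
    "PyTorch": "torch",
    "OpenAI Gym": "gym",
    "MuJoCo": "mujoco_py",
    "tqdm": "tqdm",
    "D4RL": "d4rl",
}


def find_missing(modules):
    """Return the labels whose module cannot be located by the import system."""
    return [
        label
        for label, name in modules.items()
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = find_missing(ASSUMED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that a module can be located; it does not verify versions such as PyTorch 1.5.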

File Structure

LOOP (Core method)
- Training code (Online RL): train_loop_sac.py
- Training code (Offline RL): train_loop_offline.py
- Training code (safe RL): train_loop_safety.py
- Policies (online/offline/safety): policies.py
- ARC/H-step lookahead policy: controllers/
Environments: envs/
Configurations: configs/

Instructions

All the experiments are to be run under the root folder.
Config files in configs/ are used to specify hyperparameters for controllers and dynamics.
To reproduce the results in our paper, please keep all other values in the yml files consistent with the hyperparameters given in the paper.
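
Because the scripts are meant to be launched from the root folder, a small check like the following can confirm that the current directory looks like the repository root. It is only a sketch: the expected entries are taken from the File Structure section, and the helper name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Entries listed in the File Structure section of this README.
EXPECTED = [
    "train_loop_sac.py",
    "train_loop_offline.py",
    "train_loop_safety.py",
    "policies.py",
    "controllers",
    "envs",
    "configs",
]


def missing_entries(root="."):
    """Return the expected files or folders that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Not in the repository root? Missing:", ", ".join(missing))
    else:
        print("Repository layout looks complete.")
```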

Experiments

Sec 6.1 LOOP for Online RL

python train_loop_sac.py --env=<env_name> --policy=LOOP_SAC_ARC --start_timesteps=<initial exploration steps> --exp_name=<location_to_logs>
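
For scripted runs, the command above can be assembled programmatically. The sketch below builds the argument list from the flags shown in this README; the environment name, step count, and log path in the usage section are placeholder values, not settings from the paper.

```python
import shlex


def build_online_command(env_name, start_timesteps, exp_name):
    """Assemble the argv list for train_loop_sac.py using the README's flags."""
    return [
        "python",
        "train_loop_sac.py",
        f"--env={env_name}",
        "--policy=LOOP_SAC_ARC",
        f"--start_timesteps={start_timesteps}",
        f"--exp_name={exp_name}",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute the environment and settings you need.
    cmd = build_online_command("HalfCheetah-v2", 10000, "logs/loop_sac")
    print(shlex.join(cmd))
    # To launch from the repository root instead of printing:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if the log path contains spaces.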

Environment wrappers, along with their termination conditions, can be found under envs/

Sec 6.2 LOOP for Offline RL

Download the CRR-trained models from Link and place them in the root folder.

python train_loop_offline.py --env=<env_name> --policy=LOOP_OFFLINE_ARC --exp_name=<location_to_logs> --offline_algo=CRR --prior_type=CRR

Currently, only the D4RL MuJoCo locomotion tasks are supported.
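
To queue the offline experiment over several datasets, one option is to generate a command line per environment, each with its own log folder. This is a minimal sketch: the dataset names follow D4RL's naming scheme, but whether the script accepts these exact strings is an assumption, as is the `logs/<env>` layout.

```python
def offline_commands(env_names, log_root="logs"):
    """Yield one train_loop_offline.py command string per environment."""
    for env in env_names:
        yield (
            f"python train_loop_offline.py --env={env} "
            f"--policy=LOOP_OFFLINE_ARC --exp_name={log_root}/{env} "
            "--offline_algo=CRR --prior_type=CRR"
        )


if __name__ == "__main__":
    # Assumed example dataset names; check them against your D4RL install.
    for line in offline_commands(["hopper-medium-v0", "walker2d-medium-v0"]):
        print(line)
```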

Sec 6.3 LOOP for Safe RL

python train_loop_safety.py --env=<env_name> --policy=safeLOOP_ARC --exp_name=<location_to_logs>

Safety environments can be found under envs/safety_envs.py

References

Parts of the code are taken from the references below:

@article{SpinningUp2018,
    author = {Achiam, Joshua},
    title = {{Spinning Up in Deep Reinforcement Learning}},
    year = {2018}
}

https://github.com/Xingyu-Lin/mbpo_pytorch

Comments

Environment reproducibility

Hi, I am trying to run your code with the newest versions of the required packages, but I keep running into errors. For example, mpi4py does not install correctly in my environment.

Could you provide a requirements.txt file so that I can create a Python virtual environment with the dependencies needed to run the code? Alternatively, a container image (e.g., Docker) would also be great!

opened by pranjaldhole


Releases(v0.0.0)

v0.0.0(Aug 27, 2022)

Owner

Harshit Sikchi
