Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Last update: Nov 23, 2022

Overview

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble

This is the code for reproducing the results of the paper "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble", accepted at NeurIPS 2021.

This code builds on the official code of Reset-Free Lifelong Learning with Skill-Space Planning, which was originally derived from rlkit.

If you find this repository useful for your research, please cite:

@inproceedings{
    an2021edac,
    title={Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble},
    author={Gaon An and Seungyong Moon and Jang-Hyun Kim and Hyun Oh Song},
    booktitle={Neural Information Processing Systems},
    year={2021}
}

Requirements

To install all the required dependencies:

1. Install the MuJoCo engine, which can be downloaded from here.
2. Install the Python packages listed in requirements.txt using pip. Set the versions of mujoco-py and dm_control in requirements.txt to match the MuJoCo version you have installed:
   - MuJoCo 2.0: mujoco-py<2.1,>=2.0, dm_control==0.0.364896371
   - MuJoCo 2.1.0: mujoco-py<2.2,>=2.1, dm_control==0.0.403778684
   - MuJoCo 2.1.1: to be updated
3. Manually download and install the d4rl package from here. Before installing it, remove the lines that mention dm_control from its setup.py.
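
The version list above can also be kept as a small lookup, which may help when preparing requirements.txt on several machines. This is only a sketch: the names `PINS` and `pins_for` are hypothetical and not part of this repository, and only the version strings are taken from the list above.

```python
# Hypothetical helper (not part of this repository): maps an installed
# MuJoCo version to the pip requirement specifiers listed in this README.
PINS = {
    "2.0": ["mujoco-py<2.1,>=2.0", "dm_control==0.0.364896371"],
    "2.1.0": ["mujoco-py<2.2,>=2.1", "dm_control==0.0.403778684"],
}


def pins_for(mujoco_version):
    """Return the requirement lines for a MuJoCo version, or raise if unlisted."""
    if mujoco_version not in PINS:
        # MuJoCo 2.1.1 is marked "to be updated" above, so it lands here too.
        raise ValueError("No pinned versions listed for MuJoCo %r" % mujoco_version)
    return list(PINS[mujoco_version])


if __name__ == "__main__":
    for line in pins_for("2.1.0"):
        print(line)
```

Unlisted versions raise an error rather than falling back to a guess, since the README does not yet give pins for them.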

Here is an example of how to install all the dependencies on Ubuntu:

conda create -n edac python=3.7
conda activate edac
# Specify versions of mujoco-py and dm_control in requirements.txt
pip install --no-cache-dir -r requirements.txt

cd .
git clone https://github.com/rail-berkeley/d4rl.git

cd d4rl
# Remove lines including 'dm_control' in setup.py
pip install -e .
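
The manual step of removing the dm_control lines from d4rl's setup.py could be scripted along these lines. This is a minimal sketch, not part of this repository: the function names are hypothetical, the sample text is made up for illustration, and the filter simply drops every line containing the string dm_control, so the patched file should be inspected before running pip install.

```python
from pathlib import Path


def strip_dm_control(setup_text):
    """Drop every line that mentions 'dm_control'; keep all other lines as-is."""
    kept = [
        line
        for line in setup_text.splitlines(keepends=True)
        if "dm_control" not in line
    ]
    return "".join(kept)


def patch_setup_file(path="setup.py"):
    """Rewrite the given file in place without its dm_control lines."""
    setup_path = Path(path)
    setup_path.write_text(strip_dm_control(setup_path.read_text()))


if __name__ == "__main__":
    # Made-up sample text, not the real contents of d4rl's setup.py.
    sample = "install_requires=[\n    'gym',\n    'dm_control',\n    'h5py',\n]\n"
    print(strip_dm_control(sample))
```

Calling `patch_setup_file()` from inside the cloned d4rl directory would apply the same filter to the real file.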

Reproducing the results

Gym

To reproduce SAC-N results for MuJoCo Gym, run:

python -m scripts.sac --env_name [ENVIRONMENT] --num_qs [N]

To reproduce EDAC results for MuJoCo Gym, run:

python -m scripts.sac --env_name [ENVIRONMENT] --num_qs [N] --eta [ETA]
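
The two commands above differ only in the --eta flag, so a sweep over several environments can build both from one helper. The sketch below is illustrative: `build_command` is a hypothetical name, and the environment name and hyperparameter values are placeholders rather than the settings used in the paper.

```python
def build_command(env_name, num_qs, eta=None):
    """Return the argument list for one run.

    Leaving eta as None gives the SAC-N command; passing a value
    appends --eta and gives the EDAC command.
    """
    argv = [
        "python", "-m", "scripts.sac",
        "--env_name", env_name,
        "--num_qs", str(num_qs),
    ]
    if eta is not None:
        argv += ["--eta", str(eta)]
    return argv


if __name__ == "__main__":
    # Placeholder environment and values, chosen only for illustration.
    print(" ".join(build_command("halfcheetah-medium-v2", 10)))
    print(" ".join(build_command("halfcheetah-medium-v2", 10, eta=1.0)))
```

The returned list can be passed directly to `subprocess.run` from the repository root.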

Adroit

On Adroit tasks, we apply reward normalization for further training stability. For example, to reproduce the EDAC results for pen-human, run:

python -m scripts.sac --env_name pen-human-v1 --epoch 200 --num_qs 20 --plr 3e-5 --eta 1000 --reward_mean --reward_std

To reproduce the EDAC results for pen-cloned, run:

python -m scripts.sac --env_name pen-cloned-v1 --epoch 200 --num_qs 20 --plr 3e-5 --eta 10 --max_q_backup --reward_mean --reward_std
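
The README describes reward normalization only through its two flags. As a rough illustration, assuming that --reward_mean subtracts the dataset's mean reward and --reward_std divides by its standard deviation (an assumption about the flags that this README does not confirm), the transformation would look like the sketch below. The function name and the `eps` guard are hypothetical.

```python
import statistics


def normalize_rewards(rewards, use_mean=True, use_std=True, eps=1e-8):
    """Shift and/or scale a list of rewards using dataset-wide statistics.

    Assumed behavior: use_mean mirrors --reward_mean and use_std mirrors
    --reward_std. The eps term guards against a zero standard deviation.
    """
    shift = statistics.mean(rewards) if use_mean else 0.0
    scale = statistics.pstdev(rewards) + eps if use_std else 1.0
    return [(r - shift) / scale for r in rewards]


if __name__ == "__main__":
    print(normalize_rewards([1.0, 2.0, 3.0]))
```

With both options enabled the rewards end up with roughly zero mean and unit variance, which is the usual motivation for this kind of normalization.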

Acknowledgement

This work was supported in part by Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00882, (SW STAR LAB) Development of deployable learning intelligence via self-sustainable and trustworthy machine learning and No. 2019-0-01371, Development of brain-inspired AI with human-like intelligence), and Research Resettlement Fund for the new faculty of Seoul National University. This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-20-1-4043.

Comments

Performance on D4RL AntMaze tasks?

Hello, I have a question about the performance of SAC-N or EDAC on AntMaze tasks. Have you ever tested it?

In my experiments with this official implementation, the average evaluation return is always 0, which is worse than behavior cloning. I then tried a modified reward (r = 4*(r - 0.5)) and max Q backup, but neither helped.

I would greatly appreciate any advice. Thanks a lot.

Opened by yuxudong20

Releases(v1.0)

v1.0(Nov 21, 2021)
