Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning

Last update: Dec 28, 2022

Kajetan Schweighofer¹, Markus Hofmarcher¹, Marius-Constantin Dinu¹,³, Philipp Renz¹, Angela Bitto-Nemling¹, Vihang Patil¹, Sepp Hochreiter¹,²

¹ ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
² Institute of Advanced Research in Artificial Intelligence (IARAI)
³ Dynatrace Research

The paper is available on arXiv.

Implementation

This repository contains implementations of BC, BVE, MCE, DQN, QR-DQN, REM, BCQ, CQL and CRR, which we used to evaluate Offline RL datasets. In principle, the algorithms can be used in the usual Online RL setting as well as in the Offline RL setting. The repository also contains utilities for evaluating offline datasets and plotting results.

Experiments are managed through experimental files (ex_01.py, ex_02.py, ...). Although this is not strictly necessary, we created one experimental file for each of the six environments used to obtain our results, which makes it easier to distribute experiments across multiple devices.

Dependencies

To reproduce all results, we provide an environment.yml file to set up a conda environment with the required packages. Run the following commands to create and activate the environment and to install this repository in editable mode:

conda env create --file environment.yml
conda activate offline_rl
pip install -e .

Usage

To create the datasets for Offline RL, run each experimental file with the --online flag:

python ex_XX.py --online

Once this run has finished, the datasets for Offline RL have been created and can be used to apply algorithms in the Offline RL setting. Offline experiments are started with

python ex_XX.py

Runtimes will be long, especially on MinAtar environments, so distributing the experiments across multiple machines is crucial in this step. Two further command line arguments are available for this purpose, --run and --dataset. The first selects a specific dataset creation run; how many exist depends on how many runs were done to create datasets for Offline RL (five in the paper). The second selects, by its number, one of the five datasets created for the results in the paper (random, mixed, replay, noisy, expert). Both arguments are zero-indexed.

As an example, offline experiments using the fourth dataset creation run on the expert dataset are started with

python ex_XX.py --run 3 --dataset 4

or using the first dataset creation run on the replay dataset

python ex_XX.py --run 0 --dataset 2
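
The two arguments can also be used to generate the full list of offline jobs for one experimental file and split it across machines. The following is only a minimal sketch and not part of the repository: the helper names offline_commands and shard are hypothetical, the dataset order is assumed to follow the list above (which agrees with the examples, replay being 2 and expert being 4), and the round-robin split is just one simple way to divide the work.

```python
from itertools import product

# Dataset order as listed in the text; indices are zero-based,
# which matches the examples above (replay -> 2, expert -> 4).
DATASETS = ["random", "mixed", "replay", "noisy", "expert"]
N_RUNS = 5  # number of dataset creation runs used in the paper


def offline_commands(ex_file, n_runs=N_RUNS, datasets=DATASETS):
    """Return one command line per (run, dataset) combination."""
    return [
        f"python {ex_file} --run {run} --dataset {datasets.index(name)}"
        for run, name in product(range(n_runs), datasets)
    ]


def shard(commands, n_machines, machine_id):
    """Round-robin split: machine k gets commands k, k + n, k + 2n, ..."""
    return commands[machine_id::n_machines]


if __name__ == "__main__":
    # Print the jobs that the first of four machines would run.
    cmds = offline_commands("ex_01.py")
    for cmd in shard(cmds, n_machines=4, machine_id=0):
        print(cmd)
```

Each machine would call shard with its own machine_id and then run the printed commands, for example from a shell script or a job scheduler.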

Results

After all experiments have finished, combine the logged files and create the plots by executing

python source/plotting/join_csv_files.py
python source/plotting/create_plots.py

Furthermore, plots for the training curves can be created by executing

python source/plotting/learning_curves.py

Alternative visualisations of the main results using parallel coordinates can be created by executing

python source/plotting/parallel_coordinates.py

LICENSE

MIT LICENSE

Owner

Institute for Machine Learning, Johannes Kepler University Linz