Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning

Overview

Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning

Kajetan Schweighofer1, Markus Hofmarcher1, Marius-Constantin Dinu1,3, Philipp Renz1, Angela Bitto-Nemling1, Vihang Patil1, Sepp Hochreiter1, 2

1 ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
2 Institute of Advanced Research in Artificial Intelligence (IARAI)
3 Dynatrace Research


The paper is available on arxiv


Implementation

This repository contains implementations of BC, BVE, MCE, DQN, QR-DQN, REM, BCQ, CQL and CRR, used for our evaluation of Offline RL datasets. Implementation-wise, algorithms can in theory be used in the usual Online RL setting as well as Offline RL settings. Furthermore, utilities for offline dataset evaluation and plotting of results are contained.

Experiments are managed through experimental files (ex_01.py, ex_02.py, ...). While this is not a necessity, we created an experimental file for each of the six environments used to obtain our results, to more easily distribute experiments across multiple devices.

Dependencies

To reproduce all results we provide an environment.yml file to setup a conda environment with the required packages. Run the following command to create and activate the environment:

conda env create --file environment.yml
conda activate offline_rl
pip install -e .

Usage

To create datasets for Offline RL, each experimental file needs to be run by

python ex_XX.py --online

After this run has finished, datasets for Offline RL are created, which are then used for applying algorithms in the Offline RL setting. Offline experiments are started with

python ex_XX.py

Runtimes will be long, especially on MinAtar environments, which is why distribution across multiple machines is crucial in this step. To distribute across multiple machines, two further command line arguments are eligible, --run and --dataset. Depending on how many runs have been done to create datasets for Offline RL (five in the paper), one can select a specific version of the dataset with the first parameter. For the results in the paper, five different datasets are created (random, mixed, replay, noisy, expert), which can be selected by its number using the second parameter.

As an example, offline experiments using the fourth dataset creation run on the expert dataset is started with

python ex_XX.py --run 3 --dataset 4

or using the first dataset creation run on the replay dataset

python ex_XX.py --run 0 --dataset 2

Results

After all experiments are concluded, one has to combine the logged files and create the plots by executing

python source/plotting/join_csv_files.py
python source/plotting/create_plots.py

Furthermore, plots for the training curves can be created by executing

python source/plotting/learning_curves.py

Alternative visualisations of the main results, using parallel coordinates are available by executing

python source/plotting/parallel_coordinates.py

LICENSE

MIT LICENSE

Owner
Institute for Machine Learning, Johannes Kepler University Linz
Software of the Institute for Machine Learning, JKU Linz
Institute for Machine Learning, Johannes Kepler University Linz
PROJECT - Az Residential Real Estate Analysis

AZ RESIDENTIAL REAL ESTATE ANALYSIS -Decided on libraries to import. Includes pa

2 Jul 05, 2022
Small little script to scrape, parse and check for active tor nodes. Can be used as proxies.

TorScrape TorScrape is a small but useful script made in python that scrapes a website for active tor nodes, parse the html and then save the nodes in

5 Dec 04, 2022
TuckER: Tensor Factorization for Knowledge Graph Completion

TuckER: Tensor Factorization for Knowledge Graph Completion This codebase contains PyTorch implementation of the paper: TuckER: Tensor Factorization f

Ivana Balazevic 296 Dec 06, 2022
Official implementation for “Unsupervised Low-Light Image Enhancement via Histogram Equalization Prior”

Unsupervised Low-Light Image Enhancement via Histogram Equalization Prior. The code will release soon. Implementation Python3 PyTorch=1.0 NVIDIA GPU+

FengZhang 34 Dec 04, 2022
Official Code Release for "CLIP-Adapter: Better Vision-Language Models with Feature Adapters"

Official Code Release for "CLIP-Adapter: Better Vision-Language Models with Feature Adapters" Pipeline of CLIP-Adapter CLIP-Adapter is a drop-in modul

peng gao 157 Dec 26, 2022
A Python library for differentiable optimal control on accelerators.

A Python library for differentiable optimal control on accelerators.

Google 80 Dec 21, 2022
Official Pytorch Implementation of Adversarial Instance Augmentation for Building Change Detection in Remote Sensing Images.

IAug_CDNet Official Implementation of Adversarial Instance Augmentation for Building Change Detection in Remote Sensing Images. Overview We propose a

53 Dec 02, 2022
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 46.9k Jan 03, 2023
3.8% and 18.3% on CIFAR-10 and CIFAR-100

Wide Residual Networks This code was used for experiments with Wide Residual Networks (BMVC 2016) http://arxiv.org/abs/1605.07146 by Sergey Zagoruyko

Sergey Zagoruyko 1.2k Dec 29, 2022
Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

TianYuan 27 Nov 07, 2022
A PyTorch implementation of SIN: Superpixel Interpolation Network

SIN: Superpixel Interpolation Network This is is a PyTorch implementation of the superpixel segmentation network introduced in our PRICAI-2021 paper:

6 Sep 28, 2022
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.

Nonuniform-to-Uniform Quantization This repository contains the training code of N2UQ introduced in our CVPR 2022 paper: "Nonuniform-to-Uniform Quanti

Zechun Liu 60 Dec 28, 2022
QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing

QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing Environment Tested on Ubuntu 14.04 64bit and 16.04 64bit Installation # disabl

gts3.org (<a href=[email protected])"> 581 Dec 30, 2022
[CVPR 2022] Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement

Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement Announcement 🔥 We have not tested the code yet. We will fini

Xiuwei Xu 7 Oct 30, 2022
Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions

Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions Accepted by AAAI 2022 [arxiv] Wenyu Liu, Gaofeng Ren, Runsheng Yu, Shi Guo, Jia

liuwenyu 245 Dec 16, 2022
Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.

DD3D: "Is Pseudo-Lidar needed for Monocular 3D Object detection?" Install // Datasets // Experiments // Models // License // Reference Full video Offi

Toyota Research Institute - Machine Learning 364 Dec 27, 2022
This is a Image aid classification software based on python TK library development

This is a Image aid classification software based on python TK library development.

EasonChan 1 Jan 17, 2022
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

DiffWave DiffWave is a fast, high-quality neural vocoder and waveform synthesizer. It starts with Gaussian noise and converts it into speech via itera

LMNT 498 Jan 03, 2023