Implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

Last update: Dec 23, 2021

ALPHAMEPOL

This repository contains the implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

Installation

This codebase requires Python >= 3.6 and a working MuJoCo setup with a valid MuJoCo license. To set up MuJoCo, have a look here. To avoid conflicts with your existing Python setup and to keep the project self-contained, we suggest working in a virtual environment with virtualenv. To install virtualenv:

pip install --upgrade virtualenv

Create a virtual environment, activate it and install the requirements:

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
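
As a quick sanity check after installing the requirements, the following minimal sketch verifies the interpreter version stated above and reports whether a MuJoCo binding can be located. The module name mujoco_py is an assumption, since this README does not name the binding; replace it with whatever requirements.txt actually installs.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the installation notes


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def module_is_installed(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    # ASSUMPTION: "mujoco_py" is a guess at the binding's module name.
    print("MuJoCo binding found:", module_is_installed("mujoco_py"))
```

Run it inside the activated virtual environment so that it inspects the same interpreter and packages the experiments will use.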

Usage

Unsupervised Pre-Training

To reproduce the Unsupervised Pre-Training experiments in the paper, run the script for the environment of interest:

./scripts/exploration/[gridworld_with_slope.sh | multigrid.sh | ant.sh | minigrid.sh]

Supervised Fine-Tuning

To reproduce the Supervised Fine-Tuning experiments, run the script for the environment of interest:

./scripts/goal_rl/[gridworld_with_slope.sh | multigrid.sh | ant.sh | minigrid.sh]

By default, this launches TRPO with ALPHAMEPOL initialization. To launch TRPO with a random initialization instead, remove the policy_init argument from the script.

Note that the scripts for the GridWorld with Slope and MultiGrid experiments set num_goals = 50, so training covers 50 goals, one goal at a time. To speed this up, you can launch several processes (ideally one per goal), each with num_goals = 1 and a different seed, incrementing the seed from one process to the next. For the Ant and MiniGrid experiments, the goals are predefined, so you can also set the goal_index argument to select a specific goal (from 0 to 7 for Ant and from 0 to 12 for MiniGrid).
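
The per-goal parallelization described above can be planned with a small helper. This is only a sketch: it builds the argument sets (one process per goal, seeds incremented) and renders them as command-line tokens. The argument names num_goals, seed and goal_index come from this README, but the "--name value" flag spelling is an assumption, as is how the tokens are passed to the training entry point inside each script.

```python
def per_goal_runs(total_goals, base_seed=0):
    """One argument set per process: a single goal each, seeds incremented."""
    return [{"num_goals": 1, "seed": base_seed + i} for i in range(total_goals)]


def indexed_goal_runs(last_index):
    """One argument set per predefined goal, indices 0..last_index inclusive."""
    return [{"goal_index": i} for i in range(last_index + 1)]


def to_cli_args(run):
    """Render an argument set as tokens. ASSUMPTION: flags look like --name value."""
    tokens = []
    for name, value in run.items():
        tokens += ["--" + name, str(value)]
    return tokens


if __name__ == "__main__":
    # GridWorld with Slope / MultiGrid: 50 processes, one goal each.
    for run in per_goal_runs(total_goals=50):
        print(" ".join(to_cli_args(run)))
    # Ant: predefined goals 0..7 (use last_index=12 for MiniGrid).
    for run in indexed_goal_runs(last_index=7):
        print(" ".join(to_cli_args(run)))
```

Each printed line corresponds to one process; the tokens would replace the num_goals, seed or goal_index values in a copy of the relevant script.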

Results Visualization

Once launched, each experiment logs statistics to the results folder. To visualize them, launch TensorBoard on that directory:

python -m tensorboard.main --logdir=./results --port 8080

and visiting the board at http://localhost:8080.
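
If the board comes up empty, it can help to confirm that the experiments have actually written logs. The sketch below lists TensorBoard event files under the results folder, relying on TensorBoard's standard events.out.tfevents.* file naming. The layout of subfolders inside results is not described here, so the search is simply recursive.

```python
from pathlib import Path


def find_event_files(results_dir="results"):
    """Return TensorBoard event files under results_dir, sorted by path."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("events.out.tfevents.*"))


if __name__ == "__main__":
    for path in find_event_files():
        print(path)
```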
