Pretraining Representations For Data-Efficient Reinforcement Learning

Last update: Dec 11, 2022

Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman & Aaron Courville

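This repo provides code for SGI, the representation pretraining method introduced in the paper above. SGI combines three pretraining objectives, which the commands below weight through the spr_weight, goal_weight and inverse_model_weight options.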

📦 Install -- Install relevant dependencies and the project
🔧 Usage -- Commands to run different experiments from the paper

Install

To install the requirements, follow these steps:

# Use a UTF-8 locale
export LANG=C.UTF-8
# Install PyTorch (CUDA 11.1 build)
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
# Install the remaining requirements
pip install -r requirements.txt

# Finally, install the project
pip install --user -e .

Usage

The default branch for the latest and stable changes is release.

To run SGI:

Download the DQN replay dataset (https://research.google/tools/datasets/dqn-replay/), or substitute your own pre-training data! The codebase expects a series of .gz files, one each for observations, actions and terminals.
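
Before launching a long pretraining run, it can help to confirm that the data directory really holds one .gz file of each kind. The following is a minimal sketch, not part of the codebase: the helper missing_kinds is hypothetical, and the file-name pattern (the kind somewhere in the name, ending in ckpt.N.gz) is an assumption you should adapt to your own data.

```python
import tempfile
from pathlib import Path

# The three kinds of file the README says the codebase expects.
REQUIRED_KINDS = ("observation", "action", "terminal")


def missing_kinds(data_dir, checkpoint):
    """Return the required kinds with no matching .gz file for one checkpoint.

    Assumption: file names contain the kind and end in "ckpt.<checkpoint>.gz".
    """
    names = [p.name for p in Path(data_dir).glob(f"*ckpt.{checkpoint}.gz")]
    return [kind for kind in REQUIRED_KINDS
            if not any(kind in name for name in names)]


# Demonstration on a temporary directory holding two of the three files.
with tempfile.TemporaryDirectory() as tmp:
    for kind in ("observation", "action"):
        (Path(tmp) / f"{kind}_ckpt.1.gz").touch()
    demo_missing = missing_kinds(tmp, checkpoint=1)

print(demo_missing)
```

An empty list means every required kind was found for that checkpoint.
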
To pretrain with SGI:

python -m scripts.run public=True model_folder=./ offline.runner.save_every=2500 \
    env.game=pong seed=1 offline_model_save={your model name} \
    offline.runner.epochs=10 offline.runner.dataloader.games=[Pong] \
    offline.runner.no_eval=1 \
    +offline.algo.goal_weight=1 \
    +offline.algo.inverse_model_weight=1 \
    +offline.algo.spr_weight=1 \
    +offline.algo.target_update_tau=0.01 \
    +offline.agent.model_kwargs.momentum_tau=0.01 \
    do_online=False \
    algo.batch_size=256 \
    +offline.agent.model_kwargs.noisy_nets_std=0 \
    offline.runner.dataloader.dataset_on_disk=True \
    offline.runner.dataloader.samples=1000000 \
    offline.runner.dataloader.checkpoints='{your checkpoints}' \
    offline.runner.dataloader.num_workers=2 \
    offline.runner.dataloader.data_path={your data dir} \
    offline.runner.dataloader.tmp_data_path=./
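
The command above mixes two kinds of Hydra override: plain key=value pairs, which set options that already exist in the configuration, and +key=value pairs, which add keys that do not. A minimal sketch of assembling such a command from Python, for example to sweep over games, might look like the following. The helper build_command is hypothetical and only a handful of the options above are shown.

```python
import shlex


def build_command(overrides, additions):
    """Build a scripts.run command line from two dicts of Hydra options.

    overrides: keys passed as key=value.
    additions: keys passed as +key=value (Hydra's syntax for adding a key).
    """
    args = ["python", "-m", "scripts.run"]
    args += [f"{key}={value}" for key, value in overrides.items()]
    args += [f"+{key}={value}" for key, value in additions.items()]
    return args


command = build_command(
    overrides={"env.game": "pong", "seed": 1, "do_online": False},
    additions={"offline.algo.spr_weight": 1, "offline.algo.goal_weight": 1},
)
# shlex.join quotes any argument that needs it before printing.
print(shlex.join(command))
```

The resulting list can be passed directly to subprocess.run if you prefer launching jobs from a script rather than a shell.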

To fine-tune with SGI:

python -m scripts.run public=True env.game=pong seed=1 num_logs=10  \
    model_load={your_model_name} model_folder=./ \
    algo.encoder_lr=0.000001 algo.q_l1_lr=0.00003 algo.clip_grad_norm=-1 algo.clip_model_grad_norm=-1

When reporting scores, we average across 10 fine-tuning seeds.
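
The ten-seed protocol can be sketched as follows. This is an illustration only: finetune_commands and average_score are hypothetical helpers, the learning-rate and gradient-clipping arguments from the command above are omitted for brevity, and the scores are placeholders rather than results.

```python
from statistics import mean


def finetune_commands(game, model_name, seeds=range(1, 11)):
    """Return one fine-tuning command string per seed (10 seeds by default)."""
    return [
        f"python -m scripts.run public=True env.game={game} seed={seed} "
        f"model_load={model_name} model_folder=./"
        for seed in seeds
    ]


def average_score(scores_by_seed):
    """Average final scores across seeds; expects a dict of seed -> score."""
    return mean(scores_by_seed.values())


commands = finetune_commands("pong", "my_model")
# Placeholder scores for illustration only, not results from the paper.
placeholder_scores = {seed: float(seed) for seed in range(1, 11)}
print(len(commands), average_score(placeholder_scores))
```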

./scripts/experiments contains a number of example configurations, including for SGI-M, SGI-M/L and SGI-W, for both pre-training and fine-tuning. Each of these scripts can be launched by providing a game and seed, e.g., ./scripts/experiments/sgim_pretrain.sh pong 1. These scripts are provided primarily to illustrate the hyperparameters used for different experiments; you will likely need to modify the arguments in these scripts to point to your data and model directories.

Data for SGI-R and SGI-E is not included due to its size, but can be re-generated locally. Contact us for details.

What does each file do?

.
├── scripts
│   ├── run.py                # The main runner script to launch jobs.
│   ├── config.yaml           # The hydra configuration file, listing hyperparameters and options.
│   └── experiments           # Configurations for the various SGI experiments.
│
├── src                     
│   ├── agent.py              # Implements the Agent API for action selection 
│   ├── algos.py              # Distributional RL loss and optimization
│   ├── models.py             # Forward passes, network initialization.
│   ├── networks.py           # Network architecture and forward passes.
│   ├── offline_dataset.py    # Dataloader for offline data.
│   ├── gcrl.py               # Utils for SGI's goal-conditioned RL objective.
│   ├── rlpyt_atari_env.py    # Slightly modified Atari env from rlpyt
│   ├── rlpyt_utils.py        # Utility methods that we use to extend rlpyt's functionality
│   └── utils.py              # Command line arguments and helper functions 
│
└── requirements.txt          # Dependencies

Owner

Mila

