The Unsupervised Reinforcement Learning Benchmark (URLB)

Overview

The Unsupervised Reinforcement Learning Benchmark (URLB)

URLB provides a set of leading algorithms for unsupervised reinforcement learning where agents first pre-train without access to extrinsic rewards and then are finetuned to downstream tasks.

Requirements

We assume you have access to a GPU that can run CUDA 10.2 and CUDNN 8. Then, the simplest way to install all required dependencies is to create an anaconda environment by running

conda env create -f conda_env.yml

After the instalation ends you can activate your environment with

conda activate urlb

Implemented Agents

Agent Command Implementation Author(s) Paper
ICM agent=icm Denis paper
ProtoRL agent=proto Denis paper
DIAYN agent=diayn Misha paper
APT(ICM) agent=icm_apt Hao, Kimin paper
APT(Ind) agent=ind_apt Hao, Kimin paper
APS agent=aps Hao, Kimin paper
SMM agent=smm Albert paper
RND agent=rnd Kevin paper
Disagreement agent=disagreement Catherine paper

Available Domains

We support the following domains.

Domain Tasks
walker stand, walk, run, flip
quadruped walk, run, stand, jump
jaco reach_top_left, reach_top_right, reach_bottom_left, reach_bottom_right

Domain observation mode

Each domain supports two observation modes: states and pixels.

Model Command
states obs_type=states
pixels obs_type=pixels

Instructions

Pre-training

To run pre-training use the pretrain.py script

python pretrain.py agent=icm domain=walker

or, if you want to train a skill-based agent, like DIAYN, run:

python pretrain.py agent=diayn domain=walker

This script will produce several agent snapshots after training for 100k, 500k, 1M, and 2M frames. The snapshots will be stored under the following directory:

./pretrained_models/<obs_type>/<domain>/<agent>/

For example:

./pretrained_models/states/walker/icm/

Fine-tuning

Once you have pre-trained your method, you can use the saved snapshots to initialize the DDPG agent and fine-tune it on a downstream task. For example, let's say you have pre-trained ICM, you can fine-tune it on walker_run by running the following command:

python finetune.py pretrained_agent=icm task=walker_run snapshot_ts=1000000 obs_type=states

This will load a snapshot stored in ./pretrained_models/states/walker/icm/snapshot_1000000.pt, initialize DDPG with it (both the actor and critic), and start training on walker_run using the extrinsic reward of the task.

For methods that use skills, include the agent, and the reward_free tag to false.

python finetune.py pretrained_agent=smm task=walker_run snapshot_ts=1000000 obs_type=states agent=smm reward_free=false

Monitoring

Logs are stored in the exp_local folder. To launch tensorboard run:

tensorboard --logdir exp_local

The console output is also available in a form:

| train | F: 6000 | S: 3000 | E: 6 | L: 1000 | R: 5.5177 | FPS: 96.7586 | T: 0:00:42

a training entry decodes as

F  : total number of environment frames
S  : total number of agent steps
E  : total number of episodes
R  : episode return
FPS: training throughput (frames per second)
T  : total training time
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described in the paper.

Data Efficient Language-Supervised Zero-Shot Recognition with Optimal Transport Distillation This repository contains PyTorch evaluation code, trainin

Meta Research 45 Dec 20, 2022
Repository of continual learning papers

Continual learning paper repository This repository contains an incomplete (but dynamically updated) list of papers exploring continual learning in ma

29 Jan 05, 2023
Material related to the Principles of Cloud Computing course.

CloudComputingCourse Material related to the Principles of Cloud Computing course. This repository comprises material that I use to teach my Principle

Aniruddha Gokhale 15 Dec 02, 2022
GAN example for Keras. Cuz MNIST is too small and there should be something more realistic.

Keras-GAN-Animeface-Character GAN example for Keras. Cuz MNIST is too small and there should an example on something more realistic. Some results Trai

160 Sep 20, 2022
Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

DeepPanoContext (DPC) [Project Page (with interactive results)][Paper] DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context G

Cheng Zhang 66 Nov 16, 2022
Deep Surface Reconstruction from Point Clouds with Visibility Information

Data, code and pretrained models for the paper Deep Surface Reconstruction from Point Clouds with Visibility Information.

Raphael Sulzer 23 Jan 04, 2023
AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

AgeGuesser AgeGuesser is an end-to-end, deep-learning based Age Estimation system, presented at the CAIP 2021 conference. You can find the related pap

5 Nov 10, 2022
Text-Based Ideal Points

Text-Based Ideal Points Source code for the paper: Text-Based Ideal Points by Keyon Vafa, Suresh Naidu, and David Blei (ACL 2020). Update (June 29, 20

Keyon Vafa 37 Oct 09, 2022
How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

AdamBNN This is the pytorch implementation of our paper "How Do Adam and Training Strategies Help BNNs Optimization?", published in ICML 2021. In this

Zechun Liu 47 Sep 20, 2022
Arch-Net: Model Distillation for Architecture Agnostic Model Deployment

Arch-Net: Model Distillation for Architecture Agnostic Model Deployment The official implementation of Arch-Net: Model Distillation for Architecture A

MEGVII Research 22 Jan 05, 2023
An implementation of the 1. Parallel, 2. Streaming, 3. Randomized SVD using MPI4Py

PYPARSVD This implementation allows for a singular value decomposition which is: Distributed using MPI4Py Streaming - data can be shown in batches to

Romit Maulik 44 Dec 31, 2022
Official Implementation of 'UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers' ICLR 2021(spotlight)

UPDeT Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight) The

hhhusiyi 96 Dec 22, 2022
Pytorch implementation of MixNMatch

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation [Paper] Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Le

910 Dec 30, 2022
Nerf pl - NeRF (Neural Radiance Fields) and NeRF in the Wild using pytorch-lightning

nerf_pl Update: an improved NSFF implementation to handle dynamic scene is open! Update: NeRF-W (NeRF in the Wild) implementation is added to nerfw br

AI葵 1.8k Dec 30, 2022
Unofficial Tensorflow Implementation of ConvNeXt from A ConvNet for the 2020s

Tensorflow Implementation of "A ConvNet for the 2020s" This is the unofficial Tensorflow Implementation of ConvNeXt from "A ConvNet for the 2020s" pap

DK 11 Oct 12, 2022
Easily pull telemetry data and create beautiful visualizations for analysis.

This repository is a work in progress. Anything and everything is subject to change. Porpo Table of Contents Porpo Table of Contents General Informati

Ryan Dawes 33 Nov 30, 2022
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 04, 2023
CMP 414/765 course repository for Spring 2022 semester

CMP414/765: Artificial Intelligence Spring2021 This is the GitHub repository for course CMP 414/765: Artificial Intelligence taught at The City Univer

ch00226855 4 May 16, 2022
[CVPR 2022 Oral] Balanced MSE for Imbalanced Visual Regression https://arxiv.org/abs/2203.16427

Balanced MSE Code for the paper: Balanced MSE for Imbalanced Visual Regression Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu CVPR 2022 (Oral) News

Jiawei Ren 267 Jan 01, 2023
Reinforcement Learning Theory Book (rus)

Reinforcement Learning Theory Book (rus)

qbrick 206 Nov 27, 2022