Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of Offline Reinforcement Learning with Implicit Q-Learning by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for the CUDA version on your machine.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
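
The three commands above differ only in the environment name, the config file, and (for AntMaze) two evaluation flags. As a minimal sketch, assuming you want to launch them from a script, the helper below rebuilds exactly those command lines. The names `EXAMPLES` and `build_command` are hypothetical and not part of this repository; only the environment names, config paths, and flags shown above are used.

```python
import shlex

# Environment name -> (config file, extra flags), copied from the commands above.
# Only these three environments appear in this README; which config to use
# for any other environment is not stated here.
EXAMPLES = {
    "halfcheetah-medium-expert-v2": ("configs/mujoco_config.py", {}),
    "antmaze-large-play-v0": (
        "configs/antmaze_config.py",
        {"eval_episodes": 100, "eval_interval": 100000},
    ),
    "pen-human-v0": ("configs/kitchen_config.py", {}),
}


def build_command(env_name):
    """Return the training command for env_name as an argument list."""
    config, extra = EXAMPLES[env_name]
    args = [
        "python",
        "train_offline.py",
        f"--env_name={env_name}",
        f"--config={config}",
    ]
    # Append the extra flags in the order they are listed above.
    args += [f"--{key}={value}" for key, value in extra.items()]
    return args


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for env in EXAMPLES:
        print(shlex.join(build_command(env)))
```

To actually start a run from Python, the returned list can be passed to `subprocess.run(build_command(env), check=True)` from the repository root.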

Misc

The implementation is based on JAXRL.
