ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives

Last update: Dec 28, 2022

Status: Under development (expect bug fixes and huge updates)

ShinRL is an open-source JAX library specialized for the evaluation of reinforcement learning (RL) algorithms from both theoretical and practical perspectives. Please take a look at the paper for details.

QuickStart

Try ShinRL at: experiments/QuickStart.ipynb.

import gym
from shinrl import DiscreteViSolver
import matplotlib.pyplot as plt

# make an env & a config
env = gym.make("ShinPendulum-v0")
config = DiscreteViSolver.DefaultConfig(explore="eps_greedy", approx="nn", steps_per_epoch=10000)

# make mixins
mixins = DiscreteViSolver.make_mixins(env, config)
# mixins == [DeepRlStepMixIn, QTargetMixIn, TbInitMixIn, NetActMixIn, NetInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]

# (optional) arrange mixins
# mixins.insert(2, UserDefinedMixIn)

# make & run a solver
dqn_solver = DiscreteViSolver.factory(env, config, mixins)
dqn_solver.run()

# plot performance
returns = dqn_solver.scalars["Return"]
plt.plot(returns["x"], returns["y"])

# plot learned q-values  (act == 0)
q0 = dqn_solver.tb_dict["Q"][:, 0]
env.plot_S(q0, title="Learned")

# plot oracle q-values  (act == 0)
q0 = env.calc_q(dqn_solver.tb_dict["ExploitPolicy"])[:, 0]
env.plot_S(q0, title="Oracle")

# plot optimal q-values  (act == 0)
q0 = env.calc_optimal_q()[:, 0]
env.plot_S(q0, title="Optimal")

⚡ Key Modules

ShinRL consists of two main modules:

ShinEnv: Implements relatively small MDP environments with access to the oracle quantities.
Solver: Solves the environments (e.g., finds the optimal policy) with specified algorithms.

🔬 ShinEnv for Oracle Analysis

ShinEnv provides small environments with oracle methods that can compute exact quantities:
- calc_q computes the Q-value table over all state-action pairs for a given policy.
- calc_optimal_q computes the optimal Q-value table.
- calc_visit computes the state visitation frequency table for a given policy.
- calc_return is a shortcut for computing the exact undiscounted return of a given policy.
Some environments support continuous action spaces and image observations. See the following table and shinrl/envs/__init__.py for the available environments.

Environment	Discrete action	Continuous action	Image Observation	Tuple Observation
ShinMaze	✔️	❌	❌	✔️
ShinMountainCar-v0	✔️	✔️	✔️	✔️
ShinPendulum-v0	✔️	✔️	✔️	✔️
ShinCartPole-v0	✔️	✔️	❌	✔️
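
To illustrate what "oracle" Q-value tables mean, here is a minimal sketch in plain Python on a hypothetical two-state, two-action MDP. It is not ShinRL's implementation: the names `policy_q_table` and `optimal_q_table`, the MDP itself, and the infinite-horizon discounted setting with `GAMMA = 0.9` are all assumptions made for this example. It only shows the kind of exact quantity that calc_q and calc_optimal_q return when the full transition and reward tables are known.

```python
# Hypothetical toy MDP (not a ShinRL environment).
# P[s][a][s2] is the probability of moving from state s to s2 under action a.
# Action 0 stays in place, action 1 moves to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# R[s][a] is the immediate reward; only "stay in state 1" is rewarded.
R = [[0.0, 0.0], [1.0, 0.0]]
GAMMA = 0.9  # assumed discount factor
N_S, N_A = 2, 2


def bellman_backup(q, state_value):
    """Apply one Bellman backup to the whole Q-table."""
    return [
        [
            R[s][a]
            + GAMMA * sum(P[s][a][s2] * state_value(q, s2) for s2 in range(N_S))
            for a in range(N_A)
        ]
        for s in range(N_S)
    ]


def policy_q_table(policy, n_iters=1000):
    """Q-values of a fixed policy, where policy[s][a] is pi(a | s)."""
    q = [[0.0] * N_A for _ in range(N_S)]
    for _ in range(n_iters):
        q = bellman_backup(
            q, lambda q_, s: sum(policy[s][a] * q_[s][a] for a in range(N_A))
        )
    return q


def optimal_q_table(n_iters=1000):
    """Optimal Q-values via value iteration (greedy backup)."""
    q = [[0.0] * N_A for _ in range(N_S)]
    for _ in range(n_iters):
        q = bellman_backup(q, lambda q_, s: max(q_[s]))
    return q


uniform = [[0.5, 0.5], [0.5, 0.5]]
q_pi = policy_q_table(uniform)
q_star = optimal_q_table()
```

Comparing `q_pi` against `q_star` entry by entry is the same kind of check the QuickStart performs when it plots the "Oracle" and "Optimal" tables next to the learned one.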

🏭 Flexible Solver by MixIn

A "mixin" is a class that defines and implements a single feature. ShinRL's solvers are instantiated by combining several mixins.
By rearranging mixins, you can easily implement your own ideas on top of the ShinRL code base. See experiments/QuickStart.ipynb for an example.
The following code demonstrates how different sets of mixins yield "value iteration" and "deep Q learning":

import gym
from shinrl import DiscreteViSolver

env = gym.make("ShinPendulum-v0")

# run value iteration (dynamic programming)
config = DiscreteViSolver.DefaultConfig(approx="tabular", explore="oracle")
mixins = DiscreteViSolver.make_mixins(env, config)
# mixins == [TabularDpStepMixIn, QTargetMixIn, TbInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]
vi_solver = DiscreteViSolver.factory(env, config, mixins)
vi_solver.run()

# run deep Q learning 
config = DiscreteViSolver.DefaultConfig(approx="nn", explore="eps_greedy")
mixins = DiscreteViSolver.make_mixins(env, config)  
# mixins == [DeepRlStepMixIn, QTargetMixIn, TbInitMixIn, NetActMixIn, NetInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]
dql_solver = DiscreteViSolver.factory(env, config, mixins)
dql_solver.run()

# ShinRL also provides deep RL solvers that support OpenAI Gym environments.
env = gym.make("CartPole-v0")
mixins = DiscreteViSolver.make_mixins(env, config)  
# mixins == [DeepRlStepMixIn, QTargetMixIn, TargetMixIn, NetActMixIn, NetInitMixIn, GymExploreMixIn, GymEvalMixIn, DiscreteViSolver]
dql_solver = DiscreteViSolver.factory(env, config, mixins)
dql_solver.run()
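
The idea of building a solver from an ordered list of mixins can be sketched with plain Python multiple inheritance. This is an illustration only, not ShinRL's factory code: every class name below and the `make_solver` helper are hypothetical, and the real mixins do far more than return strings.

```python
class BaseSolver:
    """Hypothetical base class that relies on features supplied by mixins."""

    def describe(self):
        return f"{self.step()} with {self.explore()} exploration"


class TabularStepMixIn:
    def step(self):
        return "dynamic-programming step"


class DeepStepMixIn:
    def step(self):
        return "gradient step"


class OracleExploreMixIn:
    def explore(self):
        return "oracle"


class EpsGreedyExploreMixIn:
    def explore(self):
        return "eps-greedy"


class LoggingStepMixIn:
    """A user-defined mixin that wraps whichever step() comes next in the list."""

    def step(self):
        return "logged " + super().step()


def make_solver(mixins):
    # type() creates a class whose bases are the mixins in the given order,
    # so mixins earlier in the list take priority in method resolution.
    solver_cls = type("ComposedSolver", tuple(mixins), {})
    return solver_cls()


vi = make_solver([TabularStepMixIn, OracleExploreMixIn, BaseSolver])
dqn = make_solver([DeepStepMixIn, EpsGreedyExploreMixIn, BaseSolver])

# Inserting a user-defined mixin at the front changes behavior
# without editing any of the existing classes.
custom = make_solver(
    [LoggingStepMixIn, DeepStepMixIn, EpsGreedyExploreMixIn, BaseSolver]
)
print(vi.describe())
print(dqn.describe())
print(custom.describe())
```

Because method lookup follows the order of the list, swapping or inserting one mixin replaces a single feature while the rest of the solver stays the same.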

Installation

git clone git@github.com:omron-sinicx/ShinRL.git
cd ShinRL
pip install -e .

Test

cd ShinRL
make test

Format

cd ShinRL
make format

Docker

cd ShinRL
docker-compose up

Citation

# NeurIPS DRL WS 2021 version
@inproceedings{toshinori2021shinrl,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    booktitle = {Proceedings of the NeurIPS Deep RL Workshop},
}

# arXiv version
@article{toshinori2021shinrlArxiv,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    url = {https://arxiv.org/abs/2112.04123},
    journal={arXiv preprint arXiv:2112.04123},
}
