MazeRL is an application oriented Deep Reinforcement Learning (RL) framework

Last update: Dec 24, 2022

Overview

Applied Reinforcement Learning with Python

MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework that addresses real-world decision problems. Our vision is to cover the complete development life cycle of RL applications, ranging from simulation engineering to agent development, training and deployment.

This is a preliminary, non-stable release of Maze. It is not yet complete and not all of our interfaces have settled yet. Hence, there might be some breaking changes on our way towards the first stable release.

Spotlight Features

Below we list a few selected Maze features.

- Design and visualize your policy and value networks with the Perception Module. It is based on PyTorch and provides a large variety of neural network building blocks and model styles. Quickly compose powerful representation learners from building blocks such as dense, convolution, graph convolution, attention and self-attention layers, recurrent architectures, and action and observation masking.
- Create the conditions for efficient RL training without writing boilerplate code, for example through built-in support for best practices such as pre-processing and normalizing your observations.
- Maze supports advanced environment structures reflecting the requirements of real-world industrial decision problems such as multi-step and multi-agent scenarios. You can of course work with existing Gym-compatible environments.
- Use the provided Maze trainers (A2C, PPO, Impala, SAC, Evolution Strategies), which support dictionary action and observation spaces as well as multi-step training (auto-regressive policies). Or stick to your favorite tools and trainers by combining Maze with other RL frameworks.
- Get out-of-the-box support for advanced training workflows such as imitation learning from teacher policies and policy fine-tuning.
- Keep even complex application and experiment configuration manageable with the Hydra Config System.
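
To make the observation normalization mentioned above concrete, here is a minimal sketch of per-key mean/std normalization for dictionary observations. It is plain Python and does not use the Maze API: the function names `estimate_statistics` and `normalize` and the sample keys `inventory` and `demand` are hypothetical and only illustrate the idea.

```python
"""Per-key mean/std normalization of dictionary observations (illustrative only)."""
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Observation = Dict[str, List[float]]
Stats = Dict[str, Tuple[float, float]]


def estimate_statistics(observations: List[Observation]) -> Stats:
    """Estimate one (mean, std) pair per observation key from sampled observations."""
    stats: Stats = {}
    for key in observations[0]:
        values = [value for obs in observations for value in obs[key]]
        std = pstdev(values)
        # Guard against constant features to avoid dividing by zero.
        stats[key] = (mean(values), std if std > 0.0 else 1.0)
    return stats


def normalize(observation: Observation, stats: Stats) -> Observation:
    """Shift and scale every entry of every key with the estimated statistics."""
    return {
        key: [(value - stats[key][0]) / stats[key][1] for value in values]
        for key, values in observation.items()
    }


if __name__ == "__main__":
    # Hypothetical sampled observations with two keys.
    samples = [
        {"inventory": [0.0, 10.0], "demand": [5.0]},
        {"inventory": [20.0, 30.0], "demand": [5.0]},
    ]
    print(normalize(samples[0], estimate_statistics(samples)))
```

In a real project the statistics would typically be estimated once from sampled rollouts and then reused unchanged during training and deployment, so that the agent always sees observations on the same scale.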

Get Started

Make sure PyTorch is installed and then get the latest released version of Maze as follows:
```
pip install -U maze-rl

# optionally install RLlib if you want to use it in combination with Maze
pip install "ray[rllib]" tensorflow
```
Read more about other options like the installation of the latest development version.

⚡ We encourage you to start with Python 3.7, as many popular environments like Atari or Box2D cannot easily be installed in newer Python environments. Maze itself supports newer Python versions, but for Python 3.9 you might have to install additional binary dependencies manually.

To see Maze in action, check out a first example.
For a more applied introduction, visit the step-by-step tutorial.
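
Before running a first example it can help to confirm what is actually installed. The following is a minimal sketch using only the Python standard library; it requires Python 3.8 or newer because it relies on `importlib.metadata`. The helper names `installed_version` and `check_environment` are hypothetical, and the distribution names are taken from the `pip install` commands above.

```python
"""Pre-flight check for a Maze installation (illustrative only)."""
import sys
from importlib import metadata
from typing import Dict, Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def check_environment() -> Dict[str, Optional[str]]:
    """Collect the interpreter version and the versions of the relevant packages."""
    report: Dict[str, Optional[str]] = {
        "python": "{}.{}.{}".format(*sys.version_info[:3])
    }
    # The last two entries are only needed for the optional RLlib setup.
    for distribution in ("torch", "maze-rl", "ray", "tensorflow"):
        report[distribution] = installed_version(distribution)
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"{name:12s} {version or 'not installed'}")
```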

Documentation

Learn more about Maze

The documentation is the starting point to learn more about the underlying concepts, but most importantly also provides code snippets and minimum working examples to get you started quickly.

- The Workflow section guides you through typical tasks in an RL project.
- Policy and Value Networks introduces you to the Perception Module, shows how to customize action spaces and the underlying action probability distributions, and covers two styles of policy and value network construction:
  - Template models are composed directly from an environment's observation and action space, allowing you to train with suitable agent networks on a new environment within minutes.
  - Custom models give you the full flexibility of application-specific models, built either with the provided Maze building blocks or directly with PyTorch.
- Learn more about core concepts and structures such as the Maze environment hierarchy and the Maze event system, which provides a convenient way to collect statistics and KPIs, enables flexible reward formulation and supports offline analysis.
- Structured Environments and Action Masking introduces you to a general concept that can greatly improve the performance of the trained agents in practical RL problems.
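
As a rough illustration of the action masking idea, the sketch below turns the raw scores (logits) of a discrete policy head into probabilities while assigning zero probability to actions the environment marks as invalid. It is a generic, standard-library-only example and not the Maze implementation; the function name `masked_softmax` is hypothetical.

```python
"""Action masking for a discrete policy head (illustrative only)."""
import math
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Convert logits to probabilities; masked-out actions get probability zero."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    shift = max(logit for logit, allowed in zip(logits, mask) if allowed)
    weights = [
        math.exp(logit - shift) if allowed else 0.0
        for logit, allowed in zip(logits, mask)
    ]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # The second action is invalid in the current state and can never be sampled.
    print(masked_softmax([1.0, 2.0, 3.0], [True, False, True]))
```

Because invalid actions can never be sampled, the agent does not have to spend training time learning to avoid them, which is one reason masking tends to help in practical problems.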

License

Maze is freely available for research and non-commercial use. A commercial license is available; if you are interested, please contact us on our company website or write us an email.

We believe in Open Source principles and aim at transitioning Maze to a commercial Open Source project, releasing larger parts of the framework under a permissive license in the near future.

Comments

Configuration problems in the step-by-step tutorial
I've just been trying out maze and tried out the step-by-step tutorial.

In Step 5 (5. Training the MazeEnv) the instructions are incomplete or wrong.

I was able to get it running in the end, but it took (us) quite some time. I'm not sure if this is a bug in maze or hydra, or if just some newer version of either library changes the behavior a little bit. But you should update the documentation such that it works out of the box for new users of the library.

The setup (under Ubuntu 20.04):

    >> mkdir maze5 && cd maze5
    >> pyenv local 3.8.8
    >> python -m venv .venv
    >> source .venv/bin/activate
    >> pip install maze-rl torch
    >> pip list
    Package                 Version
    ----------------------- -----------
    hydra-core              1.1.0
    hydra-nevergrad-sweeper 1.1.5
    maze-rl                 0.1.7
    torch                   1.9.0
    ...

Then I just copy-pasted the files from the https://github.com/enlite-ai/maze-examples/tree/main/tutorial_maze_env/part03_maze_env repo and adjusted the _target_ paths in the config yamls (e.g. from _target_: tutorial_maze_env.part03_maze_env.env.maze_env.maze_env_factory to _target_: env.maze_env.maze_env_factory).

Problem 1:

When you run the suggested training command, Hydra will just complain that it can't find the configuration files.

    >> maze-run -cn conf_train env=tutorial_cutting_2d_basic wrappers=tutorial_cutting_2d_basic \
         model=tutorial_cutting_2d_basic algorithm=ppo
    In 'conf_train': Could not find 'model/tutorial_cutting_2d_basic'

    Available options in 'model':
        flatten_concat
        flatten_concat_shared_embedding
        pixel_obs
        pixel_obs_rnn
        rllib
        vector_obs
        vector_obs_rnn
    Config search path:
        provider=hydra, path=pkg://hydra.conf
        provider=main, path=pkg://maze.conf
        provider=schema, path=structured://

Fix:

You can just define the config directory for hydra with maze-run -cd conf -cn conf_train .... Then Hydra will find the 3 config files and load them correctly.
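
To catch this kind of problem before launching a run, a small pre-flight check can verify that every config-group override has a matching file in the local config directory. The sketch below assumes the usual Hydra layout in which an override `group=option` is backed by a file `<config_dir>/<group>/<option>.yaml`; the helper name `missing_group_configs` is hypothetical. `algorithm=ppo` is left out on purpose because that config ships with Maze itself (`pkg://maze.conf`).

```python
"""Check that Hydra config-group overrides have matching files (illustrative only)."""
from pathlib import Path
from typing import Dict, List


def missing_group_configs(config_dir: Path, overrides: Dict[str, str]) -> List[Path]:
    """Return the expected <group>/<option>.yaml files that are missing in config_dir."""
    expected = [
        config_dir / group / f"{option}.yaml" for group, option in overrides.items()
    ]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # The three local overrides from the training command above.
    overrides = {
        "env": "tutorial_cutting_2d_basic",
        "wrappers": "tutorial_cutting_2d_basic",
        "model": "tutorial_cutting_2d_basic",
    }
    for path in missing_group_configs(Path("conf"), overrides):
        print(f"missing: {path}")
```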

Problem 2:

After loading the config files, hydra tries to load the modules defined in the _target_ fields. And that fails immediately with:

    ...
      File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 104, in _resolve_target
        return _locate(target)
      File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/utils.py", line 563, in _locate
        raise ImportError(f"Error loading module '{path}'") from e
    ImportError: Error loading module 'env.maze_env.maze_env_factory'

Fix:

For some reason Hydra doesn't know the path to the directory from where we call maze-run. And therefore it doesn't find the env directory containing the maze_env file.

This is fixable by just setting the environment variable: export PYTHONPATH="$PYTHONPATH:$PWD/".
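
The reason the PYTHONPATH fix works can be reproduced with the standard library alone. The sketch below is a simplified stand-in for a dotted-path lookup, not Hydra's actual code: `locate` and `add_project_root` are hypothetical helper names. A target such as `env.maze_env.maze_env_factory` can only be imported if the directory containing the `env` package is on `sys.path`.

```python
"""Why a dotted target needs the project root on sys.path (illustrative only)."""
import importlib
import sys
from pathlib import Path
from typing import Any


def locate(target: str) -> Any:
    """Resolve 'package.module.attribute' by importing the module and reading the attribute."""
    module_name, _, attribute = target.rpartition(".")
    if not module_name:
        raise ImportError(f"'{target}' is not a dotted path")
    # Raises ModuleNotFoundError (an ImportError) if the package is not importable.
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attribute)
    except AttributeError as error:
        raise ImportError(f"Error loading '{target}'") from error


def add_project_root(root: Path) -> None:
    """In-process equivalent of appending the directory to PYTHONPATH."""
    resolved = str(root.resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()


if __name__ == "__main__":
    add_project_root(Path.cwd())
    try:
        print(locate("env.maze_env.maze_env_factory"))
    except ImportError as error:
        print(f"still not importable: {error}")
```

Exporting PYTHONPATH in the shell has the same effect for the whole `maze-run` process, which is why it resolves the error without touching any code.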
bug documentation
opened by jakobkogler 2
Hello from Hydra :)

Thanks for using Hydra! I see that you are using Hydra 1.1 already which is great. One thing that is really recent is the ability to configure the config searchpath from the primary config. You can learn about it here.

This can probably eliminate the need of your users to even know what a ConfigSearchpathPlugin is.

Feel free to jump into the Hydra chat if you have any questions.

opened by omry 2
Version 0.1.7
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API
opened by enliteai 0
Version 0.1.6
Changes

made Maze compatible with RLlib 1.4

updated to the recently released hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub
opened by md-enlite 0
Version 0.1.5
Features:

Adds documentation for run_context

Changes of simulated environment interfaces step_without_observation -> fast_step

Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

added value transformations
opened by md-enlite 0
Towards Version 0.1.5
Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images
opened by md-enlite 0
Release Version 0.1.4
improved docs

switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

added testing dependencies to main package
opened by enliteai 0
Dev
adds PointNetFeatureBlock to perception module

adds Tensorboard hyper parameter visualization for hydra multiruns

merges parallel and sequential dataset into a single InMemoryDataset
opened by md-enlite 0
Version 0.1.3
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation
opened by enliteai 0
Version 0.1.2
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.
opened by enliteai 0
Dev
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

Fixes:

cumulative stats logging
opened by md-enlite 0

Releases(v0.2.0)

v0.2.0(Nov 21, 2022)
New graph neural network building blocks (message passing based on torch-scatter in addition to existing graph convolutions)

Support for action recording, replay from pre-computed action records and feature collection.

Improved wrapper hierarchy semantics: Previously values were assigned to the outermost wrapper. Now values are assigned to existing attributes by traversing the wrapper hierarchy.
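
The changed assignment semantics can be pictured with a small, generic wrapper stack. The sketch below is not Maze's implementation: the classes `Env` and `Wrapper` and the attribute `difficulty` are hypothetical. Writes are routed to the first layer that already owns the attribute, and falling back to the outermost wrapper when no layer owns it is an assumption made here for illustration.

```python
"""Attribute assignment across a wrapper stack (illustrative only)."""
from typing import Any


class Env:
    """Innermost object of the stack."""

    def __init__(self) -> None:
        self.difficulty = 1


class Wrapper:
    """Forwards attribute reads and routes writes to the layer that owns the attribute."""

    def __init__(self, env: Any) -> None:
        # Bypass our own __setattr__ while the wrapper is being set up.
        object.__setattr__(self, "env", env)

    def __getattr__(self, name: str) -> Any:
        # Only called when normal lookup on this wrapper fails.
        return getattr(self.env, name)

    def __setattr__(self, name: str, value: Any) -> None:
        layer: Any = self
        # Walk inwards until a layer already defines the attribute.
        while True:
            if name in vars(layer):
                object.__setattr__(layer, name, value)
                return
            if not isinstance(layer, Wrapper):
                break
            layer = vars(layer)["env"]
        # Assumption: if no layer owns the attribute, keep it on the outermost wrapper.
        object.__setattr__(self, name, value)


if __name__ == "__main__":
    env = Env()
    stack = Wrapper(Wrapper(env))
    stack.difficulty = 5  # lands on the wrapped Env, not on the outer wrapper
    print(env.difficulty, "difficulty" in vars(stack))
```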

Removal of deprecated modules (APIContext and Maze models for RLlib)

Reflecting changes in upstream dependencies (Gym version pinned to <0.23)

v0.1.8(Dec 13, 2021)
New Features

Agent Deployment Workflow

Soft Actor Critic from Demonstrations (SACfD)

Locally Distributed ES Runner

SpacesRecordingWrapper: Records and dumps processed trajectories to pickle files

Fixes event logging for environment resets and policy events

v0.1.7(Jun 24, 2021)
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API

Compatibility with PyTorch 1.9

v0.1.6(Jun 14, 2021)
Changes

made Maze compatible with RLlib 1.4

updated to the recently released hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub

v0.1.5(May 20, 2021)
Features:

adds RunContext (Maze Python API)

adds seeding to environments, models and trainers

changes of simulated environment interfaces step_without_observation -> fast_step

Improvements:

adds an ExportGifWrapper

adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

adds value transformations

v0.1.4(Apr 29, 2021)
switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

interfaces support collaborative multi-agent actor critic

improved docs

added testing dependencies to main package

v0.1.3(Apr 1, 2021)
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation

v0.1.2(Mar 25, 2021)
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.

v0.1.1(Mar 18, 2021)
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

adds MazeEnvMonitoringWrapper as a default to wrapper stacks

Fixes:

cumulative stats logging

v0.1.0(Mar 11, 2021)
Documentation updates:

Integrating existing Gym environments

Factory documentation

Experiments workflow, ...

Updated to Hydra 1.1.0:

Using Hydra.instantiate instead of custom registry implementation

Added Rollout evaluator

Owner

EnliteAI GmbH

enliteAI is a machine learning company, developing the Reinforcement Learning framework Maze.

GitHub Repository https://maze-rl.readthedocs.io/
