Lux AI environment interface for RLlib multi-agents

Last update: Nov 07, 2022

Related tags

Deep Learning LuxAI-RLlib

Overview

Lux AI interface to RLlib `MultiAgentsEnv`

For Lux AI Season 1 Kaggle competition.

Please let me know if you use this, I'd like to see what people build with it!

TL;DR

The only thing you need to customise is the interface class (inheriting from multilux.lux_interface.LuxDefaultInterface). The interface needs to:

Implement four "toward-agent" methods:
- observation(joint_observation, actors)
- reward(joint_reward, actors)
- done(joint_done, actors)
- info(joint_info, actors)
Implement one "toward-environment" method:
- actions(action_dict)
Manage its own actor id creation, assignment, etc. (hint citytiles don't have ids in the game engine)

Implementation diagram

Example for training

import numpy as np

# (1) Define your custom interface for (obs, reward, done, info, actions) ---
from multilux.lux_interface import LuxDefaultInterface

class MyInterface(LuxDefaultInterface):
    def observation(self, joint_obs, actors) -> dict:
        return {a: np.array([0, 0]) for a in actors}

    def reward(self, joint_reward, actors) -> dict:
        return {a: 0 for a in actors}

    def done(self, joint_done, actors) -> dict:
        return {a: True for a in actors}

    def info(self, joint_info, actors) -> dict:
        return {a: {} for a in actors}

    def actions(self, action_dict) -> list:
        return []
    
# (2) Register environment --------------------------------------------------
from ray.tune.registry import register_env
from multilux.lux_env import LuxEnv


def env_creator(env_config):
    
    configuration = env_config.get(configuration, {})
    debug = env_config.get(debug, False)
    interface = env_config.get(interface, MyInterface)
    agents = env_config.get(agents, (None, "simple_agent"))
    
    return LuxEnv(configuration, debug,
                     interface=interface,
                     agents=agents,
                     train=True)

register_env("multilux", env_creator)

# (3) Define observation and action spaces for each actor type --------------
from gym import spaces

u_obs_space = spaces.Box(low=0, high=1, shape=(2,), dtype=np.float16)
u_act_space = spaces.Discrete(2)
ct_obs_space = spaces.Box(low=0, high=1, shape=(2,), dtype=np.float16)
ct_act_space = spaces.Discrete(2)

# (4) Instantiate agent ------------------------------------------------------
import random
from ray.rllib.agents import ppo

config = {
    "env_config": {},
    "multiagent": {
        "policies": {
            # the first tuple value is None -> uses default policy
            "unit-1": (None, u_obs_space, u_act_space, {"gamma": 0.85}),
            "unit-2": (None, u_obs_space, u_act_space, {"gamma": 0.99}),
            "citytile": (None, ct_obs_space, ct_act_space, {}),
        },
        "policy_mapping_fn":
            lambda agent_id:
                "citytile"  # Citytiles always have the same policy
                if agent_id.startswith("u_")
                else random.choice(["unit-1", "unit-2"])  # Randomly choose from unit policies
    },
}

trainer = ppo.PPOTrainer(env=LuxEnv, config=config)

# (5) Train away -------------------------------------------------------------
while True:
    print(trainer.train())

See examples/training.py

See also the LuxPythonEnvGym OpenAI-gym port by @glmcdona.

Jaime Ruiz Serra

Lux AI environment interface for RLlib multi-agents

Related tags

Overview

Lux AI interface to RLlib `MultiAgentsEnv`

TL;DR

Implementation diagram

Example for training

Owner

Jaime

A list of awesome PyTorch scholarship articles, guides, blogs, courses and other resources.

The repository is for safe reinforcement learning baselines.

Product-based-recommendation-system - A product based recommendation system which uses Machine learning algorithm such as KNN and cosine similarity

The implementation code for "DAGAN: Deep De-Aliasing Generative Adversarial Networks for Fast Compressed Sensing MRI Reconstruction"

Denoising Normalizing Flow

This is the code for the paper "Motion-Focused Contrastive Learning of Video Representations" (ICCV'21).

Detect roadway lanes using Python OpenCV for project during the 5th semester at DHBW Stuttgart for lecture in digital image processing.

PyTorch implementation for Graph Contrastive Learning with Augmentations

Lane follower: Lane-detector (OpenCV) + Object-detector (YOLO5) + CAN-bus

git《Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction》(ECCV 2020) GitHub:

Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"

Convex optimization for fun and profit.

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

On Evaluation Metrics for Graph Generative Models

Lava-DL, but with PyTorch-Lightning flavour

Commonsense Ability Tests

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

AAI supports interdisciplinary research to help better understand human, animal, and artificial cognition.

Image-to-Image Translation in PyTorch

Supercharging Imbalanced Data Learning WithCausal Representation Transfer

Lux AI environment interface for RLlib multi-agents

Related tags

Overview

Lux AI interface to RLlib MultiAgentsEnv

TL;DR

Implementation diagram

Example for training

Owner

Jaime

A list of awesome PyTorch scholarship articles, guides, blogs, courses and other resources.

The repository is for safe reinforcement learning baselines.

Product-based-recommendation-system - A product based recommendation system which uses Machine learning algorithm such as KNN and cosine similarity

The implementation code for "DAGAN: Deep De-Aliasing Generative Adversarial Networks for Fast Compressed Sensing MRI Reconstruction"

Denoising Normalizing Flow

This is the code for the paper "Motion-Focused Contrastive Learning of Video Representations" (ICCV'21).

Detect roadway lanes using Python OpenCV for project during the 5th semester at DHBW Stuttgart for lecture in digital image processing.

PyTorch implementation for Graph Contrastive Learning with Augmentations

Lane follower: Lane-detector (OpenCV) + Object-detector (YOLO5) + CAN-bus

git《Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction》(ECCV 2020) GitHub:

Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"

Convex optimization for fun and profit.

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

On Evaluation Metrics for Graph Generative Models

Lava-DL, but with PyTorch-Lightning flavour

Commonsense Ability Tests

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

AAI supports interdisciplinary research to help better understand human, animal, and artificial cognition.

Image-to-Image Translation in PyTorch

Supercharging Imbalanced Data Learning WithCausal Representation Transfer

Lux AI interface to RLlib `MultiAgentsEnv`