Implementation of MA-Trace - a general-purpose multi-agent RL algorithm for cooperative environments.

Related tags

Deep Learningseed_rl
Overview

Off-Policy Correction For Multi-Agent Reinforcement Learning

This repository is the official implementation of Off-Policy Correction For Multi-Agent Reinforcement Learning. It is based on SEED RL, commit 5f07ba2a072c7a562070b5a0b3574b86cd72980f.

Requirements

Execution of our code is done within Docker container, you must install Docker according to the instructions provided by the authors. The specific requirements for our project are prepared as dockerfile (docker/Dockerfile.starcraft) and installed inside a container during the first execution of running script. Before running training, firstly build its base image by running:

./docker_base/marlgrid/docker/build_base.sh

Note that to execute docker commands you may need to use sudo or install Docker in rootless mode.

Training

To train a MA-Trace model, run the following command:

./run_local.sh starcraft vtrace [nb of actors] [configuration]

The [nb of actors] specifies the number of workers used for training, should be a positive natural number.

The [configuration] specifies the hyperparameters of training.

The most important hyperparameters are:

  • learning_rate the learning rate
  • entropy_cost initial entropy cost
  • target_entropy final entropy cost
  • entropy_cost_adjustment_speed how fast should entropy cost be adjusted towards the final value
  • frames_stacked the number of stacked frames
  • batch_size the size of training batches
  • discounting the discount factor
  • full_state_critic whether to use full state as input to critic network, set False to use only agents' observations
  • is_centralized whether to perform centralized or decentralized training
  • task_name name of the SMAC task to train on, see the section below

There are other parameters to configure, listed in the files, though of minor importance.

The running script provides evaluation metrics during training. They are displayed using tmux, consider checking the navigation controls.

For example, to use default parameters and one actor, run:

./run_local.sh starcraft vtrace 1 ""

To train the algorithm specified in the paper:

  • MA-Trace (obs): ./run_local.sh starcraft vtrace 1 "--full_state_critic=False"
  • MA-Trace (full): ./run_local.sh starcraft vtrace 1 "--full_state_critic=True"
  • DecMa-Trace: ./run_local.sh starcraft vtrace 1 "--is_centralized=False"
  • MA-Trace (obs) with 3 stacked observations: ./run_local.sh starcraft vtrace 1 "--full_state_critic=False --frames_stacked=3"
  • MA-Trace (full) with 4 stacked observations: ./run_local.sh starcraft vtrace 1 "--full_state_critic=True --frames_stacked=4"

Note that to match the perforance presented in the paper it is required to use higher number of actors, e.g. 20.

StarCraft Multi-Agent Challange

We evaluate our models on the StarCraft Multi-Agent Challange benchmark (latest version, i.e. 4.10). The challange consists of 14 tasks: '2s_vs_1sc', '2s3z', '3s5z', '1c3s5z', '10m_vs_11m', '2c_vs_64zg', 'bane_vs_bane', '5m_vs_6m', '3s_vs_5z', '3s5z_vs_3s6z', '6h_vs_8z', '27m_vs_30m', 'MMM2' and 'corridor'.

To train on a chosen task, e.g. 'MMM2', add --task_name='MMM2' to configuration, e.g.

./run_local.sh starcraft vtrace 1 "--full_state_critic=False --task_name='MMM2'"

Results

Our model achieves the following performance on SMAC:

results.png

PyTorch implementation of the Pose Residual Network (PRN)

Pose Residual Network This repository contains a PyTorch implementation of the Pose Residual Network (PRN) presented in our ECCV 2018 paper: Muhammed

Salih Karagoz 289 Nov 28, 2022
Code for STFT Transformer used in BirdCLEF 2021 competition.

STFT_Transformer Code for STFT Transformer used in BirdCLEF 2021 competition. The STFT Transformer is a new way to use Transformers similar to Vision

Jean-François Puget 69 Sep 29, 2022
Can we visualize a large scientific data set with a surrogate model? We're building a GAN for the Earth's Mantle Convection data set to see if we can!

EarthGAN - Earth Mantle Surrogate Modeling Can a surrogate model of the Earth’s Mantle Convection data set be built such that it can be readily run in

Tim 0 Dec 09, 2021
Code and hyperparameters for the paper "Generative Adversarial Networks"

Generative Adversarial Networks This repository contains the code and hyperparameters for the paper: "Generative Adversarial Networks." Ian J. Goodfel

Ian Goodfellow 3.5k Jan 08, 2023
Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Population-Based Bandits (PB2) Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimiza

Jack Parker-Holder 22 Nov 16, 2022
Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch

Neural Distance Embeddings for Biological Sequences Official implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTo

Gabriele Corso 56 Dec 23, 2022
10x faster matrix and vector operations

Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations. If yo

2.3k Jan 09, 2023
natural image generation using ConvNets

The Eyescream Project Generating Natural Images using Neural Networks. For our research summary on this work, please read the Arxiv paper: http://arxi

Meta Archive 601 Nov 23, 2022
git《USD-Seg:Learning Universal Shape Dictionary for Realtime Instance Segmentation》(2020) GitHub: [fig2]

USD-Seg This project is an implement of paper USD-Seg:Learning Universal Shape Dictionary for Realtime Instance Segmentation, based on FCOS detector f

Ruolin Ye 80 Nov 28, 2022
Federated Learning Based on Dynamic Regularization

Federated Learning Based on Dynamic Regularization This is implementation of Federated Learning Based on Dynamic Regularization. Requirements Please i

39 Jan 07, 2023
Playable Video Generation

Playable Video Generation Playable Video Generation Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci Paper: ArX

Willi Menapace 136 Dec 31, 2022
Implementation of "Glancing Transformer for Non-Autoregressive Neural Machine Translation"

GLAT Implementation for the ACL2021 paper "Glancing Transformer for Non-Autoregressive Neural Machine Translation" Requirements Python = 3.7 Pytorch

117 Jan 09, 2023
DLWP: Deep Learning Weather Prediction

DLWP: Deep Learning Weather Prediction DLWP is a Python project containing data-

Kushal Shingote 3 Aug 14, 2022
Detectron2-FC a fast construction platform of neural network algorithm based on detectron2

What is Detectron2-FC Detectron2-FC a fast construction platform of neural network algorithm based on detectron2. We have been working hard in two dir

董晋宗 9 Jun 06, 2022
[Arxiv preprint] Causality-inspired Single-source Domain Generalization for Medical Image Segmentation (code&data-processing pipeline)

Causality-inspired Single-source Domain Generalization for Medical Image Segmentation Arxiv preprint Repository under construction. Might still be bug

Cheng 31 Dec 27, 2022
deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.

deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.

63 Oct 17, 2022
Code to reproduce experiments in the paper "Explainability Requires Interactivity".

Explainability Requires Interactivity This repository contains the code to train all custom models used in the paper Explainability Requires Interacti

Digital Health & Machine Learning 5 Apr 07, 2022
FluidNet re-written with ATen tensor lib

fluidnet_cxx: Accelerating Fluid Simulation with Convolutional Neural Networks. A PyTorch/ATen Implementation. This repository is based on the paper,

JoliBrain 50 Jun 07, 2022
2021 National Underwater Robotics Vision Optics

2021-National-Underwater-Robotics-Vision-Optics 2021年全国水下机器人算法大赛-光学赛道-B榜精度第18名 (Kilian_Di的团队:A榜[email pro

Di Chang 9 Nov 04, 2022
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E

郝翔 357 Jan 04, 2023