A working implementation of the Categorical DQN (Distributional RL).

Last update: Sep 20, 2022

Overview

Categorical DQN.

Implementation of the Categorical DQN as described in A Distributional Perspective on Reinforcement Learning (Bellemare, Dabney and Munos, 2017).
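For readers new to the method, the core of the algorithm is the projection of the Bellman-updated return distribution back onto a fixed set of atoms. The snippet below is a minimal, dependency-free sketch of that step for a single transition. It is not the code used in this repository: the function name `project_distribution`, its parameters and the default values of `gamma`, `v_min` and `v_max` are assumptions made for illustration, and a real implementation would be vectorised over a batch in PyTorch.

```python
import math


def project_distribution(probs, reward, done, gamma=0.99, v_min=-10.0, v_max=10.0):
    """Project the distribution of r + gamma * z onto the fixed support.

    probs  : next-state probabilities over the atoms (list of floats)
    reward : scalar reward for the transition
    done   : True if the transition ended the episode
    Hypothetical helper, written for illustration only.
    """
    n_atoms = len(probs)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    projected = [0.0] * n_atoms

    for j, p in enumerate(probs):
        z_j = v_min + j * delta_z
        # Bellman update of the atom; terminal transitions keep only the reward.
        tz = reward if done else reward + gamma * z_j
        # Clip to the support so the mass stays on valid atoms.
        tz = min(v_max, max(v_min, tz))
        # Fractional index of the updated atom, clamped against rounding error.
        b = min(max((tz - v_min) / delta_z, 0.0), float(n_atoms - 1))
        lower, upper = math.floor(b), math.ceil(b)
        if lower == upper:
            # The updated atom falls exactly on a support point.
            projected[lower] += p
        else:
            # Split the mass between the two neighbouring atoms.
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)
    return projected
```

The projected vector is the target for the cross-entropy loss against the predicted distribution of the chosen action. Greedy actions are still selected by the expected value, that is the sum of each atom multiplied by its probability.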

Thanks to @tudor-berariu for optimisation and training tricks and for catching two nasty bugs.

Dependencies

See the env export file for the full list of dependencies.

Install the game of Catch:

git clone https://github.com/floringogianu/gym_fast_envs
cd gym_fast_envs

pip install -r requirements.txt
pip install -e .

Install visdom for reporting: pip install visdom.

Training

First start the visdom server: python -m visdom.server. If you don't want to install or use visdom, deactivate the display_plots option in the configs.

Train the Categorical DQN with python main.py -cf configs/catch_categorical.yaml.

Train a DQN baseline with python main.py -cf configs/catch_dqn.yaml.

To Do

Migrate to PyTorch 0.2.0 (breaks compatibility with 0.1.12).
Add some training curves.
Run on Atari.
Add proper evaluation.

Results

The first row uses a batch size of 64, the second a batch size of 32. More seeds will be run and averaged for a better comparison. Atari results are in progress.


Owner

Florin Gogianu
