TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning

Last update: Dec 26, 2022

Overview

Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

This implementation contains:

Deep Q-network and Q-learning
Experience replay memory
- to reduce the correlations between consecutive updates
The network used for Q-learning targets is held fixed between periodic updates
- to reduce the correlations between target and predicted Q-values
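
The two stabilizing components listed above can be sketched as follows. This is a minimal, framework-free illustration and not the code from this repository: the names `ReplayMemory`, `q_learning_target`, and `sync_target`, the buffer capacity, and the discount factor `gamma` are all assumptions chosen for the example.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """One-step target computed from the *target* network's Q-values.

    Returns r for terminal steps, else r + gamma * max_a' Q_target(s', a').
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_weights, target_weights):
    """Copy the online weights into the target network.

    Hypothetical representation: each network is a dict of name -> list of floats.
    Calling this only every N training steps keeps the targets fixed in between.
    """
    target_weights.clear()
    target_weights.update({name: list(w) for name, w in online_weights.items()})
```

In a training loop, transitions would be added to the memory at every environment step, a batch sampled for each gradient update, and `sync_target` called at a fixed interval.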

Requirements

Python 2.7 or Python 3.3+
gym
tqdm
SciPy or OpenCV2
TensorFlow 0.12.0

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Results after training for 24 hours on a GTX 980 Ti.

Simple Results

Details of Breakout with model m2 (red) after 30 hours on a GTX 980 Ti.

Details of Breakout with model m3 (red) after 30 hours on a GTX 980 Ti.

Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay

[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay

[1] & [2]
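
For readers unfamiliar with the term, action-repeat (frame-skip) means the agent picks an action once and the environment applies it for several consecutive frames, with the rewards summed. The sketch below illustrates the idea under stated assumptions: `step_with_action_repeat` and the toy `CountingEnv` are hypothetical names, and `env_step` is assumed to return `(observation, reward, done)`. It is not taken from this repository.

```python
def step_with_action_repeat(env_step, action, repeat):
    """Apply `action` for `repeat` consecutive frames and sum the rewards.

    `env_step` is a hypothetical callable: action -> (observation, reward, done).
    """
    total_reward = 0.0
    observation, done = None, False
    for _ in range(repeat):
        observation, reward, done = env_step(action)
        total_reward += reward
        if done:
            # Stop early so no frames are stepped past the end of an episode.
            break
    return observation, total_reward, done


class CountingEnv:
    """Toy stand-in for an emulator: reward 1 per frame, ends after `length` frames."""

    def __init__(self, length):
        self.length = length
        self.frame = 0

    def step(self, action):
        self.frame += 1
        return self.frame, 1.0, self.frame >= self.length
```

With `repeat=1` the agent decides on every frame; with `repeat=4` it decides once every four frames, which shortens the effective horizon the agent has to learn over.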

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping are not implemented as they are in the paper.
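
The variants compared in [3] differ mainly in how the learning target and the Q-value head are computed. The following is a rough sketch based on the commonly published formulations of Double DQN and Dueling DQN, not on this repository's code; all function names and the default `gamma` are assumptions for illustration.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

A Dueling DDQN would combine both ideas: Q-values produced by the dueling head, and targets computed with `double_dqn_target`.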

[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay

[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay

License

MIT License.

Owner

Devsisters Corp.
