Reinforcement learning for self-driving in a 3D simulation

Overview

SelfDrive_AI

Reinforcement learning for self-driving in a 3D simulation (Created using UNITY-3D)

1. Requirements for the SelfDrive_AI Gym


You need Python 3.6 or later to run the simulation. (Note: the current environment is only supported on Windows.) You can also interact with the simulation directly by launching the exe file and driving with the W, A, S, and D keys.
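As a quick sanity check before launching anything, the two requirements above can be tested from Python. This is only a minimal sketch: the function name `check_requirements` is hypothetical and not part of this repository.

```python
import platform
import sys


def check_requirements(version_info=sys.version_info, system=platform.system()):
    """Return a list of problems; an empty list means the host looks suitable."""
    problems = []
    # The README asks for Python 3.6 or later
    if tuple(version_info[:2]) < (3, 6):
        problems.append("Python 3.6 or later is required")
    # The README states that the environment is only supported on Windows
    if system != "Windows":
        problems.append("the simulation environment is only supported on Windows")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("Warning:", problem)
```

The arguments default to the current interpreter and operating system, so calling `check_requirements()` with no arguments checks the machine you are on.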

Please follow the two links below to install Unity-Gym and Stable-Baselines3. You can also train the agent with your own reinforcement learning algorithms by following the OpenAI Gym structure (https://gym.openai.com/).

Install Unity-Gym

Install Stable-Baselines3

mlagents can be installed using pip:

$ python3 -m pip install mlagents
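For anyone writing a custom algorithm, the OpenAI Gym structure mentioned above comes down to two calls: `reset()` returns the first observation, and `step(action)` returns the next observation, a reward, a done flag, and an info dictionary. The sketch below illustrates that loop with a toy stand-in so it runs without Unity. `ToyDrivingEnv` and `run_episode` are hypothetical names, and the toy dynamics have nothing to do with the real simulation.

```python
import random


class ToyDrivingEnv:
    """Hypothetical stand-in exposing the classic Gym reset()/step() interface."""

    def __init__(self, track_length=5):
        self.track_length = track_length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Action 1 moves the car forward; any other action keeps it in place
        if action == 1:
            self.position += 1
        done = self.position >= self.track_length
        reward = 1.0 if done else 0.0
        return self.position, reward, done, {}


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return (total_reward, steps_taken)."""
    obs = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
        if done:
            return total_reward, step
    return total_reward, max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    # A random policy; a custom algorithm would replace this callable
    print(run_episode(ToyDrivingEnv(), lambda obs: rng.choice([0, 1])))
```

A custom algorithm only needs to supply the `policy` callable (and its own update rule); the environment wrapped by `UnityToGymWrapper` can then take the place of the toy class.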

The image below illustrates the goal of the AIcar: the car needs to explore all the trajectories to find the bridge first.

2. (Training) You can train the agent using the code below, which follows the OpenAI Gym structure. It saves the training results to a log directory that you can view with TensorBoard. Feel free to change the parameters in the code.

import os
import time

from gym_unity.envs import UnityToGymWrapper
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.policies import ActorCriticPolicy
from stable_baselines3.common.vec_env import DummyVecEnv

# Side channel used to configure the Unity engine (e.g. the simulation speed)
channel = EngineConfigurationChannel()


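env_name = "./UnityEnv"  # path to the Unity environment build
speed = 15  # time scale; values above 1 run the simulation faster than real time


env = UnityEnvironment(env_name, seed=1, side_channels=[channel])
channel.set_configuration_parameters(time_scale=speed)
env = UnityToGymWrapper(env, uint8_visual=False)  # expose the Unity environment through the OpenAI Gym interface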

time_int = int(time.time())

# Directories for storing results
log_dir = "stable_results/Euler_env_3{}/".format(time_int)
log_dirTF = "stable_results/tensorflow_log_Euler3{}/".format(time_int)
os.makedirs(log_dir, exist_ok=True)

env = Monitor(env, log_dir, allow_early_resets=True)
env = DummyVecEnv([lambda: env])  # The algorithms require a vectorized environment to run


model = PPO(ActorCriticPolicy, env, verbose=1, tensorboard_log=log_dirTF, device='cuda')  # use device='auto' if no GPU is available
model.learn(total_timesteps=200000)  # you can change the number of training timesteps
time_int2 = int(time.time())
print('Time taken for training (s):', time_int2 - time_int)
# Save the model
model.save("Env_model")


# Load the saved model for testing
# del model
model = PPO.load("Env_model")

obs = env.reset()

# Test the agent for 400 steps after training

for i in range(400):
    action, states = model.predict(obs)
    obs, rewards, done, info = env.step(action)
    env.render()



To monitor training progress with TensorBoard, run the following command from a terminal, passing the path to the TensorBoard log directory (log_dirTF in the training code):

$ tensorboard --logdir "HERE PUT THE PATH TO THE DIRECTORY"
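Because the training code puts a timestamp in each log directory name, it can be handy to look up the most recent one instead of typing the path by hand. The helper below is a minimal sketch; `latest_log_dir` is a hypothetical name, and the defaults assume the `stable_results/tensorflow_log_...` layout used in the training code.

```python
import os


def latest_log_dir(root="stable_results", prefix="tensorflow_log_"):
    """Return the most recently modified log directory under root, or None."""
    if not os.path.isdir(root):
        return None
    # Keep only sub-directories whose name starts with the TensorBoard prefix
    candidates = [
        os.path.join(root, name)
        for name in os.listdir(root)
        if name.startswith(prefix) and os.path.isdir(os.path.join(root, name))
    ]
    # Pick the candidate with the newest modification time
    return max(candidates, key=os.path.getmtime) if candidates else None


if __name__ == "__main__":
    print(latest_log_dir())
```

The printed path can then be passed to `tensorboard --logdir`.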

Glimpse from the simulation environment

3. (Testing) The following code can be used to test the trained agent.

import os
import time

from gym_unity.envs import UnityToGymWrapper
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import DummyVecEnv

# Side channel used to configure the Unity engine (e.g. the simulation speed)
channel = EngineConfigurationChannel()


env_name = "./UnityEnv"  # path to the Unity environment build
speed = 1  # time scale; 1 runs the simulation in real time


env = UnityEnvironment(env_name, seed=1, side_channels=[channel])
channel.set_configuration_parameters(time_scale=speed)
env = UnityToGymWrapper(env, uint8_visual=False)  # expose the Unity environment through the OpenAI Gym interface

time_int = int(time.time())

# Directories for storing results
log_dir = "stable_results/Euler_env_3{}/".format(time_int)
log_dirTF = "stable_results/tensorflow_log_Euler3{}/".format(time_int)
os.makedirs(log_dir, exist_ok=True)

env = Monitor(env, log_dir, allow_early_resets=True)
env = DummyVecEnv([lambda: env])  # The algorithms require a vectorized environment to run


model = PPO.load("Env_model")

obs = env.reset()

# Test the agent for 1000 steps after training

for i in range(1000):
    action, states = model.predict(obs)
    obs, rewards, done, info = env.step(action)
    env.render()
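The test loop above only renders the agent. To get a number out of a test run, the per-step rewards can be grouped into per-episode totals using the done flags. This is a small sketch, not part of the original code; `episode_returns` is a hypothetical helper.

```python
def episode_returns(rewards, dones):
    """Split per-step rewards into per-episode totals using the done flags."""
    returns = []
    running = 0.0
    for reward, done in zip(rewards, dones):
        running += float(reward)
        if done:
            # The episode ended on this step: store its total and start over
            returns.append(running)
            running = 0.0
    # Rewards of an unfinished final episode are intentionally dropped
    return returns


if __name__ == "__main__":
    print(episode_returns([1, 0, 2, 3], [False, True, False, True]))
```

Since `DummyVecEnv` returns arrays with one entry per environment, one way to use this with the loop above is to append `rewards[0]` and `done[0]` to two lists on every step and pass those lists to the helper afterwards.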

Note: I am still developing the project by introducing more challenging constraints.

Owner
Surajit Saikia
Roboticist | PhD in AI | Deep learning, Reinforcement learning and Computer Vision.