Model-based Reinforcement Learning Improves Autonomous Racing Performance

Overview

Racing Dreamer: Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars

In this work, we propose to learn a racing controller directly from raw Lidar observations.

The resulting policy has been evaluated on F1tenth-like tracks and then transfered to real cars.

Racing Dreamer

The free version is available on arXiv.

If you find this code useful, please reference in your paper:

@misc{brunnbauer2021modelbased,
      title={Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars}, 
      author={Axel Brunnbauer and Luigi Berducci and Andreas Brandstätter and Mathias Lechner and Ramin Hasani and Daniela Rus and Radu Grosu},
      year={2021},
      eprint={2103.04909},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

This repository is organized as follows:

  • Folder dreamer contains the code related to the Dreamer agent.
  • Folder baselines contains the code related to the Model Free algorihtms (D4PG, MPO, PPO, LSTM-PPO, SAC).
  • Folder ros_agent contains the code related to the transfer on real racing cars.
  • Folder docs contains the track maps, mechanical and general documentation.

Dreamer

"Dreamer learns a world model that predicts ahead in a compact feature space. From imagined feature sequences, it learns a policy and state-value function. The value gradients are backpropagated through the multi-step predictions to efficiently learn a long-horizon policy."

This implementation extends the original implementation of Dreamer (Hafner et al. 2019).

We refer the reader to the Dreamer website for the details on the algorithm.

Dreamer

Instructions

This code has been tested on Ubuntu 18.04 with Python 3.7.

Get dependencies:

pip install --user -r requirements.txt

Training

We train Dreamer on LiDAR observations and propose two Reconstruction variants: LiDAR and Occupancy Map.

Reconstruction Variants

Train the agent with LiDAR reconstruction:

python dreamer/dream.py --track columbia --obs_type lidar

Train the agent with Occupancy Map reconstruction:

python dream.py --track columbia --obs_type lidar_occupancy

Please, refer to dream.py for the other command-line arguments.

Offline Evaluation

The evaluation module runs offline testing of a trained agent (Dreamer, D4PG, MPO, PPO, SAC).

To run evaluation, assuming to have the dreamer directory in the PYTHONPATH:

python evaluations/run_evaluation.py --agent dreamer \
                                     --trained_on austria \
                                     --obs_type lidar \
                                     --checkpoint_dir logs/checkpoints \
                                     --outdir logs/evaluations \
                                     --eval_episodes 10 \
                                     --tracks columbia barcelona 

The script will look for all the checkpoints with pattern logs/checkpoints/austria_dreamer_lidar_* The checkpoint format depends on the saving procedure (pkl, zip or directory).

The results are stored as tensorflow logs.

Plotting

The plotting module containes several scripts to visualize the results, usually aggregated over multiple experiments.

To plot the learning curves:

python plotting/plot_training_curves.py --indir logs/experiments \
                                                --outdir plots/learning_curves \
                                                --methods dreamer mpo \
                                                --tracks austria columbia treitlstrasse_v2 \
                                                --legend

It will produce the comparison between Dreamer and MPO on the tracks Austria, Columbia, Treitlstrasse_v2.

To plot the evaluation results:

python plotting/plot_test_evaluation.py --indir logs/evaluations \
                                                --outdir plots/evaluation_charts \
                                                --methods dreamer mpo \
                                                --vis_tracks austria columbia treitlstrasse_v2 \
                                                --legend

It will produce the bar charts comparing Dreamer and MPO evaluated in Austria, Columbia, Treitlstrasse_v2.

Instructions with Docker

We also provide an docker image based on tensorflow:2.3.1-gpu. You need nvidia-docker to run them, see here for more details.

To build the image:

docker build -t dreamer .

To train Dreamer within the container:

docker run -u $(id -u):$(id -g) -v $(pwd):/src --gpus all --rm dreamer python dream.py --track columbia --steps 1000000

Model Free

The organization of Model-Free codebase is similar and we invite the users to refer to the README for the detailed instructions.

Hardware

The codebase for the implementation on real cars is contained in ros_agent.

Additional material:

  • Folder docs/maps contains a collection of several tracks to be used in F1Tenth races.
  • Folder docs/mechanical contains support material for real world race-tracks.
Owner
Cyber Physical Systems - TU Wien
Cyber Physical Systems - TU Wien
Sematic-Segmantation - Semantic Segmentation on MIT ADE20K dataset in PyTorch

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch impleme

Berat Eren Terzioğlu 4 Mar 22, 2022
TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

This project is a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and i

yifan liu 147 Dec 03, 2022
sense-py-AnishaBaishya created by GitHub Classroom

Compute Statistics Here we compute statistics for a bunch of numbers. This project uses the unittest framework to test functionality. Pass the tests T

1 Oct 21, 2021
Chinese clinical named entity recognition using pre-trained BERT model

Chinese clinical named entity recognition (CNER) using pre-trained BERT model Introduction Code for paper Chinese clinical named entity recognition wi

Xiangyang Li 109 Dec 14, 2022
IAUnet: Global Context-Aware Feature Learning for Person Re-Identification

IAUnet This repository contains the code for the paper: IAUnet: Global Context-Aware Feature Learning for Person Re-Identification Ruibing Hou, Bingpe

30 Jul 14, 2022
Deep Q-Learning Network in pytorch (not actively maintained)

pytoch-dqn This project is pytorch implementation of Human-level control through deep reinforcement learning and I also plan to implement the followin

Hung-Tu Chen 342 Jan 01, 2023
Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR 2022)

The Official Implementation of CLIB (Continual Learning for i-Blurry) Online Continual Learning on Class Incremental Blurry Task Configuration with An

NAVER AI 34 Oct 26, 2022
Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022
Pytorch and Torch testing code of CartoonGAN

CartoonGAN-Test-Pytorch-Torch Pytorch and Torch testing code of CartoonGAN [Chen et al., CVPR18]. With the released pretrained models by the authors,

Yijun Li 642 Dec 27, 2022
基于Paddlepaddle复现yolov5,支持PaddleDetection接口

PaddleDetection yolov5 https://github.com/Sharpiless/PaddleDetection-Yolov5 简介 PaddleDetection飞桨目标检测开发套件,旨在帮助开发者更快更好地完成检测模型的组建、训练、优化及部署等全开发流程。 PaddleD

36 Jan 07, 2023
Makes patches from huge resolution .svs slide files using openslide

openslide_patcher Makes patches from huge resolution .svs slide files using openslide Example collage I made from outputs:

2 Dec 23, 2021
OCR-D wrapper for detectron2 based segmentation models

ocrd_detectron2 OCR-D wrapper for detectron2 based segmentation models Introduction Installation Usage OCR-D processor interface ocrd-detectron2-segm

Robert Sachunsky 13 Dec 06, 2022
Official Code Implementation of the paper : XAI for Transformers: Better Explanations through Conservative Propagation

Official Code Implementation of The Paper : XAI for Transformers: Better Explanations through Conservative Propagation For the SST-2 and IMDB expermin

Ameen Ali 23 Dec 30, 2022
Predicting Tweet Sentiment Maching Learning and streamlit

Predicting-Tweet-Sentiment-Maching-Learning-and-streamlit (I prefere using Visual Studio Code ) Open the folder in VS Code Run the first cell in requi

1 Nov 20, 2021
dualFace: Two-Stage Drawing Guidance for Freehand Portrait Sketching (CVMJ)

dualFace dualFace: Two-Stage Drawing Guidance for Freehand Portrait Sketching (CVMJ) We provide python implementations for our CVM 2021 paper "dualFac

Haoran XIE 46 Nov 10, 2022
A GUI for Face Recognition, based upon Docker, Tkinter, GPU and a camera device.

Face Recognition GUI This repository is a GUI version of Face Recognition by Adam Geitgey, where e.g. Docker and Tkinter are utilized. All the materia

Kasper Henriksen 6 Dec 05, 2022
Reporting and Visualization for Hazardous Events

Reporting and Visualization for Hazardous Events

Jv Kyle Eclarin 2 Oct 03, 2021
Playable Video Generation

Playable Video Generation Playable Video Generation Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci Paper: ArX

Willi Menapace 136 Dec 31, 2022
Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

DongGeun-Yoon 19 Jun 07, 2022
Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System This repository contains code for the paper Schultheis,

2 Oct 28, 2022