A custom-designed Spider Robot trained to walk using Deep RL in a PyBullet Simulation

Overview

SpiderBot_DeepRL

Title: Implementation of Single and Multi-Agent Deep Reinforcement Learning Algorithms for a Walking Spider Robot

Author(s): Arijit Dasgupta, Chong Yu Quan

Welcome to our project! We aim to make our SpiderBot walk using deep reinforcement learning. The code is written entirely in Python 3.7.7 and requires the following Python libraries:

pybullet==3.0.6
numpy==1.18.5
matplotlib==3.3.2
tensorflow_probability==0.11.1
seaborn==0.11.0
pandas==1.1.4
tensorflow==2.3.1
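
To check an environment against these pins before training, a small helper along the following lines may be useful. It is only a sketch: `parse_pins` and `find_mismatches` are hypothetical names that are not part of the repository, and the installed versions are passed in as a plain dictionary so that the example does not depend on any particular packaging tool.

```python
def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dictionary."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (wanted, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, wanted in pins.items():
        found = installed.get(name)  # None when the package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# The pinned versions listed above.
PINS = parse_pins("""
pybullet==3.0.6
numpy==1.18.5
matplotlib==3.3.2
tensorflow_probability==0.11.1
seaborn==0.11.0
pandas==1.1.4
tensorflow==2.3.1
""")
```

The `installed` dictionary can be built from whatever tool reports installed packages. Package names may need normalising first, since some tools report hyphens where the pins use underscores.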

Other than these, no additional software is needed. The PyBullet physics engine runs the simulation and displays it through an OpenGL GUI. The repository contains the following:

  • A requirement.txt for required python libraries
  • SolidWorks CADs of the SpiderBot
  • URDFs of the SpiderBot
  • Folders for Training Logs & Plots
  • Two saved models of the SpiderBot Agent
  • Source Code for the Deep RL Implementation
  • Training Code to train the SpiderBot with Deep RL
  • Validation Code to test trained models
  • Postprocessing Code to generate plots of training

The code supports the following 5 algorithms (with their characteristics defined):

| Algorithm | Agent (Actor) | Policy Learning | Network | Actions per Time-Step | Action Space | State Space |
|-----------|--------------------------|-----------------|----------|-----------------------|--------------|-------------|
| MAD3QN | Multiple (Decentralised) | Decentralised | Separate | Multiple | Discrete | Continuous |
| MAA2C | Multiple (Decentralised) | Decentralised | Separate | Multiple | Discrete | Continuous |
| A2CMA | Single (Centralised) | Decentralised | Hybrid | Multiple | Discrete | Continuous |
| A2CSA | Single (Centralised) | Centralised | Hybrid | Single | Discrete | Continuous |
| DDPG | Single (Centralised) | Centralised | Separate | Multiple | Continuous | Continuous |

We will now walk through the folders and files.

Folders

SpiderBot_CADs

This folder contains all the part and assembly files for the SpiderBot. There are options for 3-legged, 4-legged, 6-legged & 8-legged SpiderBots.

SpiderBot_URDFs

This folder contains all URDF files and associated STL files for the SpiderBot. There are options for 3-legged, 4-legged, 6-legged & 8-legged SpiderBots.

Training_Logs & Training_Plots

These folders store the CSV files of training data and the PDF plots of training.

Saved_Models

This folder contains two saved models trained with DDPG. The FullyTrained model (375 episodes) walks well and covers up to 9 metres in the forward direction. The PartiallyTrained model (50 episodes) moves forward only slightly.

Source Code

SpiderBot_Environment.py

This file contains the p_gym class, which uses PyBullet to load the plane environment (no obstacles) and the SpiderBot into the physics engine. It allows an agent to retrieve state observations for a single leg or for the whole SpiderBot and to set target velocities for the SpiderBot's joints. Finally, it uses information from the physics engine to determine the reward for each time step.

SpiderBot_Neural_Network.py

This file contains the classes for the fully-connected neural networks. The networks are built with the TensorFlow 2 API and are customised according to the algorithm and the number of SpiderBot legs. Each class also has a call method that performs a forward pass through the network.

SpiderBot_Agent.py

This is a long file that contains all the operations of the agent for all 5 algorithms. The constructor initialises the neural networks according to the chosen algorithm. The class can update the target networks for DDPG & MAD3QN, and it has a separate method to apply gradients for each of the algorithms. These methods use the TensorFlow 2 computational graph and gradient tapes to backpropagate the loss. Finally, the class can save and load all models.
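
The target network updates can be illustrated with a short sketch. The hyperparameter config exposes both `tau` and `update_target`, which suggests a soft (Polyak) update and a periodic hard copy respectively, but which algorithm uses which is an assumption on our part here. The function names are hypothetical, and plain lists of floats stand in for the network weights.

```python
def soft_update(target_weights, online_weights, tau=0.005):
    """Move each target weight a fraction tau towards its online counterpart."""
    if len(target_weights) != len(online_weights):
        raise ValueError("weight lists must have the same length")
    return [
        tau * online + (1.0 - tau) * target
        for target, online in zip(target_weights, online_weights)
    ]


def hard_update(online_weights):
    """Copy the online weights into the target network in one step."""
    return list(online_weights)
```

With a small `tau` such as 0.005 the target network trails the online network slowly, which is the usual motivation for keeping a separate target network.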

SpiderBot_Replay_Buffer.py

This file contains the replay_buffer class that handles experience replay storage and operations like logging and sampling with a batch size.
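
As a rough illustration of experience replay, the sketch below stores transitions in a bounded queue and samples uniformly at random. It is not the repository's replay_buffer class: the class name, method names and transition layout are assumptions made for this example, and only the standard library is used.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, max_mem_size, seed=None):
        # A deque with maxlen discards the oldest transition once full.
        self._memory = deque(maxlen=max_mem_size)
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._memory)

    def log(self, state, action, reward, next_state, done):
        """Store one transition."""
        self._memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return a uniformly sampled batch as five parallel lists."""
        if batch_size > len(self._memory):
            raise ValueError("not enough transitions stored for this batch size")
        batch = self._rng.sample(list(self._memory), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (list(states), list(actions), list(rewards),
                list(next_states), list(dones))
```

Training would typically start only once the buffer holds at least `batch_size` transitions.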

SpiderBot_Walk.py

This file contains the walk function, which handles all training operations and is where all the classes interact. It loops through the episodes to train the SpiderBot. Training data is logged and saved as a CSV file in the Training_Logs folder, while the best models are saved to the Saved_Models folder during training.
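
The overall shape of such a training loop might look like the sketch below. The interfaces are hypothetical: `env.reset()`, `env.step(action)` returning `(next_state, reward, done)`, and `agent.choose_action`, `agent.learn` and `agent.save` are names assumed for illustration and may differ from the actual classes. A plain list stands in for the replay buffer.

```python
def walk(env, agent, buffer, episodes, batch_size, save_best_model=True):
    """Run training episodes and return a per-episode reward log."""
    best_reward = float("-inf")
    log = []
    for episode in range(episodes):
        state = env.reset()
        done = False
        episode_reward = 0.0
        while not done:
            action = agent.choose_action(state)
            next_state, reward, done = env.step(action)
            buffer.append((state, action, reward, next_state, done))
            # Learn only once enough experience has been collected.
            if len(buffer) >= batch_size:
                agent.learn(buffer, batch_size)
            state = next_state
            episode_reward += reward
        log.append({"episode": episode, "reward": episode_reward})
        # Keep the model from the best episode seen so far.
        if save_best_model and episode_reward > best_reward:
            best_reward = episode_reward
            agent.save()
    return log
```

The returned log is a list of dictionaries, so it can be written out as CSV rows once training ends.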

SpiderBot_Postprocessing.py

This file handles post-processing: it reads the CSV file from the Training_Logs folder, plots the training data and saves the plots to the Training_Plots folder.
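
Episode rewards are usually noisy, so a common first post-processing step is to smooth them before plotting. The sketch below uses only the standard library. The column name `reward` and both function names are assumptions for illustration, and the actual log layout may differ.

```python
import csv


def load_rewards(csv_file, column="reward"):
    """Read one numeric column from an open CSV file object."""
    return [float(row[column]) for row in csv.DictReader(csv_file)]


def moving_average(values, window):
    """Trailing moving average; early entries average over fewer points."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averaged = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        averaged.append(running / min(i + 1, window))
    return averaged
```

The smoothed series can then be passed to whichever plotting library is in use.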

Main Code

SpiderBot_Train_Model.py

This file allows the user to set up the training session. In this file, the user can set 3 levels of configuration for training. The general config section has options for choosing algorithms, number of legs, target location, episodes etc. The Hyperparameters config section handles all hyperparameters of the entire training process. The reward structure config provides options for all the scalar rewards. The user must set all of these configs and run the file to train the SpiderBot. TIP: not using a GUI is faster for training, especially if you use a CUDA-enabled NVIDIA GPU.

SpiderBot_Validation.py

This file allows the user to validate and test a trained model. It was made specially for the Professors and TAs of SpiderBot to visualise our fully trained model.

How to train a model?

Unzip the SpiderBot_URDFS.zip file into the same directory, then open SpiderBot_Train_Model.py for editing. The most important parameter, which you must define, is training_name. It is unique to a training session, and all saved models, logs and plots are named after it. After that, set up your General Config:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~ GENERAL CONFIG ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
training_name = "insert_training_name_here"
model = "DDPG"
num_of_legs = 8 
episodes = 375
target_location = 3
use_GUI = True
do_post_process = True
save_best_model = True
save_data = True

Following that, set up the configurations for the hyperparameters:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~ HYPERPARAMETER CONFIG ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
time_step_size = 120./240
upper_angle = 60
lower_angle = -60
lr_actor = 0.00005
lr_critic = 0.0001
discount_rate = 0.9
update_target = None
tau = 0.005
max_mem_size = 1000000
batch_size = 512
max_action = 10
min_action = -10
noise = 1
epsilon = 1
epsilon_decay = 0.0001
epsilon_min = 0.01
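
To give a feel for how the exploration hyperparameters could interact, here is a minimal sketch. It assumes a linear epsilon decay applied once per step and Gaussian action noise clipped to the action bounds. Both are assumptions made for illustration, as are the function names, and the training code may schedule exploration differently.

```python
import random


def decay_epsilon(epsilon, epsilon_decay=0.0001, epsilon_min=0.01):
    """Reduce epsilon by a fixed amount, never going below epsilon_min."""
    return max(epsilon_min, epsilon - epsilon_decay)


def noisy_action(action, noise=1.0, min_action=-10.0, max_action=10.0, rng=random):
    """Add zero-mean Gaussian noise to an action, then clip it to the bounds."""
    perturbed = action + rng.gauss(0.0, noise)
    return min(max_action, max(min_action, perturbed))
```

Under this linear assumption, epsilon would fall from 1 to 0.01 in (1 - 0.01) / 0.0001 = 9,900 steps.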

Finally, set up the configuration for the reward structure:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~ REWARD STRUCTURE CONFIG ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
forward_motion_reward = 500
forward_distance_reward = 250
sideways_velocity_punishment = 500
sideways_distance_penalty = 250
time_step_penalty = 1
flipped_penalty = 500
goal_reward = 500
out_of_range_penalty = 500
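
One way scalars like these could be combined into a per-step reward is sketched below. The README only names the scalars, so the way they are weighted and summed is an assumption for illustration, as are the function and argument names. The real reward logic lives in SpiderBot_Environment.py.

```python
REWARDS = {
    "forward_motion_reward": 500,
    "forward_distance_reward": 250,
    "sideways_velocity_punishment": 500,
    "sideways_distance_penalty": 250,
    "time_step_penalty": 1,
    "flipped_penalty": 500,
    "goal_reward": 500,
    "out_of_range_penalty": 500,
}


def step_reward(forward_velocity, forward_distance,
                sideways_velocity, sideways_distance,
                flipped=False, reached_goal=False, out_of_range=False,
                cfg=REWARDS):
    """Hypothetical weighted sum of the reward terms for one time step."""
    reward = cfg["forward_motion_reward"] * forward_velocity
    reward += cfg["forward_distance_reward"] * forward_distance
    # Sideways drift is penalised regardless of direction.
    reward -= cfg["sideways_velocity_punishment"] * abs(sideways_velocity)
    reward -= cfg["sideways_distance_penalty"] * abs(sideways_distance)
    reward -= cfg["time_step_penalty"]
    if flipped:
        reward -= cfg["flipped_penalty"]
    if reached_goal:
        reward += cfg["goal_reward"]
    if out_of_range:
        reward -= cfg["out_of_range_penalty"]
    return reward
```

In this sketch a stationary, upright robot loses only the time-step penalty each step, which nudges the agent to reach the goal quickly.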

Then run the Python script:

> python SpiderBot_Train_Model.py

How to Validate/Test our Models?

To test the fully trained model, just run SpiderBot_Validation.py.

> python SpiderBot_Validation.py

If you wish to run the other saved model, the partially trained one, you can open up SpiderBot_Validation.py and edit the training_name from DDPG_FullyTrained to DDPG_PartiallyTrained in the config section as shown:

#~~~~~~~~~~~~ VALIDATION CONFIG SETUP ~~~~~~~~~~~~#
training_name = "DDPG_PartiallyTrained"
model = "DDPG"
target_location = 8
episodes = 100000000000 # A large number is set to put the simulation on loop
