Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Overview

Legged Robots that Keep on Learning

Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, which contains code for training a simulated or real A1 quadrupedal robot to imitate various reference motions, pre-trained policies, and example training code for learning the policies.

animated

Project page: https://sites.google.com/berkeley.edu/fine-tuning-locomotion

Getting Started

  • Install MPC extension (Optional) python3 setup.py install --user

Install dependencies:

  • Install MPI: sudo apt install libopenmpi-dev
  • Install requirements: pip3 install -r requirements.txt

Training Policies in Simulation

To train a policy, run the following command:

python3 motion_imitation/run_sac.py \
--mode train \
--motion_file [path to reference motion, e.g., motion_imitation/data/motions/pace.txt] \
--int_save_freq 1000 \
--visualize
  • --mode can be either train or test.
  • --motion_file specifies the reference motion that the robot is to imitate (not needed for training a reset policy). motion_imitation/data/motions/ contains different reference motion clips.
  • --int_save_freq specifies the frequency for saving intermediate policies every n policy steps.
  • --visualize enables visualization, and rendering can be disabled by removing the flag.
  • --train_reset trains a reset policy, otherwise imitation policies will be trained according to the reference motions passed in.
  • adding --use_redq uses REDQ, otherwise vanilla SAC will be used.
  • the trained model, videos, and logs will be written to output/.

Evaluating and/or Fine-Tuning Trained Policies

We provide checkpoints for the pre-trained models used in our experiments in motion_imitation/data/policies/.

Evaluating a Policy in Simulation

To evaluate individual policies, run the following command:

python3 motion_imitation/run_sac.py \
--mode test \
--motion_file [path to reference motion, e.g., motion_imitation/data/motions/pace.txt] \
--model_file [path to imitation model checkpoint, e.g., motion_imitation/data/policies/pace.ckpt] \
--num_test_episodes [# episodes to test] \
--use_redq \
--visualize
  • --motion_file specifies the reference motion that the robot is to imitate motion_imitation/data/motions/ contains different reference motion clips.
  • --model_file specifies specifies the .ckpt file that contains the trained model motion_imitation/data/policies/ contains different pre-trained models.
  • --num_test_episodes specifies the number of episodes to run evaluation for
  • --visualize enables visualization, and rendering can be disabled by removing the flag.

Autonomous Training using a Pre-Trained Reset Controller

To fine-tune policies autonomously, add a path to a trained reset policy (e.g., motion_imitation/data/policies/reset.ckpt) and a (pre-trained) imitation policy.

python3 motion_imitation/run_sac.py \
--mode train \
--motion_file [path to reference motion] \
--model_file [path to imitation model checkpoint] \
--getup_model_file [path to reset model checkpoint] \
--use_redq \
--int_save_freq 100 \
--num_test_episodes 20 \
--finetune \
--real_robot
  • adding --finetune performs fine-tuning, otherwise hyperparameters for pre-training will be used.
  • adding --real_robot will run training on the real A1 (see below to install necessary packages for running the real A1). If this is omitted, training will run in simulation.

To run two SAC trainers, one learning to walk forward and one backward, add a reference and checkpoint for another policy and use the multitask flag.

python motion_imitation/run_sac.py \
--mode train \
--motion_file motion_imitation/data/motions/pace.txt \
--backward_motion_file motion_imitation/data/motions/pace_backward.txt \
--model_file [path to forward imitation model checkpoint] \
--backward_model_file [path to backward imitation model checkpoint] \
--getup_model_file [path to reset model checkpoint] \
--use_redq \
--int_save_freq 100 \
--num_test_episodes 20 \
--real_robot \
--finetune \
--multitask

Running MPC on the real A1 robot

Since the SDK from Unitree is implemented in C++, we find the optimal way of robot interfacing to be via C++-python interface using pybind11.

Step 1: Build and Test the robot interface

To start, build the python interface by running the following: bash cd third_party/unitree_legged_sdk mkdir build cd build cmake .. make Then copy the built robot_interface.XXX.so file to the main directory (where you can see this README.md file).

Step 2: Setup correct permissions for non-sudo user

Since the Unitree SDK requires memory locking and high-priority process, which is not usually granted without sudo, add the following lines to /etc/security/limits.conf:


   
     soft memlock unlimited

    
      hard memlock unlimited

     
       soft nice eip

      
        hard nice eip

      
     
    
   

You may need to reboot the computer for the above changes to get into effect.

Step 3: Test robot interface.

Test the python interfacing by running: 'sudo python3 -m motion_imitation.examples.test_robot_interface'

If the previous steps were completed correctly, the script should finish without throwing any errors.

Note that this code does not do anything on the actual robot.

Running the Whole-body MPC controller

To see the whole-body MPC controller in sim, run: bash python3 -m motion_imitation.examples.whole_body_controller_example

To see the whole-body MPC controller on the real robot, run: bash sudo python3 -m motion_imitation.examples.whole_body_controller_robot_example

Owner
Laura Smith
Laura Smith
ANN model for prediction a spatio-temporal distribution of supercooled liquid in mixed-phase clouds using Doppler cloud radar spectra.

VOODOO Revealing supercooled liquid beyond lidar attenuation Explore the docs » Report Bug · Request Feature Table of Contents About The Project Built

remsens-lim 2 Apr 28, 2022
AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation A pytorch-version implementation codes of paper:

11 Dec 13, 2022
This program automatically runs Python code copied in clipboard

CopyRun This program runs Python code which is copied in clipboard WARNING!! USE AT YOUR OWN RISK! NO GUARANTIES IF ANYTHING GETS BROKEN. DO NOT COPY

vertinski 4 Sep 10, 2021
Trafffic prediction analysis using hybrid models - Machine Learning

Hybrid Machine learning Model Clone the Repository Create a new Directory as assests and download the model from the below link Model Link To Start th

1 Feb 08, 2022
Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style

Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style [NeurIPS 2021] Official code to reproduce the results and data p

Yash Sharma 27 Sep 19, 2022
This project intends to use SVM supervised learning to determine whether or not an individual is diabetic given certain attributes.

Diabetes Prediction Using SVM I explore a diabetes prediction algorithm using a Diabetes dataset. Using a Support Vector Machine for my prediction alg

Jeff Shen 1 Jan 14, 2022
Sequence lineage information extracted from RKI sequence data repo

Pango lineage information for German SARS-CoV-2 sequences This repository contains a join of the metadata and pango lineage tables of all German SARS-

Cornelius Roemer 24 Oct 26, 2022
Fast, modular reference implementation and easy training of Semantic Segmentation algorithms in PyTorch.

TorchSeg This project aims at providing a fast, modular reference implementation for semantic segmentation models using PyTorch. Highlights Modular De

ycszen 1.4k Jan 02, 2023
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation

Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding PyTorch implementation for the Scalable Attentive Sentence-Pair Modeling vi

Microsoft 25 Dec 02, 2022
[NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

Efficient Graph Similarity Computation - (EGSC) This repo contains the source code and dataset for our paper: Slow Learning and Fast Inference: Effici

24 Dec 31, 2022
InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images

InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images Hong Wang, Yuexiang Li, Haimiao Zhang, Deyu Men

Hong Wang 4 Dec 27, 2022
PSPNet in Chainer

PSPNet This is an unofficial implementation of Pyramid Scene Parsing Network (PSPNet) in Chainer. Training Requirement Python 3.4.4+ Chainer 3.0.0b1+

Shunta Saito 76 Dec 12, 2022
⚓ Eurybia monitor model drift over time and securize model deployment with data validation

View Demo · Documentation · Medium article 🔍 Overview Eurybia is a Python library which aims to help in : Detecting data drift and model drift Valida

MAIF 172 Dec 27, 2022
A font family with a great monospaced variant for programmers.

Fantasque Sans Mono A programming font, designed with functionality in mind, and with some wibbly-wobbly handwriting-like fuzziness that makes it unas

Jany Belluz 6.3k Jan 08, 2023
The most simple and minimalistic navigation dashboard.

Navigation This project follows a goal to have simple and lightweight dashboard with different links. I use it to have my own self-hosted service dash

Yaroslav 23 Dec 23, 2022
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE - Real Time Video Interpolation arXiv | YouTube | Colab | Tutorial | Demo Table of Contents Introduction Collection Usage Evaluation Training and

hzwer 3k Jan 04, 2023
Experimenting with computer vision techniques to generate annotated image datasets from gameplay recordings automatically.

Experimenting with computer vision techniques to generate annotated image datasets from gameplay recordings automatically. The collected data will then be used to train a deep neural network that can

Martin Valchev 3 Apr 24, 2022
PyTorch implementation of Pointnet2/Pointnet++

Pointnet2/Pointnet++ PyTorch Project Status: Unmaintained. Due to finite time, I have no plans to update this code and I will not be responding to iss

Erik Wijmans 1.2k Dec 29, 2022
A Simple Example for Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env

Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env This repository implements a simple algorithm for imitation learning: DAGGER. In thi

Hao 66 Nov 23, 2022
Pytorch implementation of face attention network

Face Attention Network Pytorch implementation of face attention network as described in Face Attention Network: An Effective Face Detector for the Occ

Hooks 312 Dec 09, 2022