Offcial repository for the IEEE ICRA 2021 paper Auto-Tuned Sim-to-Real Transfer.

Overview

Auto-Tuned Sim-to-Real Transfer

Offcial repository for the IEEE ICRA 2021 paper Auto-Tuned Sim-to-Real Transfer. The paper will be released shortly on arXiv.

This repository was forked from the CURL codebase.

Installation

Install mujoco, if it is not already installed.

Add this to bashrc:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/olivia/.mujoco/mujoco200/bin

Apt-install these packages:

sudo apt-get install libosmesa6-dev
sudo apt-get install patchelf

All of the dependencies are in the conda_env.yml file. They can be installed manually or with the following command:

conda env create -f conda_env.yml

Enter the environments directory and run

pip install -e .

Instructions

Here is an example experiment run command.

CUDA_VISIBLE_DEVICES=0 python train.py --gpudevice 0 --id S3000 --outer_loop_version 3 --dr --start_outer_loop 5000 --train_sim_param_every 1 --prop_alpha --update_sim_param_from both --alpha 0.1 --mean_scale 1.75 --train_range_scale .5 --domain_name dmc_ball_in_cup --task_name catch --action_repeat 4 --range_scale .5 --scale_large_and_small --dr_option simple_dr --save_tb --use_img --encoder_type pixel --num_eval_episodes 1 --seed 1 --num_train_steps 1000000 --encoder_feature_dim 64 --num_layers 4 --num_filters 32 --sim_param_layers 2 --sim_param_units 400 --sim_param_lr .001 --prop_range_scale --prop_train_range_scale --separate_trunks --num_sim_param_updates 3 --save_video --eval_freq 2000 --num_eval_episodes 3 --save_model --save_buffer --no_train_policy
--outer_loop_version refers to the method by which we update simulation parameters. 1 means we update with regression, and 3 means binary classifier.
--scale_large_and_small means that half of the mean values in our simulation randomization will be randomly chosen to be too large, and the other half will be too small. If this flag is not provided, they will all be too large.
--mean_scale refers to the mean of the simulator distribution. A mean of k means that all simulation parameters are k times or 1/k times the true mean (randomly chosen for each param).
--range_scale refers to the range of the uniform distribution we use to collect samples to train the policy.
--train_range_scale refers to the range of the uniform distribution we use to collect samples to train the Search Param Model. This value is typically set >= to --range_scale.
--prop_range_scale and --prop_train_range_scale make the distribution ranges a scale multiple of the mean value rather than constants.

Check train.py for a full list of run commands.

During training, in your console, you should see printouts that look like:

| train | E: 221 | S: 28000 | D: 18.1 s | R: 785.2634 | BR: 3.8815 | A_LOSS: -305.7328 | CR_LOSS: 190.9854 | CU_LOSS: 0.0000
| train | E: 225 | S: 28500 | D: 18.6 s | R: 832.4937 | BR: 3.9644 | A_LOSS: -308.7789 | CR_LOSS: 126.0638 | CU_LOSS: 0.0000
| train | E: 229 | S: 29000 | D: 18.8 s | R: 683.6702 | BR: 3.7384 | A_LOSS: -311.3941 | CR_LOSS: 140.2573 | CU_LOSS: 0.0000
| train | E: 233 | S: 29500 | D: 19.6 s | R: 838.0947 | BR: 3.7254 | A_LOSS: -316.9415 | CR_LOSS: 136.5304 | CU_LOSS: 0.0000

Log abbreviation mapping:

train - training episode
E - total number of episodes 
S - total number of environment steps
D - duration in seconds to train 1 episode
R - mean episode reward
BR - average reward of sampled batch
A_LOSS - average loss of actor
CR_LOSS - average loss of critic
CU_LOSS - average loss of the CURL encoder

All data related to the run is stored in the specified working_dir. To enable model or video saving, use the --save_model or --save_video flags. For all available flags, inspect train.py. To visualize progress with tensorboard run:

tensorboard --logdir log --port 6006

and go to localhost:6006 in your browser. If you're running headlessly, try port forwarding with ssh.

For GPU accelerated rendering, make sure EGL is installed on your machine and set export MUJOCO_GL=egl. For environment troubleshooting issues, see the DeepMind control documentation.

Debugging common installation errors

Error message ERROR: GLEW initalization error: Missing GL version

  • Make sure /usr/lib/x86_64-linux-gnu/libGLEW.so and /usr/lib/x86_64-linux-gnu/libGL.so exist. If not, apt-install them.
  • Try trying adding the powerset of those two paths to LD_PRELOAD.

Error Shadow framebuffer is not complete, error 0x8cd7

  • Like above, make sure libglew and libGL are installed.
  • If /usr/lib/nvidia exists but '/usr/lib/nvidia-430/(or some other number) does not exist, runln -s /usr/lib/nvidia /usr/lib/nvidia-430`. It may have to match the number of your nvidia driver, I'm not sure.
  • Consider adding that symlink to LD_LIBRARY PATH.
  • If /usr/lib/nvidia doesn't exist, and neither does /usr/lib/nvidia-xxx, then create the folder /usr/lib/nvidia /usr/lib/nvidia-430.

Error message `RuntimeError: Failed to initialize OpenGL:

  • Make sure MUJOCO_GL is correct (egl for DMC, osmesa for anything else).
GraPE is a Rust/Python library for high-performance Graph Processing and Embedding.

GraPE GraPE (Graph Processing and Embedding) is a fast graph processing and embedding library, designed to scale with big graphs and to run on both of

AnacletoLab 194 Dec 29, 2022
Large-scale Hyperspectral Image Clustering Using Contrastive Learning, CIKM 21 Workshop

Spectral-spatial contrastive clustering (SSCC) Yaoming Cai, Yan Liu, Zijia Zhang, Zhihua Cai, and Xiaobo Liu, Large-scale Hyperspectral Image Clusteri

Yaoming Cai 4 Nov 02, 2022
An unreferenced image captioning metric (ACL-21)

UMIC This repository provides an unferenced image captioning metric from our ACL 2021 paper UMIC: An Unreferenced Metric for Image Captioning via Cont

hwanheelee 14 Nov 20, 2022
Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak.

DeepCreamPy Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak. A deep learning-based tool to automatically replace censored a

616 Jan 06, 2023
Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.

openpifpaf Continuously tested on Linux, MacOS and Windows: New 2021 paper: OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Te

VITA lab at EPFL 50 Dec 29, 2022
a reccurrent neural netowrk that when trained on a peice of text and fed a starting prompt will write its on 250 character text using LSTM layers

RNN-Playwrite a reccurrent neural netowrk that when trained on a peice of text and fed a starting prompt will write its on 250 character text using LS

Arno Barton 1 Oct 29, 2021
Fake videos detection by tracing the source using video hashing retrieval.

Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos 🎉️ 📜 Directory Introduction VTL Trace Samples and Acc of Hash

56 Dec 22, 2022
Code release of paper "Deep Multi-View Stereo gone wild"

Deep MVS gone wild Pytorch implementation of "Deep MVS gone wild" (Paper | website) This repository provides the code to reproduce the experiments of

François Darmon 53 Dec 24, 2022
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting

Autoformer (NeurIPS 2021) Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting Time series forecasting is a c

THUML @ Tsinghua University 847 Jan 08, 2023
基于PaddleClas实现垃圾分类,并转换为inference格式用PaddleHub服务端部署

百度网盘链接及提取码: 链接:https://pan.baidu.com/s/1HKpgakNx1hNlOuZJuW6T1w 提取码:wylx 一个垃圾分类项目带你玩转飞桨多个产品(1) 基于PaddleClas实现垃圾分类,导出inference模型并利用PaddleHub Serving进行服务

thomas-yanxin 22 Jul 12, 2022
A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

Biomedical Computer Vision @ Uniandes 52 Dec 19, 2022
A repo for Causal Imitation Learning under Temporally Correlated Noise

CausIL A repo for Causal Imitation Learning under Temporally Correlated Noise. Running Experiments To re-train an expert, run: python experts/train_ex

Gokul Swamy 5 Nov 01, 2022
A code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Vanderhaeghe, and Yotam Gingold from SIGGRAPH Asia 2020.

A Benchmark for Rough Sketch Cleanup This is the code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Va

33 Dec 18, 2022
Improving Calibration for Long-Tailed Recognition (CVPR2021)

MiSLAS Improving Calibration for Long-Tailed Recognition Authors: Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia [arXiv] [slide] [BibTeX] Introductio

DV Lab 116 Dec 20, 2022
Free Book about Deep-Learning approaches for Chess (like AlphaZero, Leela Chess Zero and Stockfish NNUE)

Free Book about Deep-Learning approaches for Chess (like AlphaZero, Leela Chess Zero and Stockfish NNUE)

Dominik Klein 189 Dec 21, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
This is the repository for Learning to Generate Piano Music With Sustain Pedals

SusPedal-Gen This is the official repository of Learning to Generate Piano Music With Sustain Pedals Demo Page Dataset The dataset used in this projec

Joann Ching 12 Sep 02, 2022
Convert scikit-learn models to PyTorch modules

sk2torch sk2torch converts scikit-learn models into PyTorch modules that can be tuned with backpropagation and even compiled as TorchScript. Problems

Alex Nichol 101 Dec 16, 2022
Discord-Protect is a simple discord bot allowing you to have some security on your discord server by ordering a captcha to the user who joins your server.

Discord-Protect Discord-Protect is a simple discord bot allowing you to have some security on your discord server by ordering a captcha to the user wh

Tir Omar 2 Oct 28, 2021
Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

CSRL Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning Python: 3

4 Apr 14, 2022