The Empirical Investigation of Representation Learning for Imitation (EIRLI)

Related tags

Deep Learningeirli
Overview

The Empirical Investigation of Representation Learning for Imitation (EIRLI)

Documentation status Dataset download link

Over the past handful of years, representation learning has exploded as a subfield, and, with it have come a plethora of new methods, each slightly different from the other.

Our Empirical Investigation of Representation Learning for Imitation (EIRLI) has two main goals:

  1. To create a modular algorithm definition system that allows researchers to easily pick and choose from a wide array of commonly used design axes
  2. To facilitate testing of representations within the context of sequential learning, particularly imitation learning and offline reinforcement learning

Common Use Cases

Do you want to…

  • Reproduce our results? You can find scripts and instructions here to help reproduce our benchmark results.
  • Design and experiment with a new representation learning algorithm using our modular components? You can find documentation on that here
  • Use our algorithm definitions in a setting other than sequential learning? The base example here demonstrates this simplified use case

Otherwise, you can see our full ReadTheDocs documentation here.

Modular Algorithm Design

This library was designed in a way that breaks down the definition of a representation learning algorithm into several key parts. The intention was that this system be flexible enough many commonly used algorithms can be defined through different combinations of these modular components.

The design relies on the central concept of a "context" and a "target". In very rough terms, all of our algorithms work by applying some transformation to the context, some transformation to the target, and then calculating a loss as a function of those two transformations. Sometimes an extra context object is passed in

Some examples are:

  • In SimCLR, the context and target are the same image frame, and augmentation and then encoding is applied to both context and target. That learned representation is sent through a decoder, and then the context and target representations are pulled together with a contrastive loss.
  • In TemporalCPC, the context is a frame at time t, and the target a frame at time t+k, and then, similarly to SimCLR above, augmentation is applied to the frame before it's put through an encoder, and the two resulting representations pulled together
  • In a Variational Autoencoder, the context and target are the same image frame. An bottleneck encoder and then a reconstructive decoder are applied to the context, and this reconstructed context is compared to the target through a L2 pixel loss
  • A Dynamics Prediction model can be seen as an conceptual combination of an autoencoder (which tries to predict the current full image frame) and TemporalCPC, which predicts future information based on current information. In the case of a Dynamics model, we predict a future frame (the target) given the current frame (context) and an action as extra context.

This abstraction isn't perfect, but we believe it is coherent enough to allow for a good number of shared mechanisms between algorithms, and flexible enough to support a wide variety of them.

The modular design mentioned above is facilitated through the use of a number of class interfaces, each of which handles a different component of the algorithm. By selecting different implementations of these shared interfaces, and creating a RepresentationLearner that takes them as arguments, and handles the base machinery of performing transformations.

A diagram showing how these components made up a training pipeline for our benchmark

  1. TargetPairConstructer - This component takes in a set of trajectories (assumed to be iterators of dicts containing 'obs' and optional 'acts', and 'dones' keys) and creates a dataset of (context, target, optional extra context) pairs that will be shuffled to form the training set.
  2. Augmenter - This component governs whether either or both of the context and target objects are augmented before being passed to the encoder. Note that this concept only meaningfully applies when the object being augmented is an image frame.
  3. Encoder - The encoder is responsible for taking in an image frame and producing a learned vector representation. It is optionally chained with a Decoder to produce the input to the loss function (which may be a reconstructed image in the case of VAE or Dynamics, or may be a projected version of the learned representation in the case of contrastive methods like SimCLR that use a projection head)
  4. Decoder - As mentioned above, the Decoder acts as a bridge between the representation in the form you want to use for transfer, and whatever input is required your loss function, which is often some transformation of that canonical representation.
  5. BatchExtender - This component is used for situations where you want to calculate loss on batch elements that are not part of the batch that went through your encoder and decoder on this step. This is centrally used for contrastive methods that use momentum, since in that case, you want to use elements from a cached store of previously-calculated representations as negatives in your contrastive loss
  6. LossCalculator - This component takes in the transformed context and transformed target and handles the loss calculation, along with any transformations that need to happen as a part of that calculation.

Training Scripts

In addition to machinery for constructing algorithms, the repo contains a set of Sacred-based training scripts for testing different Representation Learning algorithms as either pretraining or joint training components within an imitation learning pipeline. These are likeliest to be a fit for your use case if you want to reproduce our results, or train models in similar settings

Owner
Center for Human-Compatible AI
CHAI seeks to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems.
Center for Human-Compatible AI
Controlling a game using mediapipe hand tracking

These scripts use the Google mediapipe hand tracking solution in combination with a webcam in order to send game instructions to a racing game. It features 2 methods of control

3 May 17, 2022
Net2net - Network-to-Network Translation with Conditional Invertible Neural Networks

Net2Net Code accompanying the NeurIPS 2020 oral paper Network-to-Network Translation with Conditional Invertible Neural Networks Robin Rombach*, Patri

CompVis Heidelberg 206 Dec 20, 2022
Modeling CNN layers activity with Gaussian mixture model

GMM-CNN This code package implements the modeling of CNN layers activity with Gaussian mixture model and Inference Graphs visualization technique from

3 Aug 05, 2022
Official repository for: Continuous Control With Ensemble DeepDeterministic Policy Gradients

Continuous Control With Ensemble Deep Deterministic Policy Gradients This repository is the official implementation of Continuous Control With Ensembl

4 Dec 06, 2021
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least com

Nikolas B Virionis 2 Aug 01, 2022
Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Code for the ToeplitzLDA classifier proposed in here. The classifier conforms sklearn and can be used as a drop-in replacement for other LDA classifiers. For in-depth usage refer to the learning from

Jan Sosulski 5 Nov 07, 2022
Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch

Semantic Segmentation Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch Features Applicable to followin

sithu3 530 Jan 05, 2023
Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation"

Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation", if you find this useful and use

57 Dec 27, 2022
GAN-generated image detection based on CNNs

GAN-image-detection This repository contains a GAN-generated image detector developed to distinguish real images from synthetic ones. The detector is

Image and Sound Processing Lab 17 Dec 15, 2022
PyTorch implementation of saliency map-aided GAN for Auto-demosaic+denosing

Saiency Map-aided GAN for RAW2RGB Mapping The PyTorch implementations and guideline for Saiency Map-aided GAN for RAW2RGB Mapping. 1 Implementations B

Yuzhi ZHAO 20 Oct 24, 2022
Fast and customizable reconnaissance workflow tool based on simple YAML based DSL.

Fast and customizable reconnaissance workflow tool based on simple YAML based DSL, with support of notifications and distributed workload of that work

Américo Júnior 3 Mar 11, 2022
AirCode: A Robust Object Encoding Method

AirCode This repo contains source codes for the arXiv preprint "AirCode: A Robust Object Encoding Method" Demo Object matching comparison when the obj

Chen Wang 30 Dec 09, 2022
Code for Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights

Piggyback: https://arxiv.org/abs/1801.06519 Pretrained masks and backbones are available here: https://uofi.box.com/s/c5kixsvtrghu9yj51yb1oe853ltdfz4q

Arun Mallya 165 Nov 22, 2022
Learning Synthetic Environments and Reward Networks for Reinforcement Learning

Learning Synthetic Environments and Reward Networks for Reinforcement Learning We explore meta-learning agent-agnostic neural Synthetic Environments (

AutoML-Freiburg-Hannover 16 Sep 02, 2022
Self-supervised Deep LiDAR Odometry for Robotic Applications

DeLORA: Self-supervised Deep LiDAR Odometry for Robotic Applications Overview Paper: link Video: link ICRA Presentation: link This is the correspondin

Robotic Systems Lab - Legged Robotics at ETH Zürich 181 Dec 29, 2022
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

36 Sep 13, 2022
《A-CNN: Annularly Convolutional Neural Networks on Point Clouds》(2019)

A-CNN: Annularly Convolutional Neural Networks on Point Clouds Created by Artem Komarichev, Zichun Zhong, Jing Hua from Department of Computer Science

Artёm Komarichev 44 Feb 24, 2022
I tried to apply the CAM algorithm to YOLOv4 and it worked.

YOLOV4:You Only Look Once目标检测模型在pytorch当中的实现 2021年2月7日更新: 加入letterbox_image的选项,关闭letterbox_image后网络的map得到大幅度提升。 目录 性能情况 Performance 实现的内容 Achievement

55 Dec 05, 2022
Python package to add text to images, textures and different backgrounds

nider Python package for text images generation and watermarking Free software: MIT license Documentation: https://nider.readthedocs.io. nider is an a

Vladyslav Ovchynnykov 131 Dec 30, 2022
DM-ACME compatible implementation of the Arm26 environment from Mujoco

ACME-compatible implementation of Arm26 from Mujoco This repository contains a customized implementation of Mujoco's Arm26 model, that can be used wit

1 Dec 24, 2021