Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Last update: Nov 11, 2022

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Change into the state-SAC-LFF directory.

cd state-SAC-LFF

Install the dependencies from the provided environment.yml file:

conda env create -f environment.yml

To run an experiment, use the following templates for the MLP baseline and the LFF experiments, respectively:

python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 3
python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 2 \
               --network_class FourierMLP --sigma 0.001 --fourier_dim 1024 --train_B --concatenate_fourier

The LFF run differs from the MLP baseline only in the number of hidden layers (n_hidden, reduced by 1 to keep the parameter count roughly the same), the network_class, and the Fourier-specific flags fourier_dim, sigma, train_B, and concatenate_fourier.
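As a rough illustration of what these flags plausibly control, the sketch below implements a Fourier feature map of the form [sin(Bx), cos(Bx)] in plain Python, with B drawn from a zero-mean Gaussian with standard deviation sigma, and then compares parameter counts for the two templates. It is a sketch under stated assumptions, not the repository's FourierMLP. The helper names (make_B, fourier_features, mlp_params), the 64-dimensional state, the scalar output, the reading of fourier_dim as the total number of sin and cos features, and the reading of n_hidden as the number of hidden layers are all assumptions made for illustration.

```python
import math
import random


def make_B(input_dim, fourier_dim, sigma, seed=0):
    """Sample a projection matrix B with entries drawn from N(0, sigma^2).

    Assumption: B has fourier_dim // 2 rows, so the sin and cos halves
    together give fourier_dim features.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(input_dim)]
            for _ in range(fourier_dim // 2)]


def fourier_features(x, B, concatenate=True):
    """Map x to [sin(Bx), cos(Bx)], optionally followed by x itself."""
    proj = [sum(b * xi for b, xi in zip(row, x)) for row in B]
    feats = [math.sin(p) for p in proj] + [math.cos(p) for p in proj]
    return feats + list(x) if concatenate else feats


def mlp_params(sizes):
    """Count weights and biases of a fully connected net with these layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Hypothetical sizes for illustration only: 64-dim state, scalar output.
state_dim, hidden, fourier_dim, out_dim = 64, 1024, 1024, 1

# Baseline template: 3 hidden layers acting directly on the state.
baseline = mlp_params([state_dim, hidden, hidden, hidden, out_dim])

# LFF template: B, then 2 hidden layers on the Fourier features with the
# raw state concatenated (the assumed meaning of --concatenate_fourier).
b_params = (fourier_dim // 2) * state_dim
lff = b_params + mlp_params([fourier_dim + state_dim, hidden, hidden, out_dim])

print("baseline parameters:", baseline)
print("LFF parameters:     ", lff)
print("ratio:              ", round(lff / baseline, 4))

# Feature map on a toy input: 8 Fourier features plus the 3 raw inputs.
B = make_B(input_dim=3, fourier_dim=8, sigma=0.001)
print(len(fourier_features([0.1, -0.2, 0.3], B)))
```

Under these assumptions the two networks differ by about 1.5% in parameter count, which is consistent with dropping one hidden layer to offset the added Fourier layer. With train_B set, B would presumably be updated by gradient descent rather than kept at its initial sample.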

Image-space Soft Actor-Critic Experiments

Change into the image-SAC-LFF directory.

cd image-SAC-LFF

Install RAD dependencies:

conda env create -f conda_env.yml

To run an experiment, use the following templates for the CNN+LFF and the CNN baseline experiments, respectively:

python train.py --domain_name hopper --task_name hop --encoder_type fourier_pixel --action_repeat 4 \
                --num_eval_episodes 10 --pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000 --fourier_dim 128 --sigma 0.1 --train_B --concatenate_fourier
python train.py --domain_name hopper --task_name hop --encoder_type fair_pixel --action_repeat 4 \
                --num_eval_episodes 10 --pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000
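
The two commands above share most of their flags. The sketch below, which is not part of the repository, rebuilds both argument lists from one shared list so that the difference is explicit: the encoder_type value and the four Fourier flags. The helper name build_command and the list names are hypothetical; the flag names and values are copied from the commands above.

```python
import shlex

# Flags that appear before and after --encoder_type in both commands.
HEAD = ["python", "train.py", "--domain_name", "hopper", "--task_name", "hop"]
SHARED = [
    "--action_repeat", "4", "--num_eval_episodes", "10",
    "--pre_transform_image_size", "100", "--image_size", "84",
    "--agent", "rad_sac", "--frame_stack", "3", "--data_augs", "crop",
    "--critic_lr", "1e-3", "--actor_lr", "1e-3", "--eval_freq", "10000",
    "--batch_size", "128", "--num_train_steps", "1000000",
]
# Flags that appear only in the fourier_pixel command.
LFF_ONLY = ["--fourier_dim", "128", "--sigma", "0.1",
            "--train_B", "--concatenate_fourier"]


def build_command(use_lff):
    """Return the argument list for the fourier_pixel or fair_pixel run."""
    encoder = "fourier_pixel" if use_lff else "fair_pixel"
    args = HEAD + ["--encoder_type", encoder] + SHARED
    if use_lff:
        args = args + LFF_ONLY
    return args


for use_lff in (True, False):
    # shlex.join quotes arguments safely for pasting into a shell.
    print(shlex.join(build_command(use_lff)))
```

Each list can also be passed directly to subprocess.run from inside the image-SAC-LFF directory, which avoids shell line-continuation issues altogether.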

Proximal Policy Optimization Experiments

Change into the state-PPO-LFF directory.

cd pytorch-a2c-ppo-acktr-gail

Install PPO dependencies:

conda env create -f environment.yml

To run an experiment, use the following templates for the MLP baseline and the LFF experiments, respectively:

python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class MLP --n_hidden 2 --seed 10
python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class FourierMLP --n_hidden 2 --sigma 0.01 --fourier_dim 64 \
               --concatenate_fourier --train_B --seed 10

Acknowledgements

We built the state-based SAC codebase off the TD3 repo by Fujimoto et al. We especially appreciated its lightweight bare-bones training loop. For the state-based SAC algorithm implementation and hyperparameters, we used this PyTorch SAC repo by Yarats and Kostrikov. For the SAC+RAD image-based experiments, we used the authors' implementation. Finally, we built off this PPO codebase by Ilya Kostrikov.

Owner

Alex Li
