AdaOPS

An implementation of AdaOPS (Adaptive Online Packing-guided Search), an online POMDP solver for problems defined with the POMDPs.jl generative interface. The AdaOPS paper was published at NeurIPS 2021.

If you are trying to use this package and require more documentation, please file an issue!

Installation

Press the ] key in the Julia REPL to enter package management mode. Then, run the following commands.

pkg> add "POMDPs"
pkg> registry add "https://github.com/JuliaPOMDP/Registry.git"
pkg> add AdaOPS

Usage

using POMDPs, POMDPModels, POMDPSimulators, AdaOPS

pomdp = TigerPOMDP()

solver = AdaOPSSolver(bounds=IndependentBounds(-20.0, 0.0))
planner = solve(solver, pomdp)

for (s, a, o) in stepthrough(pomdp, planner, "s,a,o", max_steps=10)
    println("State was $s,")
    println("action $a was taken,")
    println("and observation $o was received.\n")
end

For minimal examples of problem implementations, see this notebook and the POMDPs.jl generative docs.

Solver Options

Solver options are listed in the AdaOPSSolver docstring, which can be accessed through Julia's built-in documentation system (or read directly in the solver source code). Each option has its own docstring and can be set with a keyword argument in the AdaOPSSolver constructor.

Belief Packing

delta

A δ-packing of observation branches will be generated, i.e., the belief nodes with L1 distance less than delta are merged.
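The merging rule can be illustrated with a minimal, self-contained sketch. It assumes each observation branch is represented by a normalized weight vector over a shared set of particles, and it uses a simple greedy pass; l1_distance and delta_packing are hypothetical helper names for illustration, not part of the AdaOPS API, and the package's own procedure may differ.

```julia
# L1 distance between two normalized weight vectors over the same particles.
l1_distance(w1::Vector{Float64}, w2::Vector{Float64}) = sum(abs.(w1 .- w2))

# Greedy delta-packing (illustrative only): a branch becomes a new representative
# if it is at least `delta` away from every representative kept so far; otherwise
# it is merged into the first representative closer than `delta`.
# Returns the representative indices and, for each branch, the index of the
# representative it was assigned to.
function delta_packing(branches::Vector{Vector{Float64}}, delta::Float64)
    representatives = Int[]
    assignment = zeros(Int, length(branches))
    for (i, w) in enumerate(branches)
        j = findfirst(r -> l1_distance(branches[r], w) < delta, representatives)
        if j === nothing
            push!(representatives, i)
            assignment[i] = i
        else
            assignment[i] = representatives[j]
        end
    end
    return representatives, assignment
end

# Three branches: the first two are close to each other, the third is far away.
branches = [[0.5, 0.5, 0.0], [0.45, 0.55, 0.0], [0.0, 0.1, 0.9]]
reps, assignment = delta_packing(branches, 0.3)
```

With a larger delta, more branches fall within range of an existing representative, so fewer belief nodes are kept.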

Adaptive Particle Filter

The core idea of the adaptive particle filter is that it can change the number of particles adaptively and use more particles to estimate the belief when needed.

grid

grid is used to split the state space into multidimensional bins, so that KLD-Sampling can estimate the number of particles needed from the number of bins occupied. First, implement a function that converts a state to a multidimensional vector, i.e., Base.convert(::Type{SVector{D, Float64}}, ::S), where D is the dimension of the resulting vector. Then, define a StateGrid to discretize the state space. A StateGrid consists of a vector of cutpoints for each dimension, and these cutpoints divide the whole space into small tiles. In each dimension, the grid is made up of intervals whose endpoints are cutpoints; each interval is left-closed and right-open, with the exception of the left-most interval. For example, a StateGrid can be defined as StateGrid([dim1_cutpoints], [dim2_cutpoints], [dim3_cutpoints]). All states that lie in the same tile are treated as identical. Given the number of tiles (bins) occupied, the number of particles can be estimated using KLD-Sampling.
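To make the binning concrete, here is a small sketch in plain Julia that does not use the package's StateGrid type. The helpers interval_index, tile and occupied_bins are hypothetical names, and treating everything below the first cutpoint as one unbounded interval is an assumption made for illustration.

```julia
# Hypothetical helpers; this is not the package's StateGrid implementation.
# Index of the interval containing x: 0 if x is below the first cutpoint,
# i if cutpoints[i] <= x < cutpoints[i+1], and length(cutpoints) if x is at
# or above the last cutpoint. Cutpoints must be sorted in ascending order.
interval_index(cutpoints::Vector{Float64}, x::Float64) = searchsortedlast(cutpoints, x)

# Map a state vector to its tile: one interval index per dimension.
tile(grid::Vector{Vector{Float64}}, v::Vector{Float64}) =
    Tuple(interval_index(c, x) for (c, x) in zip(grid, v))

# Number of distinct tiles (bins) occupied by a set of particles.
occupied_bins(grid, particles) = length(Set(tile(grid, p) for p in particles))

# Cutpoints for a two-dimensional state space.
grid = [[0.0, 1.0, 2.0], [0.0, 5.0]]

# The first two particles share a tile; the other two fall in different tiles.
particles = [[0.2, 1.0], [0.9, 4.9], [1.0, 1.0], [-3.0, 7.5]]
n_bins = occupied_bins(grid, particles)
```

The count of occupied tiles is the quantity that KLD-Sampling uses to decide how many particles a belief needs.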

max_occupied_bins

max_occupied_bins is the maximum number of bins occupied by a belief. Normally, it is exactly the grid size. However, in some domains, such as Roomba, only states within the room are accessible, so the bins outside the room will never be occupied.

min_occupied_bins

min_occupied_bins is the minimum number of bins occupied by a belief. Normally, it defaults to 2. A belief occupying min_occupied_bins tiles will be estimated with m_min particles. Increasing min_occupied_bins means that a belief needs to occupy more bins in order to be estimated with the same number of particles.

m_min

m_min is the minimum number of particles used for approximating beliefs.

m_max

m_max is the maximum number of particles used for approximating a belief. Normally, m_max is set large enough that KLD-Sampling determines the number of particles used. When KLD-Sampling is disabled, i.e., grid=StateGrid(), m_max particles will be sampled during resampling.

zeta

zeta is the target error when estimating a belief. Specifically, KLD-Sampling is used to calculate the number of particles needed, where zeta is the target Kullback-Leibler divergence between the estimated belief and the true belief. In AdaOPS, zeta is automatically adjusted according to the minimum number of bins occupied, such that the minimum number of particles suggested by KLD-Sampling is exactly m_min.
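The relationship between zeta, the number of occupied bins, m_min and m_max can be sketched with the standard KLD-Sampling bound, which uses the Wilson-Hilferty approximation of the chi-square quantile. This is an illustration under stated assumptions, not the package's code: the function names are hypothetical, the normal quantile z = 2.326 (roughly the 99% quantile) is an assumed value, and the formula AdaOPS uses internally may differ in detail.

```julia
# Wilson-Hilferty approximation to the chi-square quantile with k - 1 degrees
# of freedom, where k >= 2 is the number of occupied bins and z is a
# standard-normal quantile (z = 2.326 is an assumed value).
function chi_square_quantile(k::Int; z::Float64=2.326)
    d = k - 1
    a = 2 / (9 * d)
    return d * (1 - a + sqrt(a) * z)^3
end

# Real-valued particle count suggested by KLD-Sampling for target divergence zeta.
kld_estimate(k::Int, zeta::Float64; z::Float64=2.326) =
    chi_square_quantile(k; z=z) / (2 * zeta)

# Choose zeta so that a belief occupying min_bins bins is assigned m_min particles.
calibrate_zeta(min_bins::Int, m_min::Int; z::Float64=2.326) =
    chi_square_quantile(min_bins; z=z) / (2 * m_min)

# Particle count for a belief occupying k bins, clamped to [m_min, m_max].
# k is raised to at least 2 because the bound is undefined for a single bin.
function num_particles(k::Int, zeta::Float64, m_min::Int, m_max::Int)
    n = ceil(Int, kld_estimate(max(k, 2), zeta))
    return clamp(n, m_min, m_max)
end

zeta = calibrate_zeta(2, 30)          # min_occupied_bins = 2, m_min = 30
n_spread = num_particles(10, zeta, 30, 200)   # a belief spread over 10 bins
n_capped = num_particles(1000, zeta, 30, 200) # a very diffuse belief hits m_max
```

Under this sketch, a belief that occupies more bins is assigned more particles, and m_max only takes effect for very diffuse beliefs.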

Bounds

Dependent bounds

The bound passed into AdaOPSSolver can be a function of the form lower_bound, upper_bound = f(pomdp, wpf_belief), or any other object for which an AdaOPS.bounds(obj::OBJECT, pomdp::POMDP, b::WPFBelief, max_depth::Int, bounds_warning::Bool) function is implemented.

Independent bounds

In most cases, the recommended way to specify bounds is with an IndependentBounds object, i.e.

AdaOPSSolver(bounds=IndependentBounds(lower, upper))

where lower and upper are each a number, a function, or some other object (see below).

Often, the lower bound is calculated with a default policy; this can be accomplished using a PORollout, FORollout or RolloutEstimator. For in-depth details, please refer to BasicPOMCP. Note that when mixing the rollout structs from this package with those from BasicPOMCP, you should prefix the struct name with the module name.

Both the lower and upper bounds can be initialized with value estimates using a FOValue or POValue. FOValue supports any offline MDP Solver or Policy. POValue supports any offline POMDP Solver or Policy.

If lower or upper is a function, it should handle two arguments. The first is the POMDP object and the second is the WPFBelief. To access the state particles in a WPFBelief b, use particles(b). To access the corresponding weights of particles in a WPFBelief b, use weights(b). All AbstractParticleBelief APIs are supported for WPFBelief. More details can be found in the solver source code.

If an object o is passed in, AdaOPS.bound(o, pomdp::POMDP, b::WPFBelief, max_depth::Int) will be called.

In most cases, the check_terminal and consistency_fix_thresh keyword arguments of IndependentBounds should be used to add robustness (see the IndependentBounds docstring for more info). When using rollout-based bounds, you can specify the max_depth keyword argument to set the maximum depth of the rollout.

Example

For the BabyPOMDP from POMDPModels, bounds setup might look like this:

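using POMDPModels
using POMDPPolicies
using BasicPOMCP
using AdaOPS

always_feed = FunctionPolicy(b->true)
# With both BasicPOMCP and AdaOPS loaded, prefix the rollout struct with its module name.
lower = AdaOPS.FORollout(always_feed)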

function upper(pomdp::BabyPOMDP, b::WPFBelief)
    if all(s==true for s in particles(b)) # all particles are hungry
        return pomdp.r_hungry # the baby is hungry this time, but then becomes full magically and stays that way forever
    else
        return 0.0 # the baby magically stays full forever
    end
end

solver = AdaOPSSolver(bounds=IndependentBounds(lower, upper))

Visualization

D3Trees.jl can be used to visualize the search tree, for example:

using POMDPs, POMDPModels, POMDPModelTools, D3Trees, AdaOPS

pomdp = TigerPOMDP()

solver = AdaOPSSolver(bounds=(-20.0, 0.0), tree_in_info=true)
planner = solve(solver, pomdp)
b0 = initialstate(pomdp)

a, info = action_info(planner, b0)
inchrome(D3Tree(info[:tree], init_expand=5))

This will create an interactive tree.

Analysis

Two utilities, info_analysis and hist_analysis, are provided for getting a sense of how the algorithm is working. info_analysis takes the information returned from action_info(planner, b0). It first visualizes the tree if the tree_in_info option is turned on, and then shows statistics such as the number of nodes expanded, total explorations, and average observation branches. hist_analysis takes the hist from the HistoryRecorder simulator. It shows statistics similar to those of info_analysis, but in the form of figures. Note that HistoryRecorder stores the tree of every step, which makes it memory-intensive. An example follows.

using POMDPs, AdaOPS, RockSample, POMDPSimulators, ParticleFilters, POMDPModelTools

m = RockSamplePOMDP(11, 11)

b0 = initialstate(m)
s0 = rand(b0)

bound = AdaOPS.IndependentBounds(FORollout(RSExitSolver()), FOValue(RSMDPSolver()), check_terminal=true, consistency_fix_thresh=1e-5)

solver = AdaOPSSolver(bounds=bound,
                        delta=0.3,
                        m_min=30,
                        m_max=200,
                        tree_in_info=true,
                        num_b=10_000
                        )

adaops = solve(solver, m)
a, info = action_info(adaops, b0)
info_analysis(info)

num_particles = 30000
@time hist = simulate(HistoryRecorder(max_steps=90), m, adaops, SIRParticleFilter(m, num_particles), b0, s0)
hist_analysis(hist)
@show undiscounted_reward(hist)

Reference

@inproceedings{wu2021adaptive,
  title={Adaptive Online Packing-guided Search for POMDPs},
  author={Wu, Chenyang and Yang, Guoyu and Zhang, Zongzhang and Yu, Yang and Li, Dong and Liu, Wulong and others},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}