Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

This repository hosts the code related to the paper:

Marco Rosano, Antonino Furnari, Luigi Gulino, Corrado Santoro and Giovanni Maria Farinella, "Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation". Submitted to "Robotics and Autonomous Systems" (RAS), 2022.

For more details please see the project web page at https://iplab.dmi.unict.it/EmbodiedVN.

Overview

This code is built on top of the Habitat-api/Habitat-lab project. Please see the Habitat project page for more details.

This repository provides the following components:

  1. The implementation of the proposed tool, integrated with Habitat, to train visual navigation models on synthetic observations and test them on realistic episodes containing real-world images. This allows real-world performance to be estimated without physically deploying the robotic agent.

  2. The official PyTorch implementation of the proposed visual navigation models, which follow different strategies to combine a range of visual mid-level representations.

  3. The synthetic 3D model of the proposed environment, acquired using the Matterport 3D scanner and used to perform the navigation episodes at train and test time.

  4. The photorealistic 3D model that contains real-world images of the proposed environment, labeled with their pose (X, Z, Angle). The sparse 3D reconstruction was performed with the COLMAP Structure-from-Motion tool and then aligned with the Matterport virtual 3D map.

  5. An integration with CycleGAN to train and evaluate navigation models with Habitat on sim2real adapted images.

  6. The checkpoints of the best performing navigation models.
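
To illustrate how pose-labeled real images can stand in for simulated observations, here is a minimal Python sketch that picks the real image whose pose is closest to the agent's pose. It is an illustration under stated assumptions, not the repository's implementation: the `RealImage` record, the `nearest_real_image` helper, the angle unit (degrees) and the `angle_weight` trade-off are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RealImage:
    """Hypothetical record: one real-world image and its (X, Z, Angle) pose."""
    filename: str
    x: float
    z: float
    angle_deg: float  # assumed to be expressed in degrees


def angle_diff_deg(a, b):
    """Smallest absolute difference between two headings, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_real_image(images, x, z, angle_deg, angle_weight=0.01):
    """Return the image minimizing position distance plus weighted heading error.

    angle_weight (distance units per degree) is an arbitrary illustrative choice.
    """
    def cost(img):
        position_error = math.hypot(img.x - x, img.z - z)
        heading_error = angle_diff_deg(img.angle_deg, angle_deg)
        return position_error + angle_weight * heading_error

    return min(images, key=cost)


# Made-up poses, only to exercise the lookup.
images = [
    RealImage("a.jpg", 0.0, 0.0, 0.0),
    RealImage("b.jpg", 0.0, 0.0, 180.0),
    RealImage("c.jpg", 2.0, 1.0, 350.0),
]
match = nearest_real_image(images, x=1.9, z=1.1, angle_deg=5.0)
print(match.filename)
```

The heading difference wraps around 360 degrees, so an agent heading of 5 and an image heading of 350 count as 15 degrees apart rather than 345.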

Installation

Requirements

  • Python >= 3.7 (version 3.7 is recommended to avoid possible issues).
  • Other requirements will be installed via pip in the following steps.

Steps

  1. (Optional) Create an Anaconda environment and install everything in it (conda create -n fusion-habitat python=3.7; conda activate fusion-habitat).

  2. Install the Habitat simulator following the official repo instructions. Development and testing were done on commit bfbe9fc30a4e0751082824257d7200ad543e4c0e, with the simulator installed "from source" using the ./build.sh --headless --with-cuda command (guide). Consider following these suggestions if you encounter issues while installing the simulator.

  3. Install the customized Habitat-lab (this repo):

    git clone https://github.com/rosanom/mid-level-fusion-nav.git
    cd mid-level-fusion-nav/
    pip install -r requirements.txt
    python setup.py develop --all # install habitat and habitat_baselines
    
  4. Download our dataset (journal version) from here, and extract it to the repository folder (mid-level-fusion-nav/). Inside the data folder you should see this structure:

    datasets/pointnav/orangedev/v1/...
    real_images/orangedev/...
    scene_datasets/orangedev/...
    orangedev_checkpoints/...
    
  5. (Optional, to check if the software works properly) Download the test scenes data and extract the zip file to the repository folder (mid-level-fusion-nav/). To verify that the tool was successfully installed, run python examples/benchmark.py or python examples/example.py.
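
As a quick sanity check after extracting the dataset, the following Python sketch reports which of the folders listed in step 4 are missing. The helper name `missing_data_dirs` is hypothetical, and the script assumes it is run from the repository folder (mid-level-fusion-nav/).

```python
from pathlib import Path

# Sub-folders listed in the installation steps, relative to the data folder.
EXPECTED_SUBDIRS = [
    "datasets/pointnav/orangedev/v1",
    "real_images/orangedev",
    "scene_datasets/orangedev",
    "orangedev_checkpoints",
]


def missing_data_dirs(data_root):
    """Return the expected sub-folders that are absent under data_root."""
    root = Path(data_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    # Assumes the current working directory is the repository folder.
    missing = missing_data_dirs("data")
    if missing:
        print("Missing folders under data/:")
        for sub in missing:
            print("  " + sub)
    else:
        print("Data folder layout looks complete.")
```

This only checks that the folders exist; it does not validate their contents.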

Data Structure

All data can be found inside the mid-level-fusion-nav/data/ folder:

  • the datasets/pointnav/orangedev/v1/... folder contains the generated train and validation navigation episode files;
  • the real_images/orangedev/... folder contains the real-world images of the proposed environment and the CSV file with their pose information (obtained with COLMAP);
  • the scene_datasets/orangedev/... folder contains the 3D mesh of the proposed environment;
  • the orangedev_checkpoints/ folder is where checkpoints are saved during training. Place a checkpoint file here if you want to resume training or evaluate a model. The system will load the most recent checkpoint file.
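
To see which checkpoint would count as the most recent one, a sketch like the following can help. It assumes that "most recent" means latest modification time and that checkpoint files end in `.pth`; both are assumptions, and `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def latest_checkpoint(ckpt_dir="data/orangedev_checkpoints", pattern="*.pth"):
    """Return the most recently modified checkpoint file, or None if none exist.

    The "*.pth" pattern and the use of modification time are assumptions.
    """
    files = [p for p in Path(ckpt_dir).glob(pattern) if p.is_file()]
    if not files:
        return None
    return max(files, key=lambda p: p.stat().st_mtime)


# Prints None when the folder is empty or does not exist.
print(latest_checkpoint())
```

Note that copying a file usually updates its modification time, so a freshly copied older checkpoint may be treated as the newest one under this assumption.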

Config Files

There are two configuration files:

habitat_domain_adaptation/configs/tasks/pointnav_orangedev.yaml

and

habitat_domain_adaptation/habitat_baselines/config/pointnav/ddppo_pointnav_orangedev.yaml.

In the first file you can change the robot's properties, the sensors used by the agent and the dataset used in the experiment. It does not need to be modified to run the experiments.

In the second file you can decide:

  1. whether to evaluate the navigation models using RGB or mid-level representations;
  2. the set of mid-level representations to use;
  3. the fusion architecture to use;
  4. whether to train or evaluate the models using real images, or using the CycleGAN sim2real adapted observations.

The relevant settings look like this:

    ...
    EVAL_W_REAL_IMAGES: True
    EVAL_CKPT_PATH_DIR: "data/orangedev_checkpoints/"

    SIM_2_REAL: False # use CycleGAN for sim2real image adaptation?

    USE_MIDLEVEL_REPRESENTATION: True
    MIDLEVEL_PARAMS:
        ENCODER: "simple" # "simple", "SE_attention", "mid_fusion", ...
        FEATURE_TYPE: ["normal"] # ["normal", "keypoints3d", "curvature", "depth_zbuffer"]
    ...
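
Before launching a long training run, it can be useful to sanity-check the mid-level settings. The Python sketch below works on a plain dictionary that mirrors the excerpt above. The `check_midlevel_settings` helper is hypothetical, and treating the four feature names from the config comment as the complete set of valid values is an assumption.

```python
# Feature names copied from the comment in the config excerpt; treating them
# as the full set of valid values is an assumption.
KNOWN_FEATURE_TYPES = {"normal", "keypoints3d", "curvature", "depth_zbuffer"}


def check_midlevel_settings(cfg):
    """Return a list of human-readable problems found in the settings dict."""
    problems = []
    if not cfg.get("USE_MIDLEVEL_REPRESENTATION", False):
        return problems  # mid-level representations disabled: nothing to check
    params = cfg.get("MIDLEVEL_PARAMS", {})
    features = params.get("FEATURE_TYPE", [])
    if not features:
        problems.append("FEATURE_TYPE is empty")
    unknown = sorted(set(features) - KNOWN_FEATURE_TYPES)
    if unknown:
        problems.append("unknown feature types: " + ", ".join(unknown))
    if len(set(features)) != len(features):
        problems.append("FEATURE_TYPE contains duplicates")
    return problems


# Settings mirroring the excerpt from the configuration file.
settings = {
    "USE_MIDLEVEL_REPRESENTATION": True,
    "MIDLEVEL_PARAMS": {"ENCODER": "simple", "FEATURE_TYPE": ["normal"]},
}
print(check_midlevel_settings(settings))
```

The encoder name is deliberately not validated, because the list of available encoders in the config comment is elided.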

CycleGAN Integration (baseline)

To use CycleGAN with Habitat for sim2real domain adaptation during training or evaluation, follow the steps suggested in the repository of our previous release.

Train and Evaluation

To train the navigation model using the DD-PPO RL algorithm, run:

    sh habitat_baselines/rl/ddppo/single_node_orangedev.sh

To evaluate the navigation model using the DD-PPO RL algorithm, run:

    sh habitat_baselines/rl/ddppo/single_node_orangedev_eval.sh

For more information about the DD-PPO RL algorithm, please check out the habitat-lab DD-PPO repo page.

License

The code in this repository, the 3D models and the images of the proposed environment are MIT licensed. See the LICENSE file for details.

The trained models and the task datasets are considered data derived from the corresponding scene datasets.

Acknowledgements

This research is supported by OrangeDev s.r.l, by Next Vision s.r.l, the project MEGABIT - PIAno di inCEntivi per la RIcerca di Ateneo 2020/2022 (PIACERI) – linea di intervento 2, DMI - University of Catania, and the grant MIUR AIM - Attrazione e Mobilità Internazionale Linea 1 - AIM1893589 - CUP E64118002540007.

Owner
First Person Vision @ Image Processing Laboratory - University of Catania