SAAVN - Sound Adversarial Audio-Visual Navigation, ICLR 2022 (in PyTorch)

Last update: Aug 30, 2022

Overview

SAAVN

Code release for the paper "Sound Adversarial Audio-Visual Navigation" (ICLR 2022), implemented in PyTorch.

This code is still being cleaned up, so some bugs may occur. Please let me know if you run into any trouble.

Thanks

This code is based on the SoundSpaces code base.

Usage

This repo supports the AudioGoal task on the Replica and Matterport3D datasets.

Below are the commands for training and evaluating AudioGoal with a depth sensor on Replica; the same commands apply to the Matterport3D dataset.

Training

python main.py --default av_nav --run-type train --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0
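The trailing TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0 tokens read as KEY VALUE pairs that follow the named flags. The sketch below shows one way such a command line could be split into flags and overrides with argparse. It is an illustration only, not the repo's actual main.py: the function name parse_command and the choice to keep override values as strings are assumptions, and the file names in the usage example are placeholders.

```python
import argparse


def parse_command(argv):
    """Split a command line into named flags and trailing KEY VALUE overrides.

    Hypothetical helper for illustration; the real main.py may differ.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--default")
    parser.add_argument("--run-type", choices=["train", "eval"], required=True)
    parser.add_argument("--exp-config")
    parser.add_argument("--tag-config")
    parser.add_argument("--model-dir")
    # Everything left after the named flags is collected as overrides.
    parser.add_argument("opts", nargs="*")
    args = parser.parse_args(argv)

    if len(args.opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")

    # Pair up consecutive tokens: even positions are keys, odd are values.
    overrides = dict(zip(args.opts[0::2], args.opts[1::2]))
    return args, overrides


if __name__ == "__main__":
    args, overrides = parse_command([
        "--default", "av_nav",
        "--run-type", "train",
        "--exp-config", "exp.yaml",   # placeholder file name
        "--tag-config", "tag.yaml",   # placeholder file name
        "--model-dir", "data/models/replica/av_nav/e0000/audiogoal_depth",
        "TORCH_GPU_ID", "0",
        "SIMULATOR_GPU_ID", "0",
    ])
    print(args.run_type, overrides)
```

Under this reading, choosing a different GPU would only require changing the value after the corresponding key.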

Validation (evaluate each checkpoint and generate a validation curve)

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Test the best validation checkpoint, selected from the validation curve

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Generate demo video with audio

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Note: [exp_config_file] is the main parameter configuration file of the experiment, while [tag_config_file] is a special parameter configuration file for ablation experiments.
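One plausible way to combine the two files is to treat the experiment configuration as the base and let the ablation (tag) configuration override individual entries. The sketch below illustrates that layering with plain dictionaries. The merge order, the function name merge_configs, and every key shown are assumptions made for illustration; they are not taken from the repo.

```python
import copy


def merge_configs(base, override):
    """Return a new dict where values in `override` replace those in `base`.

    Nested dictionaries are merged recursively; other values are replaced.
    Neither input is modified.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical contents of an experiment config (all keys are made up).
    exp_config = {"NUM_UPDATES": 1000, "SENSORS": {"DEPTH": True, "RGB": False}}
    # Hypothetical contents of an ablation config changing one setting.
    tag_config = {"SENSORS": {"RGB": True}}

    print(merge_configs(exp_config, tag_config))
```

Keeping ablation settings in a separate file this way means the main experiment file stays unchanged across ablation runs.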

Citation

If you use this model in your research, please cite the following paper:

@inproceedings{YinfengICLR2022saavn,
	title = {Sound Adversarial Audio-Visual Navigation},
	author = {Yinfeng Yu and Wenbing Huang and Fuchun Sun and Changan Chen and Yikai Wang and Xiaohong Liu},
	year = {2022},
	booktitle = {ICLR},
}

Owner

YinfengYu