L3DAS22 challenge supporting API

Overview

L3DAS22 challenge supporting API

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results. We provide easy-to-use instruction to produce the results included in our paper. Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360(). Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task1 the function returns 2 numpy arrays contatining:

  • Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For Task2 the function returns 2 numpy arrays contatining:

  • Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluaton metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)

To compute the challenge metrics for our basiline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

In case you want to evaluate our pre-trained models, please add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das website Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script asserts if:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.
L3DAS
A quick and dirty script to scan the network, find default credentials on services and post a message to a Slack channel with the results.

A quick and dirty script to scan the network, find default credentials on services and post a message to a Slack channel with the results.

Security Weekly 11 Jun 03, 2022
StudyLion is a Discord bot that tracks members' study and work time while offering members to view their statistics and use productivity tools such as: To-do lists, Pomodoro timers, reminders, and much more.

StudyLion - Discord Productivity Bot StudyLion is a Discord bot that tracks members' study and work time while offering members the ability to view th

45 Dec 26, 2022
Azure Neural Speech Service TTS

Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file.

Rodney 1 Oct 11, 2021
Automatically Message From Discord Account

Discord-AutoMessage A robust and versatile solution for automated social interactions HOW TO INSTALL Open cmd cd into your project directory Run the f

13 Jul 11, 2022
A simple API wrapper for the Tenor API

Gifpy A simple API wrapper for the Tenor API Installation Python 3.9 or higher is recommended python3 -m pip install gifpy Clone repository: $ git cl

Juan Ignacio Battiston 4 Dec 22, 2021
2b2t Priority queue discord bot announcer

2b2t Priority queue discord bot announcer Commands !prioq - Checks the priority queue length and sends it. !start - Starts a loop that sends the sta

Gumi 5 Jun 06, 2022
A fork of discord.py

discord.py A modern, easy to use, feature-rich, and async ready API wrapper for Discord written in Python. The Future of discord.py Please read the gi

1 Dec 19, 2021
A Simple Google Translate Bot By VndGroup ❤️ Made With Python

VndGroup Google Translator Heroku Deploy ❤️ Functions This Bot Can Translate 95 Languages We Can Set Custom Language Group Support Mandatory Vars [+]

Venuja Sadew 1 Oct 09, 2022
Python based Discord Bot with a simple music player

C32 Discord Bot Discord bot that plays music Table Of Contents About the Project Built With Acknowledgements About The Project Play music using the !p

Christopher Burwell 2 Oct 17, 2021
Autofill HZDR Zeitman entries

Zeitman_autofill Filling out Zeitman is boring. This script might make some of the pain go away. Requirements The selenium package and Chrome webdrive

Tim Callow 8 Mar 14, 2022
Sielzz Music adalah proyek bot musik telegram, memungkinkan Anda memutar musik di telegram grup obrolan suara.

Hi, I am: Requirements 📝 FFmpeg NodeJS nodesource.com Python 3.8 or higher PyTgCalls MongoDB Get STRING_SESSION from below: 🎖 History Features 🔮 Th

1 Nov 04, 2021
Georeferencing large amounts of data for free.

Geolocate Georeferencing large amounts of data for free. Special thanks to @brunodepauloalmeida and the whole team for the contributions. How? It's us

Gabriel Gazola Milan 23 Dec 30, 2022
自动每天给女友发邮件

github acitons 发邮件 python 脚本 每天 7点半左右给女朋友发送邮件 天气来自: http://www.tianqiapi.com/ 文字图片来源:http://wufazhuce.com/ 风景图:https://qqlykm.cn/api/fengjing 土味情话:htt

gogobody 7 May 12, 2022
A telegram bot that sends a meme a day, from reddit's top meme of the day

MemeBot A telegram bot that sends a meme a day, from reddit's top meme of the day You can use the bot either with an external scheduler (ex: pythonany

Michele Vitulli 1 Dec 13, 2021
Embed the Duktape JS interpreter in Python

Introduction Pyduktape is a python wrapper around Duktape, an embeddable Javascript interpreter. On top of the interpreter wrapper, pyduktape offers e

Stefano 78 Dec 15, 2022
Auto Moderation is a powerfull moderation bot

Auto Moderation.py Auto Moderation a powerful Moderation Discord Bot 🎭 Futures Moderation Auto Moderation 🚀 Installation git clone https://github.co

G∙MAX 2 Apr 02, 2022
Python SDK for 42DI

42di Python SDK Install pip install git+https://github.com/42di/python-sdk import import di #42di import pandas_datareader as pdr Init SDK project =

42DI 2 Nov 03, 2021
Simple integrate of API musixmatch.com with python

Python Musixmatch Simple integrate of API musixmatch.com with python Quick start $ pip install pymusixmatch or $ python setup.py install Authenticatio

Hudson Brendon 79 Dec 20, 2022
Herramienta para transferir eventos de Sucuri WAF hacia Azure Monitor Log Analytics.

Transfiere eventos de Sucuri hacia Azure LogAnalytics Script para transferir eventos del Sucuri Web Application Firewall (WAF) hacia Azure LogAnalytic

CSIRT-RD 1 Dec 22, 2021
⚡ ʑɠ ცơɬ Is One Of The Fastest & Smoothest Bot On Telegram Based on Telethon ⚡

『ʑɠ ცơɬ』 ⚡ ʑɠ ცơɬ Is One Of The Fastest & Smoothest Bot On Telegram Based on Telethon ⚡ Status Of Bot Telegram 🏪 Dєρℓογ το нєяοκυ Variables APP_ID =

ʑɑʑɓɦɑɪ 0 Feb 12, 2022