PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"

Last update: Aug 19, 2022

Overview

Implementation of the Sheffield entry for the first Clarity enhancement challenge (CEC1)

This repository contains the PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing", the Sheffield entry for the first Clarity enhancement challenge (CEC1). The system consists of a Conv-TasNet-based denoising module and a finite-impulse-response (FIR) filter-based amplification module. A differentiable approximation to the Cambridge MSBG model released in CEC1 is used in the loss function.
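To illustrate why an FIR-based amplification stage can be trained end to end, here is a minimal sketch of a learnable FIR filter in PyTorch. It is not the code from this repository: the class name `FIRAmplifier`, the tap count, the identity initialisation, and sharing one filter across channels are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FIRAmplifier(nn.Module):
    """Hypothetical learnable FIR filter (illustrative, not the repository's module)."""

    def __init__(self, num_taps: int = 65):
        super().__init__()
        if num_taps % 2 == 0:
            raise ValueError("num_taps must be odd so the output length matches the input")
        # Assumption: start from a unit impulse, so the filter is initially an identity.
        taps = torch.zeros(num_taps)
        taps[num_taps // 2] = 1.0
        self.taps = nn.Parameter(taps)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, time); the same filter is applied to every channel.
        batch, channels, time = x.shape
        # F.conv1d computes cross-correlation, so the taps are flipped to get a true convolution.
        kernel = self.taps.flip(0).view(1, 1, -1)
        y = F.conv1d(
            x.reshape(batch * channels, 1, time),
            kernel,
            padding=self.taps.numel() // 2,
        )
        return y.reshape(batch, channels, time)


# Usage: filter a batch of two-channel signals (random data stands in for denoised speech).
amp = FIRAmplifier(num_taps=65)
signal = torch.randn(2, 2, 16000)
amplified = amp(signal)
```

Because the taps are an `nn.Parameter` and the filtering is an ordinary convolution, gradients from any differentiable loss computed on `amplified` flow back to the filter coefficients.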

Requirements

To run the training recipe of the amplification module, the MSBG package and PyTorch STOI are needed.

Training

The system is built in two stages. In the first stage, the Conv-TasNet-based denoising module is trained using the scripts in recipe_den_convtasnet. In the second stage, the FIR-based amplification module is trained using the scripts in recipe_amp_fir. The MBSTOI folder contains the MBSTOI implementation from the CEC1 project, together with a DBSTOI implementation.

References

[1] Luo Y, Mesgarani N. Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(8): 1256-1266.
[2] Andersen A H, de Haan J M, Tan Z H, et al. Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions[J]. Speech Communication, 2018, 102: 1-13.
[3] Taal C H, Hendriks R C, Heusdens R, et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech[C]. ICASSP 2010, Dallas, Texas.

Citation

If you use this work, please cite:

@inproceedings{tutwo,
  title={A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing},
  author={Tu, Zehai and Zhang, Jisi and Ma, Ning and Barker, Jon},
  year={2021},
  booktitle={The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021)},
}
