A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

Last update: Dec 02, 2022

Overview

torch-cif

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" (https://arxiv.org/abs/1905.11235).

Usage

from typing import Optional, Tuple

from torch import Tensor


def cif_function(
    input: Tensor,
    alpha: Tensor,
    beta: float = 1.0,
    padding_mask: Optional[Tensor] = None,
    target_lengths: Optional[Tensor] = None,
    max_output_length: Optional[int] = None,
    eps: float = 1e-4,
) -> Tuple[Tensor, Tensor, Tensor]:
    r""" A batched computation implementation of continuous integrate and fire (CIF)
    https://arxiv.org/abs/1905.11235

    Args:
        input (Tensor): (N, S, C) Input features to be integrated.
        alpha (Tensor): (N, S) Weights corresponding to each element in the
            input. It is expected to be the output of a sigmoid function.
        beta (float): The threshold used to determine firing.
        padding_mask (Tensor, optional): (N, S) A binary mask representing
            padded elements in the input.
        target_lengths (Tensor, optional): (N,) Desired length of the targets
            for each sample in the minibatch.
        max_output_length (int, optional): The maximum valid output length used
            in inference. The alpha is scaled down if the sum exceeds this value.
        eps (float, optional): Epsilon to prevent underflow for divisions.
            Default: 1e-4

    Returns: Tuple (output, feat_lengths, alpha_sum)
        output (Tensor): (N, T, C) The output integrated from the source.
        feat_lengths (Tensor): (N,) The output length for each element in batch.
        alpha_sum (Tensor): (N,) The sum of alpha for each element in batch.
            Can be used to compute the quantity loss.
    """

Note

ℹ️ This is a work-in-progress project; the implementation is still being tested.

This implementation uses cumsum and floor to determine the firing positions, and uses scatter to merge the weighted source features.
Run the tests with python test.py (requires pip install expecttest).
Feel free to contact me if you find bugs in the code.
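
To make the cumsum / floor / scatter idea from this note concrete, here is a simplified sketch. It is not the repository's code: the function name cif_sketch is hypothetical, it assumes 0 <= alpha <= beta (so a frame crosses at most one firing boundary), and it ignores padding_mask, target_lengths, max_output_length and eps. It also simply drops the unfired tail, which may differ from how cif_function treats it.

```python
import torch


def cif_sketch(x, alpha, beta=1.0):
    """Simplified CIF for illustration; assumes 0 <= alpha <= beta."""
    N, S, C = x.shape
    csum = alpha.cumsum(dim=1)                   # (N, S) accumulated weight
    right_idx = torch.floor(csum / beta).long()  # output slot after each frame
    # Output slot before each frame: the previous frame's slot, 0 for the first.
    left_idx = torch.cat(
        [torch.zeros_like(right_idx[:, :1]), right_idx[:, :-1]], dim=1
    )
    fired = right_idx > left_idx                 # frame crosses a boundary

    # A firing frame is split: the part past the boundary goes to the next slot.
    right_weight = torch.where(
        fired, csum - right_idx * beta, torch.zeros_like(alpha)
    )
    left_weight = alpha - right_weight

    feat_lengths = right_idx[:, -1]              # number of fired outputs
    T = int(feat_lengths.max()) + 1              # one extra slot catches the tail
    output = x.new_zeros(N, T, C)
    output.scatter_add_(
        1, left_idx.unsqueeze(-1).expand(-1, -1, C), left_weight.unsqueeze(-1) * x
    )
    output.scatter_add_(
        1, right_idx.unsqueeze(-1).expand(-1, -1, C), right_weight.unsqueeze(-1) * x
    )

    # Zero the slots beyond each sample's length, then drop the extra tail slot.
    valid = torch.arange(T, device=x.device).unsqueeze(0) < feat_lengths.unsqueeze(1)
    output = output * valid.unsqueeze(-1)
    return output[:, : T - 1], feat_lengths, alpha.sum(dim=1)


# Two sequences of five one-dimensional frames with values 1..5.
x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0]).view(1, 5, 1).repeat(2, 1, 1)
alpha = torch.tensor([[0.5, 0.75, 0.25, 0.5, 0.5],
                      [0.25, 0.25, 0.25, 0.25, 0.25]])
out, lens, asum = cif_sketch(x, alpha)
print(out.squeeze(-1))  # first row: 1.5, 3.25; second row: 2.5, 0.0
print(lens)             # 2 and 1 fired outputs
```

In the first sequence, frame 2 (weight 0.75) pushes the accumulated weight from 0.5 to 1.25, so 0.5 of it completes the first output and the remaining 0.25 starts the second. Each fired output therefore integrates exactly beta worth of weight.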

Reference

CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition

Owner

張致強
