An implementation of the efficient attention module.

Last update: Dec 15, 2022

Overview

Efficient Attention

Description

Efficient attention is an attention mechanism that substantially reduces memory and computational cost while retaining exactly the same expressive power as conventional dot-product attention. The illustration above compares the two types of attention. The efficient attention module is a drop-in replacement for the non-local module (Wang et al., 2018) that:

uses fewer resources to achieve the same accuracy;
achieves higher accuracy with the same resource constraints (by allowing more insertions); and
is applicable in domains and models where the non-local module is not (due to resource constraints).
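
The saving comes from the order of the matrix products. The following is a minimal NumPy sketch, not the repository's code, assuming softmax normalization applied to each query over channels and to each key channel over positions. The function names `efficient_attention` and `explicit_attention` are hypothetical. The sketch checks only that multiplying the keys with the values first gives the same output as building the full n-by-n attention map; it does not compare against softmax(QK^T)V.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def efficient_attention(q, k, v):
    """q, k: (n, d_k); v: (n, d_v). Never builds an (n, n) matrix."""
    q = softmax(q, axis=1)  # normalize each query over its channels
    k = softmax(k, axis=0)  # normalize each key channel over positions
    context = k.T @ v       # (d_k, d_v) global context, independent of n
    return q @ context      # (n, d_v)


def explicit_attention(q, k, v):
    """Same normalizations, but materializes the (n, n) attention map."""
    attn = softmax(q, axis=1) @ softmax(k, axis=0).T  # (n, n)
    return attn @ v, attn


rng = np.random.default_rng(0)
n, d_k, d_v = 64, 8, 16
q = rng.normal(size=(n, d_k))
k = rng.normal(size=(n, d_k))
v = rng.normal(size=(n, d_v))

fast = efficient_attention(q, k, v)
slow, attn = explicit_attention(q, k, v)
print(np.allclose(fast, slow))
```

In this sketch the largest intermediate in `efficient_attention` has shape (d_k, d_v), while `explicit_attention` stores an (n, n) map, so the memory of the former grows linearly with the number of positions n.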

Resources

YouTube:

Presentation: https://youtu.be/_wnjhTM04NM

bilibili (for users in Mainland China):

Presentation: https://www.bilibili.com/video/BV1tK4y1f7Rm
Presentation in Chinese: https://www.bilibili.com/video/bv1Gt4y1Y7E3

Implementation details

This repository implements the efficient attention module with softmax normalization, output reprojection, and residual connection.

Features not in the paper

This repository additionally implements the multi-head mechanism, which was not in the paper. To learn more about the mechanism, refer to Vaswani et al. (2017).
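
As a rough illustration of how the pieces named in this section fit together, here is a NumPy sketch of a multi-head variant with softmax normalization, output reprojection, and a residual connection. It is a sketch under stated assumptions rather than the repository's implementation: the function name, the weight names (`w_q`, `w_k`, `w_v`, `w_out`), and the choice to split key and value channels evenly across heads are all hypothetical, and the input is assumed to be a feature map flattened to shape (n, c).

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_efficient_attention(x, w_q, w_k, w_v, w_out, heads):
    """x: (n, c); w_q, w_k: (c, d_k); w_v: (c, d_v); w_out: (d_v, c).

    Hypothetical layout: key and value channels are split evenly
    across heads, and each head runs efficient attention on its slice.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k, d_v = q.shape[1], v.shape[1]
    if d_k % heads or d_v % heads:
        raise ValueError("d_k and d_v must be divisible by heads")
    hk, hv = d_k // heads, d_v // heads

    outputs = []
    for h in range(heads):
        q_h = softmax(q[:, h * hk:(h + 1) * hk], axis=1)  # over channels
        k_h = softmax(k[:, h * hk:(h + 1) * hk], axis=0)  # over positions
        v_h = v[:, h * hv:(h + 1) * hv]
        outputs.append(q_h @ (k_h.T @ v_h))               # (n, hv)

    attended = np.concatenate(outputs, axis=1)            # (n, d_v)
    return x + attended @ w_out                           # reproject + residual


rng = np.random.default_rng(0)
n, c, d_k, d_v, heads = 32, 16, 8, 8, 4
x = rng.normal(size=(n, c))
w_q, w_k = rng.normal(size=(c, d_k)), rng.normal(size=(c, d_k))
w_v, w_out = rng.normal(size=(c, d_v)), rng.normal(size=(d_v, c))

y = multi_head_efficient_attention(x, w_q, w_k, w_v, w_out, heads)
print(y.shape)
```

Because of the residual connection, the output has the same shape as the input, and a zero reprojection matrix makes the function an identity mapping in this sketch.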

Citation

The paper appeared at WACV 2021. If you use, compare with, or refer to this work, please cite:

@inproceedings{shen2021efficient,
    author = {Zhuoran Shen and Mingyuan Zhang and Haiyu Zhao and Shuai Yi and Hongsheng Li},
    title = {Efficient Attention: Attention with Linear Complexities},
    booktitle = {WACV},
    year = {2021},
}

Owner

Shen Zhuoran
