Implementation of Memformer, a Memory-augmented Transformer, in Pytorch

Last update: Nov 06, 2022

Overview

Memformer - Pytorch

Implementation of Memformer, a Memory-augmented Transformer, in Pytorch. The model keeps a fixed set of memory slots that are updated with attention as each segment is processed, and it is trained efficiently with Memory Replay Back-Propagation (MRBP) through time.
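To make "memory slots updated with attention" concrete, here is a minimal pure-Python sketch under simplifying assumptions: a single head, no learned projections, and each slot is simply replaced by its attention-weighted average of the segment tokens. The function names `softmax` and `update_memory` are hypothetical and are not part of the memformer package, whose actual update is likely more involved.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def update_memory(memory, tokens):
    """Hypothetical sketch: every memory slot acts as a query over the
    segment tokens and becomes the attention-weighted average of them."""
    dim = len(memory[0])
    scale = 1.0 / math.sqrt(dim)
    new_memory = []
    for slot in memory:
        # Scaled dot-product score between this slot and each token.
        scores = [scale * sum(q * k for q, k in zip(slot, tok)) for tok in tokens]
        weights = softmax(scores)
        # Weighted average of the token vectors, one dimension at a time.
        new_slot = [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        new_memory.append(new_slot)
    return new_memory

memory = [[1.0, 0.0], [0.0, 1.0]]               # 2 slots, dim 2
tokens = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]   # 3 segment tokens, dim 2
updated = update_memory(memory, tokens)
```

The number of slots stays fixed no matter how long the segment is, which is what lets the memory be carried from one segment to the next.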

Install

$ pip install memformer

Usage

Full encoder / decoder, as in the paper

import torch
from memformer import Memformer

model = Memformer(
    dim = 512,
    enc_num_tokens = 256,
    enc_depth = 2,
    enc_heads = 8,
    enc_max_seq_len = 1024,
    dec_num_tokens = 256,
    dec_depth = 2,
    dec_heads = 8,
    dec_max_seq_len = 1024,
    num_memory_slots = 128
)

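# three source segments of 1024 tokens each, processed one after another
src_seg_1 = torch.randint(0, 256, (1, 1024))
src_seg_2 = torch.randint(0, 256, (1, 1024))
src_seg_3 = torch.randint(0, 256, (1, 1024))

# target sequence, passed only together with the final segment
tgt = torch.randint(0, 256, (1, 1024))

# the memories returned for one segment are fed into the next call
enc_out1, mems1,    _ = model(src_seg_1) # (1, 1024, 512), (1, 128, 512), _
enc_out2, mems2,    _ = model(src_seg_2, mems = mems1)
enc_out3, mems3, loss = model(src_seg_3, tgt, mems = mems2)

loss.backward()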

Encoder only

import torch
from memformer import Memformer

model = Memformer(
    dim = 512,
    enc_num_tokens = 256,
    enc_heads = 8,
    enc_depth = 2,
    enc_max_seq_len = 1024,
    num_memory_slots = 128,
    num_mem_updates = 2,
    encoder_only = True       # use only the encoder; the model then returns the encoded output and the updated memories
)

src1 = torch.randint(0, 256, (1, 1024))
src2 = torch.randint(0, 256, (1, 1024))

enc1, mems1 = model(src1) # (1, 1024, 512), (1, 128, 512)
enc2, mems2 = model(src2, mems = mems1)
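The two calls above extend to any number of segments by carrying the returned memories forward in a loop. The sketch below assumes only the calling convention shown in this README (`model(src)` for the first segment, `model(src, mems = mems)` afterwards, each returning an `(encoded, mems)` pair). The helper `encode_segments` and the stand-in `toy_model` are hypothetical names used for illustration. In practice `model` would be the encoder-only Memformer.

```python
def encode_segments(model, segments, mems=None):
    """Run segments through `model` in order, feeding each call the
    memories returned by the previous one."""
    outputs = []
    for segment in segments:
        if mems is None:
            # First segment: no memories to pass yet.
            encoded, mems = model(segment)
        else:
            encoded, mems = model(segment, mems=mems)
        outputs.append(encoded)
    return outputs, mems

def toy_model(segment, mems=None):
    # Stand-in for an encoder-only model: "encodes" by doubling each
    # value and keeps a running sum of everything seen as its memory.
    previous = 0 if mems is None else mems
    return [2 * x for x in segment], previous + sum(segment)

outputs, final_mems = encode_segments(toy_model, [[1, 2], [3, 4], [5]])
```

With the real model, `segments` would be a list of token tensors such as `[src1, src2]`, and `final_mems` would be the memory tensor after the last segment.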

Memory Replay Back-Propagation

import torch
from memformer import Memformer, memory_replay_backprop

model = Memformer(
    dim = 512,
    num_memory_slots = 128,
    enc_num_tokens = 256,
    enc_depth = 2,
    enc_max_seq_len = 1024,
    dec_num_tokens = 256,
    dec_depth = 2,
    dec_max_seq_len = 1024
).cuda()

seq = torch.randint(0, 256, (1, 8192)).cuda()
seq_mask = torch.ones_like(seq).bool()   # ones_like already places the mask on the same device as seq

tgt = torch.randint(0, 256, (1, 512)).cuda()
tgt_mask = torch.ones_like(tgt).bool()

# the 8192-token source is split automatically into 8 segments (8192 / 1024)
memory_replay_backprop(
    model,
    src = seq,
    tgt = tgt,
    src_mask = seq_mask,
    tgt_mask = tgt_mask
)
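The segment count in the comment above follows from dividing the source length by the encoder's maximum sequence length. A minimal sketch of that arithmetic, assuming the source is cut into contiguous chunks of `enc_max_seq_len` tokens (the helper `split_into_segments` is a hypothetical name, and the library's internal splitting may differ in detail):

```python
def split_into_segments(tokens, max_seq_len):
    # Contiguous, non-overlapping chunks; the last one may be shorter
    # when len(tokens) is not a multiple of max_seq_len.
    return [tokens[i:i + max_seq_len] for i in range(0, len(tokens), max_seq_len)]

segments = split_into_segments(list(range(8192)), 1024)
num_segments = len(segments)   # 8192 / 1024 segments
```

On a tensor of shape `(1, 8192)`, the equivalent chunking can be written as `torch.split(seq, 1024, dim = -1)`.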

Citations

@inproceedings{
    anonymous2021memformer,
    title={Memformer: The Memory-Augmented Transformer},
    author={Anonymous},
    booktitle={Submitted to International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=_adSMszz_g9},
    note={under review}
}

Comments

WIP - MemformerEncoder
I'm always trying out all your awesome work on transformers. My problem is NER on very large texts with few examples.

Memformer is the first one so far to converge faster and yield better accuracy than RNN encoders such as LSTM, SRU and IndRNN. It is ridiculously better than everything else I tested, congratulations @lucidrains 🥳

I need to use the transformer as an encoder in my pipeline, to feed a CRF layer. So I modified the code to accept an already embedded input and to do only the encode step.

TODO:

[ ] Support Mask

[ ] Re-utilize code with Memformer class

Is this within the scope of the project?
opened by bratao 10

ETA on complete examples

@lucidrains As I asked about the feedback-transformer, I was also wondering about this memformer implementation, as I would love to try it. Is there any ETA on complete examples here? They would be much appreciated. Thanks.

And similarly, I would love to see a simple example for custom line-by-line TXT datasets as well.

Thank you again :)

opened by asigalov61 0

Releases (0.3.1)

0.3.1 (Nov 13, 2020)
0.3.0 (Nov 6, 2020)
0.2.1 (Nov 3, 2020)
0.2.0 (Nov 3, 2020)
0.1.1 (Nov 3, 2020)
0.1.0 (Nov 2, 2020)
0.0.9 (Oct 31, 2020)
0.0.8 (Oct 30, 2020)
0.0.7 (Oct 30, 2020)
0.0.6 (Oct 30, 2020)
0.0.5 (Oct 29, 2020)
0.0.4 (Oct 29, 2020)
0.0.3 (Oct 29, 2020)
0.0.2 (Oct 28, 2020)
0.0.1 (Oct 28, 2020)

Owner

Phil Wang

Working with Attention. It's all we need
