The code for two papers: Feedback Transformer and Expire-Span.

Last update: Dec 25, 2022

Related tags

Deep Learning transformer-sequential

Overview

transformer-sequential

This repo contains the code for two papers:

Feedback Transformer
Expire-Span

The training code is structured for long sequential modeling with Transformer-like architectures.

Requirements

You will need a CUDA-enabled GPU to run the code.

Setup

Run the following:

pip install -r requirements.txt

Feedback Transformer

Introduced in Addressing Some Limitations of Transformers with Feedback Memory.

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Feedback Transformer	77M	0.984	0.962

Numbers are Bits-Per-Character

bash experiments/feedback/enwik8.sh

Algorithmic

Model	3 Variable	5 Variable
Transformer	33.7	37.5
Feedback Transformer	99.1	92.6

Numbers are % Accuracy on Test

bash experiments/feedback/algorithmic_3var.sh
bash experiments/feedback/algorithmic_5var.sh

Expire-Span

Introduced in Not All Memories are Created Equal: Learning to Expire.

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Expire-Span 12L	38M	1.014	0.994

Numbers are Bits-Per-Character

bash experiments/expire_span/enwik8.sh

Object Collision

Model	Maximum Span	Test Error (%)
Expire-Span	16k	52.2
Expire-Span	32k	36.7
Expire-Span	64k	26.7

bash experiments/expire_span/object_collision_16k.sh
bash experiments/expire_span/object_collision_32k.sh
bash experiments/expire_span/object_collision_64k.sh

License

The code is licensed under CC-BY-NC license. See the LICENSE file for more details.

The code for two papers: Feedback Transformer and Expire-Span.

Related tags

Overview

transformer-sequential

Requirements

Setup

Feedback Transformer

Running Experiments from the Paper

enwik8

Algorithmic

Expire-Span

Running Experiments from the Paper

enwik8

Object Collision

License

Owner

Facebook Research

A booklet on machine learning systems design with exercises

Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling"

An algorithm that handles large-scale aerial photo co-registration, based on SURF, RANSAC and PyTorch autograd.

This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

A whale detector design for the Kaggle whale-detector challenge!

Fast Neural Style for Image Style Transform by Pytorch

Deep Video Matting via Spatio-Temporal Alignment and Aggregation [CVPR2021]

《Train in Germany, Test in The USA: Making 3D Object Detectors Generalize》(CVPR 2020)

A hue shift helper for OBS

official code for dynamic convolution decomposition

A playable implementation of Fully Convolutional Networks with Keras.

Rational Activation Functions - Replacing Padé Activation Units

Novel and high-performance medical image classification pipelines are heavily utilizing ensemble learning strategies

Siamese-nn-semantic-text-similarity - A repository containing comprehensive Neural Networks based PyTorch implementations for the semantic text similarity task

Code to train models from "Paraphrastic Representations at Scale".

Code for MarioNette: Self-Supervised Sprite Learning, in NeurIPS 2021

Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"

MultiTaskLearning - Multi Task Learning for 3D segmentation

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.