PyTorch implementation of Pay Attention to MLPs

Last update: Dec 13, 2022

Overview

gMLP

PyTorch implementation of Pay Attention to MLPs.

Quickstart

Clone this repository.

git clone https://github.com/jaketae/g-mlp.git

Navigate to the cloned directory. You can use the barebone gMLP model via

>>> from g_mlp import gMLP
>>> model = gMLP()

By default, the model comes with the following parameters:

gMLP(
    d_model=256,
    d_ffn=512,
    seq_len=256,
    num_layers=6,
)

Usage

The repository also contains gMLP models specifically for language modeling and image classification.

NLP

gMLPForLanguageModeling shares the same default parameters as gMLP, with num_tokens=10000 as an added parameter that represents the size of the token embedding table.

>>> from g_mlp import gMLPForLanguageModeling
>>> model = gMLPForLanguageModeling()
>>> tokens = torch.randint(0, 10000, (8, 256))
>>> model(tokens).shape
torch.Size([8, 256, 256])

Computer Vision

gMLPForImageClassification is a ViT-esque version of gMLP that includes a patch creating layer and a final classification head.

>>> from g_mlp import gMLPForImageClassification
>>> model = gMLPForImageClassification()
>>> images = torch.randn(8, 3, 256, 256)
>>> model(images).shape
torch.Size([8, 1000])

Summary

The authors of the paper present gMLP, an an attention-free all-MLP architecture based on spatial gating units. gMLP achieves parity with transformer models such as ViT and BERT on language and vision downstream tasks. The authors also show that gMLP scales with increased data and number of parameters, suggesting that self-attention is not a necessary component for designing performant models.

PyTorch implementation of Pay Attention to MLPs

Related tags

Overview

gMLP

Quickstart

Usage

NLP

Computer Vision

Summary

Resources

Owner

Jake Tae

All materials of Cassandra Event, Udyam'22

Adjust Decision Boundary for Class Imbalanced Learning

Experiments for Neural Flows paper

ANEA: Automated (Named) Entity Annotation for German Domain-Specific Texts

Unsupervised Representation Learning by Invariance Propagation

Online Multi-Granularity Distillation for GAN Compression (ICCV2021)

Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy" (ICLR 2022 Spotlight)

SMIS - Semantically Multi-modal Image Synthesis(CVPR 2020)

A curated list of awesome Machine Learning frameworks, libraries and software.

Repository sharing code and the model for the paper "Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes"

Official PyTorch implementation for paper Context Matters: Graph-based Self-supervised Representation Learning for Medical Images

Official implementation of "GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators" (NeurIPS 2020)

Related resources for our EMNLP 2021 paper

PyTorch implementation of Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction (ICCV 2021).

A modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (prediction model)

Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies

An open-source online reverse dictionary.

Doge-Prediction - Coding Club prediction ig

ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Pytorch Code for "Medical Transformer: Gated Axial-Attention for Medical Image Segmentation"