GPT, but made only out of gMLPs

Last update: Dec 01, 2022

Overview

GPT - gMLP

This repository will attempt to crack long context autoregressive language modeling (GPT) using variations of gMLPs. Specifically, it will contain a variant that does gMLP for local sliding windows. The hope is to be able to stretch a single GPU to be able to train context lengths of 4096 and above efficiently and well.

GPT is technically a misnomer now, since there will be no attention (transformer) at all contained in the architecture.

Install

$ pip install g-mlp-gpt

Usage

import torch
from g_mlp_gpt import gMLPGPT

model = gMLPGPT(
    num_tokens = 20000,
    dim = 512,
    depth = 4,
    seq_len = 1024,
    window_size = (128, 256, 512, 1024) # window sizes for each depth
)

x = torch.randint(0, 20000, (1, 1000))
logits = model(x) # (1, 1000, 20000)

16k context length

import torch
from g_mlp_gpt import gMLPGPT

model = gMLPGPT(
    num_tokens = 20000,
    dim = 512,
    seq_len = 16384,
    depth = 8,
    reversible = True,
    window = (128, 128, 256, 512, 1024, 1024, 2048, 2048, 4096, 4096, 8192, 8192),
    axial = (1, 1, 1, 1, 1, 1, 2, 2, 4, 4, 8, 8)
).cuda()

x = torch.randint(0, 20000, (1, 16384)).cuda()
logits = model(x) # (1, 16384, 20000)

Citations

@misc{liu2021pay,
    title   = {Pay Attention to MLPs}, 
    author  = {Hanxiao Liu and Zihang Dai and David R. So and Quoc V. Le},
    year    = {2021},
    eprint  = {2105.08050},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

AI-Bot - 一个基于watermelon改造的OpenAI-GPT-2的智能机器人

AI-Bot 一个基于watermelon改造的OpenAI-GPT-2的智能机器人在Binder上直接运行测试目前有两种实现方式 TF2的GPT-2 TF

9 Nov 16, 2022

Building Ellee — A GPT-3 and Computer Vision Powered Talking Robotic Teddy Bear With Human Level Conversation Intelligence

Using an object detection and facial recognition system built on MobileNetSSDV2 and Dlib and running on an NVIDIA Jetson Nano, a GPT-3 model, Google Speech Recognition, Amazon Polly and servo motors, I built Ellee - a robotic teddy bear who can move her head and converse naturally.

24 Oct 26, 2022

MAGMA - a GPT-style multimodal model that can understand any combination of images and language

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning Authors repo (alphabetical) Constantin (CoEich), Mayukh (Mayukh

331 Jan 3, 2023

Simple, but essential Bayesian optimization package

BayesO: A Bayesian optimization framework in Python Simple, but essential Bayesian optimization package. http://bayeso.org Online documentation Instal

74 Dec 5, 2022

Like a cowsay but without cows!

Foxsay This is a simple program that generates pictures of a cute fox with a message. It is like a cowsay but without cows! Fox girls are better! Usag

28 Feb 20, 2022

Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

81 Nov 26, 2022

Like Dirt-Samples, but cleaned up

Clean-Samples Like Dirt-Samples, but cleaned up, with clear provenance and license info (generally a permissive creative commons licence but check the

39 Nov 30, 2022

The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

49 Dec 22, 2022

A Pytorch implementation of CVPR 2021 paper "RSG: A Simple but Effective Module for Learning Imbalanced Datasets"

RSG: A Simple but Effective Module for Learning Imbalanced Datasets (CVPR 2021) A Pytorch implementation of our CVPR 2021 paper "RSG: A Simple but Eff

120 Dec 12, 2022

GPT, but made only out of gMLPs

Related tags

Overview

GPT - gMLP

Install

Usage

Citations

You might also like...

AI-Bot - 一个基于watermelon改造的OpenAI-GPT-2的智能机器人

Building Ellee — A GPT-3 and Computer Vision Powered Talking Robotic Teddy Bear With Human Level Conversation Intelligence

MAGMA - a GPT-style multimodal model that can understand any combination of images and language

Simple, but essential Bayesian optimization package

Like a cowsay but without cows!

Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Like Dirt-Samples, but cleaned up

The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

A Pytorch implementation of CVPR 2021 paper "RSG: A Simple but Effective Module for Learning Imbalanced Datasets"

Releases(0.0.15)

0.0.15(May 25, 2021)

0.0.14(May 25, 2021)

0.0.12(May 24, 2021)

0.0.11(May 23, 2021)

0.0.10(May 23, 2021)

0.0.9(May 21, 2021)

0.0.8(May 21, 2021)

0.0.7(May 21, 2021)

0.0.6(May 21, 2021)

0.0.5(May 20, 2021)

0.0.4(May 20, 2021)

0.0.3(May 20, 2021)

0.0.2(May 20, 2021)

0.0.1(May 20, 2021)

Owner

Phil Wang

PyTorch implementation of some learning rate schedulers for deep learning researcher.

Custom IMDB Dataset is extracted between 2020-2021 and custom distilBERT model is trained for movie success probability prediction

Code for models used in Bashiri et al., "A Flow-based latent state generative model of neural population responses to natural images".

This is the PyTorch implementation of GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation

3D-Transformer: Molecular Representation with Transformer in 3D Space

Cervix ROI Segmentation Using U-NET

Code for 1st place solution in Sleep AI Challenge SNU Hospital

Implicit Graph Neural Networks

A Pytorch implement of paper "Anomaly detection in dynamic graphs via transformer" (TADDY).

The end-to-end platform for building voice products at scale

A Pose Estimator for Dense Reconstruction with the Structured Light Illumination Sensor

Code repository for the work "Multi-Domain Incremental Learning for Semantic Segmentation", accepted at WACV 2022

QuALITY: Question Answering with Long Input Texts, Yes!

A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019).

Contrastive Learning of Structured World Models

The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting".

验证码识别 深度学习 tensorflow 神经网络

Disentangled Lifespan Face Synthesis

Predictive Modeling on Electronic Health Records(EHR) using Pytorch

Official PyTorch implementation of "Adversarial Reciprocal Points Learning for Open Set Recognition"

验证码识别深度学习 tensorflow 神经网络