N-Grammer - Pytorch

Implementation of N-Grammer, augmenting Transformers with latent n-grams, in PyTorch

Install

$ pip install n-grammer-pytorch

Usage

import torch
from n_grammer_pytorch import VQNgrammer

vq_ngram = VQNgrammer(
    num_clusters = 1024,             # number of clusters for the vector quantization step
    dim_per_head = 32,               # dimension per attention head
    num_heads = 16,                  # number of attention heads
    ngram_vocab_size = 768 * 256,    # n-gram vocab size (buckets for the hashed latent bigram ids)
    ngram_emb_dim = 16,              # n-gram embedding dimension
    decay = 0.999                    # exponential moving average decay for the cluster centroids
)

x = torch.randn(1, 1024, 32 * 16)   # (batch, seq, num_heads * dim_per_head)
vq_ngram(x) # (1, 1024, 32 * 16) - same shape, now augmented with latent n-gram information
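To show where the module sits in a network, below is a minimal sketch that augments token embeddings with VQNgrammer before a stack of Transformer layers. The VQNgrammer call matches the usage above; the embedding table, vocabulary size, and encoder stack are hypothetical stand-ins for your own model (the paper applies the n-gram augmentation to the early uni-gram representations).

import torch
import torch.nn as nn
from n_grammer_pytorch import VQNgrammer

dim = 32 * 16                                  # num_heads * dim_per_head, as in the usage example

token_emb = nn.Embedding(50000, dim)           # hypothetical vocabulary size

vq_ngram = VQNgrammer(
    num_clusters = 1024,
    dim_per_head = 32,
    num_heads = 16,
    ngram_vocab_size = 768 * 256,
    ngram_emb_dim = 16,
    decay = 0.999
)

encoder_layer = nn.TransformerEncoderLayer(d_model = dim, nhead = 16, batch_first = True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers = 6)   # stand-in Transformer stack

tokens = torch.randint(0, 50000, (1, 1024))
x = token_emb(tokens)     # (1, 1024, 512) uni-gram embeddings
x = vq_ngram(x)           # (1, 1024, 512) augmented with latent n-gram embeddings
out = encoder(x)          # the rest of the network proceeds as usual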

Learning Rates

Like product key memories, the N-Grammer parameters need a higher learning rate than the rest of the network (1e-2 is recommended in the paper). The repository offers an easy way to generate the parameter groups.

from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_parameters

# given your root model, this helper finds all the VQNgrammer modules and their embedding parameters
ngrammer_parameters, other_parameters = get_ngrammer_parameters(model)

optim = Adam([
    {'params': other_parameters},
    {'params': ngrammer_parameters, 'lr': 1e-2}
], lr = 3e-4)

Or, even more simply

from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_param_groups

param_groups = get_ngrammer_param_groups(model) # builds the parameter groups automatically, with the learning rate set to 1e-2 for the N-Grammer parameters
optim = Adam(param_groups, lr = 3e-4)
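For reference, the grouping itself is simple to reproduce by hand: walk the module tree, collect the parameters owned by any VQNgrammer submodule, and leave the rest in a default group. The sketch below illustrates that idea; it is not the package's actual implementation, and separate_ngrammer_parameters is a hypothetical name.

import torch.nn as nn
from torch.optim import Adam
from n_grammer_pytorch import VQNgrammer

def separate_ngrammer_parameters(model: nn.Module):
    # collect parameters owned by any VQNgrammer submodule, deduplicated by identity
    ngrammer_ids, ngrammer_params = set(), []
    for module in model.modules():
        if isinstance(module, VQNgrammer):
            for p in module.parameters():
                if id(p) not in ngrammer_ids:
                    ngrammer_ids.add(id(p))
                    ngrammer_params.append(p)

    # everything else stays in the default group
    other_params = [p for p in model.parameters() if id(p) not in ngrammer_ids]
    return ngrammer_params, other_params

# model is your root nn.Module, as above
ngrammer_params, other_params = separate_ngrammer_parameters(model)
optim = Adam([
    {'params': other_params},
    {'params': ngrammer_params, 'lr': 1e-2}
], lr = 3e-4)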

Citations

@inproceedings{anonymous2021ngrammer,
    title   = {N-grammer: Augmenting Transformers with latent n-grams},
    author  = {Anonymous},
    year    = {2021},
    url     = {https://openreview.net/forum?id=GxjCYmQAody}
}

Comments

  • error when passing `concat_ngrams=False`

    In this assert statement, the condition inside not(...) is actually the required condition, so the assert rejects exactly the configuration the error message asks for (a corrected check is sketched after this comment):

    assert not (not concat_ngrams and dim_per_head == ngram_emb_dim), 'unigram head dimension must be equal to ngram embedding dimension if not concatting'

    https://github.com/lucidrains/n-grammer-pytorch/blob/main/n_grammer_pytorch/n_grammer_pytorch.py#L149

    opened by yiyixuxu 1
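
The fix follows directly from the error message: the invalid configuration is "not concatting AND the dimensions differ", so the equality inside the negation should be inverted. A sketch of the corrected assert:

assert not (not concat_ngrams and dim_per_head != ngram_emb_dim), 'unigram head dimension must be equal to ngram embedding dimension if not concatting'

# equivalently, stated positively
assert concat_ngrams or dim_per_head == ngram_emb_dim, 'unigram head dimension must be equal to ngram embedding dimension if not concatting'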