Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020)

Overview

Time-aware Large Kernel (TaLK) Convolutions (Lioutas et al., 2020)

This repository contains the source code, pre-trained models, as well as instructions to reproduce results for our paper Time-aware Large Kernel Convolutions (ICML 2020).

TaLK Convolutions is a sequence modeling method that uses an adaptive convolution operation that learns to predict the size of a summation kernel instead of using a fixed-sized learnable kernel matrix. It utilizes a fast parallelized implementation of the summed-area table, also known as the integral image operation, to efficiently calculate the convolution output that uses the summation kernel. We generate relative offsets for each timestep of the input sequence, which are used to adaptively expand the size of the summation kernel conditioned on the input. This method yields a time complexity of O(n), effectively making the sequence encoding process linear to the number of tokens.

Video Presentation:

Time-aware Large Kernel Convolutions (ICML 2020)

Citation:

@inproceedings{lioutas2020timeaware,
    author={Vasileios Lioutas and Yuhong Guo},
    title={Time-aware Large Kernel Convolutions},
    booktitle={Proceedings of the 37th International Conference on Machine Learning (ICML)},
    year={2020}
}

Setup

Requirements

  • PyTorch version >= 1.3.1
  • fairseq version >= 0.10.1
  • Python version >= 3.6
  • CUDA >= 10.1
  • NVIDIA's apex library (for mixed-precision training)

Clone this repository

git clone https://github.com/lioutasb/TaLKConvolutions.git
cd TaLKConvolutions

Efficient CUDA Kernels

In order to support the parallelization of TaLK Convolutions, we have developed our own CUDA primitives. To install the kernels, use the commands below. We tested compiling the kernels using CUDA 10.1 but if a future CUDA release does not work, please feel free to open an issue.

cd talkconv/talkconv_module/
python setup.py install

We are welcoming contributions from experienced CUDA developers regarding making the CUDA kernels more efficient.

Translation

Pre-trained models

Dataset Model Prepared test set
IWSLT14 German-English download (.pt) IWSLT14 test: download (.zip)
WMT16 English-German download (.pt) newstest2014: download (.zip)
WMT14 English-French download (.pt) newstest2014: download (.zip)

Preprocessing the training datasets

Please follow the instructions https://github.com/pytorch/fairseq/blob/master/examples/translation/README.md to preprocess the data.

IWSLT14 De-En

Training and evaluating TaLK Convolutions on a single GPU:

# Training
SAVE="checkpoints/talkconv_iwslt_deen"
mkdir -p $SAVE

CUDA_VISIBLE_DEVICES=0 \
fairseq-train data-bin/iwslt14.tokenized.de-en \
    --user-dir talkconv/talkconv_fairseq \
    --arch talkconv_iwslt_de_en \
    --optimizer adam  --fp16 --lr 0.0005 \
    --source-lang de --target-lang en --max-tokens 4000 \
    --min-lr '1e-09' --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --lr-scheduler inverse_sqrt \
    --dropout 0.3 --attention-dropout 0.1 --weight-dropout 0.1  \
    --max-update 85000 --warmup-updates 4000 --warmup-init-lr '1e-07' \
    --adam-betas '(0.9, 0.98)' --left-pad-source "False" --max-epoch 52 --seed 1024 \
    --save-dir $SAVE 

python utils/average_checkpoints.py --inputs $SAVE \
    --num-epoch-checkpoints 10 --output "${SAVE}/model.pt"

# Evaluation
fairseq-generate data-bin/iwslt14.tokenized.de-en --user-dir talkconv/talkconv_fairseq \
    --path "${SAVE}/model.pt" \
    --batch-size 128 --beam 5 --remove-bpe --lenpen 1.6 --gen-subset test --quiet 

WMT16 En-De

Training and evaluating TaLK Convolutions on WMT16 En-De using cosine scheduler on one machine with 8 NVIDIA GPUs:

# Training
SAVE="checkpoints/talkconv_wmt_ende_big"
mkdir -p $SAVE

python -m torch.distributed.launch --nproc_per_node 8 fairseq-train \
    data-bin/wmt16_en_de_bpe32k --fp16 --log-interval 100 --no-progress-bar --distributed-no-spawn \
    --user-dir talkconv/talkconv_fairseq \
    --max-update 30243 --share-all-embeddings --optimizer adam \
    --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --min-lr 1e-09 --update-freq 16 \
    --ddp-backend=no_c10d --max-tokens 3584 \
    --lr-scheduler cosine --warmup-init-lr 1e-7 --warmup-updates 10000 \
    --lr-shrink 1 --max-lr 0.001 --lr 1e-7 --min-lr 1e-9 --warmup-init-lr 1e-07 \
    --t-mult 1 --lr-period-updates 20000 \
    --arch talkconv_wmt_en_de_big \
    --save-dir $SAVE

# Checkpoint averaging
python utilss/average_checkpoints.py --inputs $SAVE \
    --num-epoch-checkpoints 10 --output "${SAVE}/model.pt"

# Evaluation on newstest2014
CUDA_VISIBLE_DEVICES=0 \
fairseq-generate data-bin/wmt16_en_de_bpe32k --user-dir talkconv/talkconv_fairseq \
  --path "${SAVE}/model.pt" \
  --batch-size 128 --beam 4 --remove-bpe --lenpen 0.35 --gen-subset test > wmt14_gen_ende.txt 

bash utils/compound_split_bleu.sh wmt14_gen_ende.txt 

WMT14 En-Fr

Training and evaluating TaLK Convolutions on WMT14 En-Fr using cosine scheduler on one machine with 8 NVIDIA GPUs:

# Training
SAVE="checkpoints/talkconv_wmt_enfr_big"
mkdir -p $SAVE
python -m torch.distributed.launch --nproc_per_node 8 fairseq-train \
    data-bin/wmt14_en_fr --fp16 --log-interval 100 --no-progress-bar --distributed-no-spawn \
    --user-dir talkconv/talkconv_fairseq \
    --max-update 80000 --share-all-embeddings --optimizer adam \
    --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --min-lr 1e-09 --update-freq 32 \
    --ddp-backend=no_c10d --max-tokens 1800 \
    --lr-scheduler cosine --warmup-init-lr 1e-7 --warmup-updates 10000 \
    --lr-shrink 1 --max-lr 0.001 --lr 1e-7 --min-lr 1e-9 --warmup-init-lr 1e-07 \
    --t-mult 1 --lr-period-updates 70000 \
    --arch talkconv_wmt_en_fr_big \
    --save-dir $SAVE

# Checkpoint averaging
python utils/average_checkpoints.py --inputs $SAVE \
    --num-epoch-checkpoints 10 --output "${SAVE}/model.pt"

# Evaluation
CUDA_VISIBLE_DEVICES=0 \
fairseq-generate data-bin/wmt14_en_fr --user-dir talkconv/talkconv_fairseq \
    --path "${SAVE}/model.pt" \
    --batch-size 128 --beam 6 --remove-bpe --lenpen 0.65 --gen-subset test --quiet 

License

This project is MIT-licensed. The license applies to the pre-trained models as well.

Owner
Vasileios Lioutas
PhD student at the University of British Columbia | M.Sc. in CS at Carleton University and ex-Machine Learning Researcher at Huawei Noah's Ark Lab
Vasileios Lioutas
MEDIALpy: MEDIcal Abbreviations Lookup in Python

A small python package that allows the user to look up common medical abbreviations.

Aberystwyth Systems Biology 7 Nov 09, 2022
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

UIS-RNN Overview This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm. UIS-RNN solves the problem of s

Google 1.4k Dec 28, 2022
SGMC: Spectral Graph Matrix Completion

SGMC: Spectral Graph Matrix Completion Code for AAAI21 paper "Scalable and Explainable 1-Bit Matrix Completion via Graph Signal Learning". Data Format

Chao Chen 8 Dec 12, 2022
Implementation of ProteinBERT in Pytorch

ProteinBERT - Pytorch (wip) Implementation of ProteinBERT in Pytorch. Original Repository Install $ pip install protein-bert-pytorch Usage import torc

Phil Wang 92 Dec 25, 2022
Basic yet complete Machine Learning pipeline for NLP tasks

Basic yet complete Machine Learning pipeline for NLP tasks This repository accompanies the article on building basic yet complete ML pipelines for sol

Ivan 20 Aug 22, 2022
OceanScript is an Esoteric language used to encode and decode text into a formulation of characters

OceanScript is an Esoteric language used to encode and decode text into a formulation of characters - where the final result looks like waves in the ocean.

SimBERT升级版(SimBERTv2)!

RoFormer-Sim RoFormer-Sim,又称SimBERTv2,是我们之前发布的SimBERT模型的升级版。 介绍 https://kexue.fm/archives/8454 训练 tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.6 下载

317 Dec 23, 2022
本插件是pcrjjc插件的重置版,可以独立于后端api运行

pcrjjc2 本插件是pcrjjc重置版,不需要使用其他后端api,但是需要自行配置客户端 本项目基于AGPL v3协议开源,由于项目特殊性,禁止基于本项目的任何商业行为 配置方法 环境需求:.net framework 4.5及以上 jre8 别忘了装jre8 别忘了装jre8 别忘了装jre8

132 Dec 26, 2022
ZUNIT - Toward Zero-Shot Unsupervised Image-to-Image Translation

ZUNIT Dependencies you can install all the dependencies by pip install -r requirements.txt Datasets Download CUB dataset. Unzip the birds.zip at ./da

Chen Yuanqi 9 Jun 24, 2022
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Antlr Project 13.6k Jan 05, 2023
Search-Engine - 📖 AI based search engine

Search Engine AI based search engine that was trained on 25000 samples, feel free to train on up to 1.2M sample from kaggle dataset, link below StackS

Vladislav Kruglikov 2 Nov 29, 2022
Code for hyperboloid embeddings for knowledge graph entities

Implementation for the papers: Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs, Nurendra Choudhary, Nikhil Rao,

30 Dec 10, 2022
lightweight, fast and robust columnar dataframe for data analytics with online update

streamdf Streamdf is a lightweight data frame library built on top of the dictionary of numpy array, developed for Kaggle's time-series code competiti

23 May 19, 2022
This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text.

Text Summarizer This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text. Team Members This mini-project was

1 Nov 16, 2021
Russian words synonyms and antonyms

ru_synonyms Russian words synonyms and antonyms. Install pip install git+https://github.com/ahmados/rusynonyms.git Usage from ru_synonyms import Anto

sumekenov 7 Dec 14, 2022
A paper list of pre-trained language models (PLMs).

Large-scale pre-trained language models (PLMs) such as BERT and GPT have achieved great success and become a milestone in NLP.

RUCAIBox 124 Jan 02, 2023
A spaCy wrapper of OpenTapioca for named entity linking on Wikidata

spaCyOpenTapioca A spaCy wrapper of OpenTapioca for named entity linking on Wikidata. Table of contents Installation How to use Local OpenTapioca Vizu

Universitätsbibliothek Mannheim 80 Jan 03, 2023
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.

Kashgari Overview | Performance | Installation | Documentation | Contributing 🎉 🎉 🎉 We released the 2.0.0 version with TF2 Support. 🎉 🎉 🎉 If you

Eliyar Eziz 2.3k Dec 29, 2022
Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine

Semantic search through Wikipedia with the Weaviate vector search engine Weaviate is an open source vector search engine with build-in vectorization a

SeMI Technologies 191 Dec 26, 2022