IOT: Instance-wise Layer Reordering for Transformer Structures

Introduction

This repository contains the code for the Instance-wise Ordered Transformer (IOT), introduced in the ICLR 2021 paper IOT: Instance-wise Layer Reordering for Transformer Structures.

If you find this work helpful in your research, please cite as:

@inproceedings{
zhu2021iot,
title={{\{}IOT{\}}: Instance-wise Layer Reordering for Transformer Structures},
author={Jinhua Zhu and Lijun Wu and Yingce Xia and Shufang Xie and Tao Qin and Wengang Zhou and Houqiang Li and Tie-Yan Liu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=ipUPfYxWZvM}
}

Requirements and Installation

  • PyTorch version == 1.0.0
  • Python version >= 3.5

To install IOT:

git clone https://github.com/instance-wise-ordered-transformer/IOT
cd IOT
pip install --editable .
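
Before training, it may help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils) and a `python` executable on the PATH.

```shell
#!/bin/bash
# Hypothetical helper (not part of IOT): succeeds if version $1 >= version $2.
# Assumes a `sort` that supports -V (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report the interpreter and PyTorch versions, if available.
if command -v python >/dev/null 2>&1; then
    py_version=$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if version_ge "$py_version" "3.5"; then
        echo "Python $py_version: OK (>= 3.5)"
    else
        echo "Python $py_version: too old, need >= 3.5"
    fi
    torch_version=$(python -c 'import torch; print(torch.__version__)' 2>/dev/null || echo "not installed")
    echo "PyTorch: $torch_version (the requirements list 1.0.0)"
else
    echo "python not found on PATH"
fi
```

The check only reports what it finds; it does not install or change anything.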

Getting Started

Take IWSLT14 De-En translation as an example.

Data Preprocessing

cd examples/translation/
bash prepare-iwslt14.sh
cd ../..

TEXT=examples/translation/iwslt14.tokenized.de-en
python preprocess.py --source-lang de --target-lang en \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/iwslt14.tokenized.de-en --joined-dictionary
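
After preprocessing, a quick sanity check on the output directory can catch a failed or partial run before training starts. This is a sketch under assumptions: `check_data_bin` is a hypothetical helper, and the expected file names (`dict.<lang>.txt`, `<split>.<src>-<tgt>.<lang>.bin` and `.idx`) follow fairseq's usual output layout, which this repository may or may not match exactly.

```shell
#!/bin/bash
# Hypothetical helper (not part of IOT): print any expected file missing from a
# binarized data directory and return non-zero if something is missing.
# The file names are an assumption based on fairseq's usual output layout.
check_data_bin() {
    local dir=$1 src=$2 tgt=$3 missing=0 f split lang ext
    for f in "dict.$src.txt" "dict.$tgt.txt"; do
        [ -f "$dir/$f" ] || { echo "missing: $dir/$f"; missing=1; }
    done
    for split in train valid test; do
        for lang in "$src" "$tgt"; do
            for ext in bin idx; do
                f="$split.$src-$tgt.$lang.$ext"
                [ -f "$dir/$f" ] || { echo "missing: $dir/$f"; missing=1; }
            done
        done
    done
    return "$missing"
}

check_data_bin data-bin/iwslt14.tokenized.de-en de en \
    && echo "data-bin looks complete" \
    || echo "data-bin is incomplete; rerun preprocess.py"
```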

Training

The encoder uses the default layer order without reordering (ENCODER_MAX_ORDER=1), since the paper finds that reordering both the encoder and the decoder is not as good as reordering the decoder only.

#!/bin/bash
export CUDA_VISIBLE_DEVICES=${1:-0}
nvidia-smi

ENCODER_MAX_ORDER=1
DECODER_MAX_ORDER=3
DECODER_ORDER="0 3 5"
DIVERSITY=0.1
GS_MAX=20
GS_MIN=2
GS_R=0
GS_UF=5000
KL=0.01
CLAMPVAL=0.05

DECODER_ORDER_NAME=$(echo $DECODER_ORDER | sed 's/ //g')
SAVE_DIR=checkpoints/dec_${DECODER_MAX_ORDER}_order_${DECODER_ORDER_NAME}_div_${DIVERSITY}_gsmax_${GS_MAX}_gsmin_${GS_MIN}_gsr_${GS_R}_gsuf_${GS_UF}_kl_${KL}_clampval_${CLAMPVAL}
mkdir -p ${SAVE_DIR}

python -u train.py data-bin/iwslt14.tokenized.de-en -a transformer_iwslt_de_en \
--optimizer adam --lr 0.0005 -s de -t en --label-smoothing 0.1 --dropout 0.3 --max-tokens 4000 \
--min-lr 1e-09 --lr-scheduler inverse_sqrt --weight-decay 0.0001 --criterion label_smoothed_cross_entropy \
--max-update 100000 --warmup-updates 4000 --warmup-init-lr 1e-07 --adam-betas '(0.9,0.98)' \
--save-dir $SAVE_DIR --share-all-embeddings  --gs-clamp --decoder-orders $DECODER_ORDER  \
--encoder-max-order $ENCODER_MAX_ORDER  --decoder-max-order $DECODER_MAX_ORDER  --diversity $DIVERSITY \
--gumbel-softmax-max $GS_MAX  --gumbel-softmax-min $GS_MIN --gumbel-softmax-tau-r $GS_R  --gumbel-softmax-update-freq $GS_UF \
--kl $KL --clamp-value $CLAMPVAL | tee -a ${SAVE_DIR}/train.log
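
The checkpoint directory name is derived entirely from the hyperparameters, so the evaluation script only finds the trained model if it uses the same values. The sketch below rebuilds the name with the same pattern as the training script; `make_save_dir` is a hypothetical helper introduced for illustration and is not part of the repository.

```shell
#!/bin/bash
# Rebuild the checkpoint directory name using the training script's pattern.
# make_save_dir is a hypothetical helper, not part of IOT.
make_save_dir() {
    local dec_max=$1 dec_order=$2 div=$3 gs_max=$4 gs_min=$5 gs_r=$6 gs_uf=$7 kl=$8 clamp=$9
    local order_name
    order_name=$(echo "$dec_order" | sed 's/ //g')   # "0 3 5" -> "035"
    echo "checkpoints/dec_${dec_max}_order_${order_name}_div_${div}_gsmax_${gs_max}_gsmin_${gs_min}_gsr_${gs_r}_gsuf_${gs_uf}_kl_${kl}_clampval_${clamp}"
}

make_save_dir 3 "0 3 5" 0.1 20 2 0 5000 0.01 0.05
# prints checkpoints/dec_3_order_035_div_0.1_gsmax_20_gsmin_2_gsr_0_gsuf_5000_kl_0.01_clampval_0.05
```

Changing any single value (for example DIVERSITY) produces a different directory, which keeps runs with different settings from overwriting each other.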

Evaluation

#!/bin/bash
set -x
set -e

pip install -e . --user
export CUDA_VISIBLE_DEVICES=${1:-0}
nvidia-smi

ENCODER_MAX_ORDER=1
DECODER_MAX_ORDER=3
DECODER_ORDER="0 3 5"
DIVERSITY=0.1
GS_MAX=20
GS_MIN=2
GS_R=0
GS_UF=5000
KL=0.01
CLAMPVAL=0.05

DECODER_ORDER_NAME=$(echo $DECODER_ORDER | sed 's/ //g')
SAVE_DIR=checkpoints/dec_${DECODER_MAX_ORDER}_order_${DECODER_ORDER_NAME}_div_${DIVERSITY}_gsmax_${GS_MAX}_gsmin_${GS_MIN}_gsr_${GS_R}_gsuf_${GS_UF}_kl_${KL}_clampval_${CLAMPVAL}

python generate.py data-bin/iwslt14.tokenized.de-en \
  --path $SAVE_DIR/checkpoint_best.pt \
  --batch-size 128 --beam 5 --remove-bpe --quiet --num-ckts $DECODER_MAX_ORDER 