Improving Non-autoregressive Generation with Mixup Training

MIST

Training MIST

TRAIN_FILE=/your/path/to/train.json
VALID_FILE=/your/path/to/valid.json
OUTPUT_DIR=/your/path/to/save_checkpoints
CACHE_DIR=/your/path/to/transformer_package_cache

# Choose one of the two initial checkpoints:
MODEL_PATH=bert-base-uncased
# MODEL_PATH=models/unilm1.2-base-uncased

# squadqg 30005 steps
# squadqg 50005 steps
# xsum 600005 steps
STEPS=30005

# --wandb enables logging with wandb
python -m torch.distributed.launch --nproc_per_node=4 train.py \
  --train_file "$TRAIN_FILE" \
  --valid_file "$VALID_FILE" \
  --output_dir "$OUTPUT_DIR" \
  --model_type nat --model_name_or_path "$MODEL_PATH" \
  --do_lower_case --max_source_seq_length 464 --max_target_seq_length 48 \
  --per_gpu_train_batch_size 16 --gradient_accumulation_steps 1 \
  --learning_rate 3e-5 --num_warmup_steps 500 --num_training_steps "$STEPS" \
  --cache_dir "$CACHE_DIR" \
  --log_dir "${OUTPUT_DIR}/log" \
  --keep_prob 0.0 \
  --random_prob 0.0 \
  --use_glat \
  --tqdm_miniters 100 \
  --cotrain_put_target_in_source \
  --cotrain_put_target_in_source_same_bert \
  --wandb \
  --fp16 \
  --fp16_opt_level O2

To reproduce the results without MIST, remove the --cotrain_put_target_in_source and --cotrain_put_target_in_source_same_bert flags.
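
One way to switch between the two configurations without editing the command by hand is to generate the two MIST flags from a single setting. The following is a minimal sketch, not part of the repository: the `mist_flags` helper and the `USE_MIST` and `EXTRA_FLAGS` variables are hypothetical names introduced here for illustration. Only the two flag names come from the training command above.

```shell
# Hypothetical helper (not part of the repository): prints the two MIST flags
# when its first argument is 1, and prints nothing otherwise.
mist_flags() {
  if [ "$1" = "1" ]; then
    printf '%s %s' --cotrain_put_target_in_source --cotrain_put_target_in_source_same_bert
  fi
}

USE_MIST="${USE_MIST:-1}"   # illustrative variable: 1 = with MIST, 0 = without
EXTRA_FLAGS=$(mist_flags "$USE_MIST")
echo "extra flags: ${EXTRA_FLAGS:-<none>}"
```

With this in place, the two flags in the training command could be replaced by an unquoted `$EXTRA_FLAGS`, so that the shell splits it into separate arguments. Running with `USE_MIST=0` would then give the baseline without MIST.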

Download UniLM

mkdir -p models/unilm1.2-base-uncased
cd models/unilm1.2-base-uncased
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased.bin -O pytorch_model.bin
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-vocab.txt -O vocab.txt
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-config.json -O config.json
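
Before pointing MODEL_PATH at this directory, it may be worth confirming that all three downloads completed. The check below is a sketch only: `check_model_dir` is a hypothetical helper, not part of the repository, and it assumes it is run from the directory that contains `models/` (that is, after changing back out of `models/unilm1.2-base-uncased`).

```shell
# Hypothetical check (not part of the repository): returns 0 only if the
# directory passed as $1 contains non-empty copies of the three files
# fetched by the wget commands above.
check_model_dir() {
  for f in pytorch_model.bin vocab.txt config.json; do
    if [ ! -s "$1/$f" ]; then
      echo "missing or empty: $1/$f" >&2
      return 1
    fi
  done
  echo "ok: $1"
}

check_model_dir models/unilm1.2-base-uncased || echo "model directory is incomplete"
```

The `-s` test also catches zero-byte files, which an interrupted download can leave behind.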

Download datasets

JSON dataset links: squadqg, xsum, and quora

Training NAT MASS

To reproduce the results of NAT MASS, see ./MASS-NAT/mass-nat.sh.
