ParaGen

ParaGen is a PyTorch deep learning framework for parallel sequence generation. Beyond sequence generation, ParaGen also supports a range of NLP tasks, including sequence-level classification, extraction, and generation.

Requirements and Installation

Install the third-party dependencies:

apt-get install libopenmpi-dev libssl-dev openssh-server

To install ParaGen from source:

cd ParaGen
pip install -e .

For distributed training, make sure Horovod is installed.

# Installing horovod requires CMake (https://cmake.org/install/)
pip install horovod

Install LightSeq for faster training:

pip install lightseq
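After the steps above, a quick check can confirm which packages Python can actually find. The snippet below is a minimal sketch, not part of ParaGen; the import names `torch`, `horovod`, `lightseq`, and `paragen` are assumptions based on the package names, and `check_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages installed above.
MODULES = ["torch", "horovod", "lightseq", "paragen"]


def check_modules(names):
    """Return {name: bool} telling whether each top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in check_modules(MODULES).items():
    # Only reports whether the module is discoverable; it does not import it.
    print(f"{name}: {'found' if found else 'missing'}")
```

A missing `horovod` or `lightseq` only matters if you plan to use distributed training or LightSeq acceleration.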

Getting Started

Before using ParaGen, it helps to understand how it works.

ParaGen is designed as a task-oriented framework, in which the task is the core of the code base. Each task selects the components it needs, such as model architectures, training strategies, datasets, and data processing. Any component within ParaGen can be customized, while the existing modules and methods serve as a plug-in library.

Because tasks are the core of ParaGen, the same task can run in several modes, such as train, evaluate, preprocess, and serve. A task behaves differently in each mode by reorganizing its components, without any code modification.
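The task-oriented idea can be illustrated with a small, self-contained sketch. This is not ParaGen's actual API: `REGISTRY`, `register`, `MODE_COMPONENTS`, and this `Task` class are hypothetical names, and the mapping from modes to components is an assumption made for illustration. It only shows how one task configuration can assemble different components depending on the mode.

```python
from typing import Callable, Dict, List

# Hypothetical component registry; names are illustrative, not ParaGen's API.
REGISTRY: Dict[str, Callable[[], str]] = {}


def register(name: str):
    """Decorator that stores a component factory in the registry under `name`."""
    def decorator(factory):
        REGISTRY[name] = factory
        return factory
    return decorator


@register("toy_dataset")
def build_dataset() -> str:
    return "dataset<toy>"


@register("toy_model")
def build_model() -> str:
    return "model<toy>"


@register("toy_trainer")
def build_trainer() -> str:
    return "trainer<toy>"


# Which kinds of component each mode needs (assumed for this sketch).
MODE_COMPONENTS: Dict[str, List[str]] = {
    "train": ["dataset", "model", "trainer"],
    "evaluate": ["dataset", "model"],
    "preprocess": ["dataset"],
    "serve": ["model"],
}


class Task:
    """A task names its components once; the mode decides which ones get built."""

    def __init__(self, config: Dict[str, str]):
        self.config = config

    def build(self, mode: str) -> Dict[str, str]:
        return {kind: REGISTRY[self.config[kind]]() for kind in MODE_COMPONENTS[mode]}


task = Task({"dataset": "toy_dataset", "model": "toy_model", "trainer": "toy_trainer"})
for mode in MODE_COMPONENTS:
    print(mode, sorted(task.build(mode)))
```

Swapping a component means registering a new factory and changing one entry in the task configuration; the mode logic stays untouched.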

Please refer to examples for detailed instructions.

ParaGen Usage and Contribution

We welcome any experimental algorithms on ParaGen.

Install ParaGen;
Create your own paragen-plugin libraries under third_party;
Experiment with your own algorithms;
Write a reproducible shell script;
Create a merge request and assign any of us as reviewers.

Owner

Bytedance Inc.
