K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP 2021)

Last update: Nov 16, 2022

Overview

Introduction

K-PLUG is a knowledge-injected pre-trained language model for natural language understanding and generation in e-commerce.

Installation

PyTorch version >= 1.5.0
Python version >= 3.6

git clone https://github.com/pytorch/fairseq.git
cd fairseq 
pip install --editable ./
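Before installing, it can help to confirm that the environment meets the two version requirements listed above. The following is a minimal sketch, not part of the repository: the helper name version_ge and the PYTHON variable are hypothetical, and the comparison assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# version_ge A B: succeeds when dotted version A is >= B.
# Assumes `sort -V` (version sort) is available, e.g. GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical override for the interpreter name; defaults to python3.
PYTHON=${PYTHON:-python3}

if command -v "$PYTHON" >/dev/null 2>&1; then
  py_ver=$("$PYTHON" -c 'import platform; print(platform.python_version())')
  if version_ge "$py_ver" 3.6; then
    echo "Python $py_ver: OK"
  else
    echo "Python $py_ver: too old (need >= 3.6)"
  fi

  # Drop any local build suffix such as "+cu117" before comparing.
  torch_ver=$("$PYTHON" -c 'import torch; print(torch.__version__.split("+")[0])' 2>/dev/null || true)
  if [ -z "$torch_ver" ]; then
    echo "PyTorch: not installed"
  elif version_ge "$torch_ver" 1.5.0; then
    echo "PyTorch $torch_ver: OK"
  else
    echo "PyTorch $torch_ver: too old (need >= 1.5.0)"
  fi
else
  echo "$PYTHON not found on PATH"
fi
```

The check only reports what it finds; it does not install or upgrade anything.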

Pre-training

Prepare the data for pre-training, then run train.sh:

export CUDA_VISIBLE_DEVICES=0,1,2,3

function join_by { local IFS="$1"; shift; echo "$*"; }
DATA_DIR=$(join_by : data/kplug/bin/part*)

USER_DIR=src
TOKENS_PER_SAMPLE=512
WARMUP_UPDATES=10000
PEAK_LR=0.0005
TOTAL_UPDATES=125000
#MAX_SENTENCES=8
MAX_SENTENCES=16
UPDATE_FREQ=16   # batch_size=update_freq*max_sentences*nGPU = 16*16*4 = 1024

SUB_TASK=mlm_clm_sentcls_segcls_titlegen 
## ablation task
#SUB_TASK=clm_sentcls_segcls_titlegen
#SUB_TASK=mlm_sentcls_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_segcls
#SUB_TASK=mlm_clm_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_titlegen

fairseq-train $DATA_DIR \
    --user-dir $USER_DIR \
    --task multitask_lm \
    --sub-task $SUB_TASK \
    --arch transformer_pretrain_base \
    --min-loss-scale=0.000001 \
    --sample-break-mode none \
    --tokens-per-sample $TOKENS_PER_SAMPLE \
    --criterion multitask_lm \
    --apply-bert-init \
    --max-source-positions 512 --max-target-positions 512 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $PEAK_LR \
    --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_UPDATES \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ \
    --ddp-backend=no_c10d \
    --tensorboard-logdir tensorboard \
    --classification-head-name pretrain_head --num-classes 40 \
    --tagging-head-name pretrain_tag_head --tag-num-classes 2 \
    --fp16
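Two details of the script above are easy to misread: how DATA_DIR is assembled and how the effective batch size of 1024 arises. The sketch below reproduces both in isolation. The shard names part0 to part2 are hypothetical stand-ins for whatever data/kplug/bin/part* matches, and the GPU count is derived from CUDA_VISIBLE_DEVICES on the assumption that every listed device is used. fairseq generally treats a colon-separated data argument as a list of shards; whether the custom multitask_lm task in src handles it the same way is an assumption here.

```shell
# Same logic as the script's join_by, written with POSIX-style function syntax:
# join all remaining arguments using the first argument as the separator.
join_by() { local IFS="$1"; shift; echo "$*"; }

# Hypothetical shard directories; the real script globs data/kplug/bin/part*.
DATA_DIR=$(join_by : data/kplug/bin/part0 data/kplug/bin/part1 data/kplug/bin/part2)
echo "$DATA_DIR"
# prints data/kplug/bin/part0:data/kplug/bin/part1:data/kplug/bin/part2

# Count the GPUs listed in CUDA_VISIBLE_DEVICES (one entry per comma-separated id).
CUDA_VISIBLE_DEVICES=0,1,2,3
N_GPU=$(printf '%s\n' "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l | tr -d ' ')

# Effective batch size = gradient accumulation steps * sentences per GPU * GPUs.
MAX_SENTENCES=16
UPDATE_FREQ=16
EFFECTIVE_BATCH=$(( UPDATE_FREQ * MAX_SENTENCES * N_GPU ))
echo "effective batch size: $EFFECTIVE_BATCH"
# prints effective batch size: 1024
```

When training on a different number of GPUs, adjusting UPDATE_FREQ so that the product stays at 1024 keeps the effective batch size the same as in the script.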

Fine-tuning and Inference

Fine-tuning on JDDC (Response Generation)

Fine-tuning on ECD Corpus (Response Retrieval)

Fine-tuning on JD Product Dataset (Abstractive Summarization)

Fine-tuning on MEPAVE Dataset (Sequence Tagging)

Owner

Xu Song
