Build OpenAI's GPT-2 with TensorFlow 2.7.0

Last update: Sep 13, 2022

Overview

TF2_GPT-2

Build OpenAI's official GPT-2 NLP model with TensorFlow 2.7.0.

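Advantages

Uses unsupervised learning
Has a large vocabulary
Supports text continuation (comparable to a "continuation of xx梦")
Supports dialogue, which will later be used in FloatTech's bot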

Usage

Setting

python >= 3.6
numpy==1.16.4
sentencepiece==0.1.83
tensorflow-gpu==2.7.0
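
The pinned versions above can be checked before training. The following is a minimal sketch, not part of the repository: the helper names `check_pins` and `installed_versions` are hypothetical, and `importlib.metadata` needs Python 3.8 or newer, which is stricter than the `python >= 3.6` listed above.

```python
# Pins copied from the Setting list above.
REQUIRED = {
    "numpy": "1.16.4",
    "sentencepiece": "0.1.83",
    "tensorflow-gpu": "2.7.0",
}


def check_pins(installed, required=REQUIRED):
    """Return {package: (wanted, found)} for every pin that is not met.

    `installed` maps package name -> version string (or None if missing).
    """
    return {
        name: (wanted, installed.get(name))
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }


def installed_versions(names):
    """Look up installed versions; a missing package maps to None."""
    from importlib import metadata  # standard library, Python 3.8+

    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    mismatches = check_pins(installed_versions(REQUIRED))
    for name, (wanted, got) in mismatches.items():
        print(f"{name}: wanted {wanted}, found {got}")
```

An empty result from `check_pins` means every pin matches exactly.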

Steps

1. $ git clone https://github.com/Xhs753/TF2_GPT-2
2. $ cd TF2_GPT-2
3. $ pip install -r requirments.txt

You can pre-train the model on the sample data provided by sample.py in this repository.

Train the model on the data available in the repository

$ python pre_process.py --help

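Options:
  --data-dir TEXT        Path to the training data  [default: /data/scraped]
  --vocab-size INTEGER   Vocabulary size (byte-pair vocabulary size)  [default: 24512]
  --min-seq-len INTEGER  Minimum sequence length  [default: 15]
  --max-seq-len INTEGER  Maximum sequence length  [default: 512]
  --help                 Show this message and exit

$ python pre_process.py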
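
To illustrate what the two sequence-length options mean, here is a minimal sketch. It assumes that sequences shorter than `--min-seq-len` are dropped and longer ones are truncated to `--max-seq-len`; the actual behavior of pre_process.py may differ, and `filter_sequences` is a hypothetical helper, not a function from the repository.

```python
def filter_sequences(token_seqs, min_seq_len=15, max_seq_len=512):
    """Drop sequences that are too short and truncate ones that are too long.

    token_seqs: iterable of token-id lists (for example, SentencePiece output).
    Defaults mirror the pre_process.py defaults listed above.
    """
    kept = []
    for seq in token_seqs:
        if len(seq) < min_seq_len:
            continue  # too short to be a useful training example
        kept.append(list(seq[:max_seq_len]))  # cap at the maximum length
    return kept


if __name__ == "__main__":
    # Dummy token ids stand in for real tokenizer output.
    examples = [[1] * 10, [2] * 15, [3] * 600]
    print([len(s) for s in filter_sequences(examples)])
```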

Train on any data

$ python pre_process.py --data-dir=data_directory --vocab-size=32000

The command-line options for the model are defined in the source as follows:

@click.command()
@click.option('--num-layers', type=int, default=8, show_default=True, help="No. of decoder layers")
@click.option('--embedding-size', type=int, default=768, show_default=True, help="Embedding size")
@click.option('--num-heads', type=int, default=8, show_default=True, help="Number of heads")
@click.option('--dff', type=int, default=3072, show_default=True, help="Filter Size")
@click.option('--max-seq-len', type=int, default=515, show_default=True, help="Seq length")
@click.option('--vocab-size', type=int, default=24512, show_default=True, help="Vocab size")
@click.option('--optimizer', type=str, default="adam", show_default=True, help="optimizer type")
@click.option('--batch-size', type=int, default=8, show_default=True, help="batch size")
@click.option('--learning-rate', type=float, default=0.001, show_default=True, help="learning rate")
@click.option('--graph-mode', type=bool, default=False, show_default=False, help="TF run mode")
@click.option('--distributed', type=bool, default=False, show_default=False, help="distributed training")
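
To get a feel for how these options determine model size, the sketch below estimates the parameter count of a GPT-2-style decoder. It is not taken from train_gpt2.py. It assumes learned position embeddings, biases on every dense layer, two layer norms per block plus a final one, and an output projection tied to the token embedding; `estimate_params` is a hypothetical helper name.

```python
def estimate_params(num_layers=8, embedding_size=768, num_heads=8,
                    dff=3072, max_seq_len=515, vocab_size=24512):
    """Rough parameter count for a GPT-2-style decoder (see assumptions above)."""
    if embedding_size % num_heads != 0:
        raise ValueError("embedding_size must be divisible by num_heads")
    d = embedding_size

    token_emb = vocab_size * d      # shared with the output projection
    pos_emb = max_seq_len * d       # learned position embeddings

    attention = 4 * d * d + 4 * d   # Q, K, V and output projections with biases
    feed_forward = 2 * d * dff + dff + d
    layer_norms = 2 * (2 * d)       # scale and shift for two norms per block
    per_layer = attention + feed_forward + layer_norms

    final_norm = 2 * d
    total = token_emb + pos_emb + num_layers * per_layer + final_norm
    return {"per_layer": per_layer, "total": total,
            "head_dim": d // num_heads}


if __name__ == "__main__":
    # Uses the defaults from the option list above.
    print(estimate_params())
```

Under these assumptions, the number of heads changes only the per-head dimension and not the parameter count, while `--num-layers`, `--embedding-size` and `--dff` dominate the total.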

Using GPT-2

$ python train_gpt2.py \
  --num-layers=8 \
  --num-heads=8 \
  --dff=3072 \
  --embedding-size=768 \
  --batch-size=32 \
  --learning-rate=5e-5 \
  --graph-mode=True

Model architecture

Link

OpenAI GPT-2

Thanks To My Friends

LICENCE

Releases(GPT-2)

GPT-2(Mar 16, 2022)

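Added a beta version of BNNGPT, a Bayesian neural network GPT-2 model

Pre-trained models for CNGPT

| Download pre-trained model |
|------------------|
| DOWNLOAD 370MB -- the datas/train.txt dataset from this repository |
| DOWNLOAD 2.43GB -- Tang.txt |

Pre-trained ONNX model for iGPT

| ONNX model (iGPT) |
|-----------------|
| DOWNLOAD 43.93MB |

Download the machine learning packages used by this project (for offline installation)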

DOWNLOAD
Source code(tar.gz)
Source code(zip)
iGPT.onnx(43.90 MB)
Tang.txt(8.38 MB)

Owner

Watermelon
