Converting CPT to bert form for use

Last update: Oct 14, 2021

Related tags

Overview

cpt-encoder

将CPT转成bert形式使用

说明

刚刚刷到又出了一种模型：CPT，看论文显示，在很多中文任务上性能比mac bert还好，就迫不及待想把它用起来。

根据对源码的研究，发现该模型在做nlu建模时主要用的encoder部分，也就是bert，因此我将这部分权重转为bert权重类型，方便做nlu任务。

当然，想要发挥CPT的性能，还是得用官方代码用生成方式来使用，如prompt。

性能还未测试，第一个epoch看起来和roberta差不多。

加载方式

使用huggingface的transformers就可以加载，和BERT一样的方式。

转换代码

见 convert_cpt_to_bert.py

转好的权重地址

cpt-encoder-base: https://pan.baidu.com/s/1PqUAWNczX9vVcFtRHcE5cg 提取码：2fo2

cpt-encoder-large: https://pan.baidu.com/s/1KwumkF1NRL6wX7aifnq4xA 提取码：ke7o

官方地址

论文：CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

github：CPT

Reference

Converting CPT to bert form for use

Related tags

Overview

cpt-encoder

将CPT转成bert形式使用

说明

加载方式

转换代码

转好的权重地址

官方地址

Owner

黄辉

Geometric Vector Perceptron --- a rotation-equivariant GNN for learning from biomolecular structure

Using VapourSynth with super resolution models and speeding them up with TensorRT.

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on table detection and table structure recognition.

mmfewshot is an open source few shot learning toolbox based on PyTorch

ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

Code and project page for ICCV 2021 paper "DisUnknown: Distilling Unknown Factors for Disentanglement Learning"

Neural implicit reconstruction experiments for the Vector Neuron paper

PyTorch implementation of "VRT: A Video Restoration Transformer"

Si Adek Keras is software VR dangerous object detection.

Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021

Change Detection in SAR Images Based on Multiscale Capsule Network

retweet 4 satoshi ⚡️

Texture mapping with variational auto-encoders

SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images

the code used for the preprint Embedding-based Instance Segmentation of Microscopy Images.

Facilitates implementing deep neural-network backbones, data augmentations

This repo contains the code required to train the multivariate time-series Transformer.

PyTorch implementation of our paper How robust are discriminatively trained zero-shot learning models?

FedJAX is a library for developing custom Federated Learning (FL) algorithms in JAX.