Wrangl
Ray-based parallel data preprocessing for NLP and ML.
pip install wrangl
# for latest
pip install git+https://github.com/vzhong/wrangl
See examples for usage.
Ray-based parallel data preprocessing for NLP and ML.
pip install wrangl
# for latest
pip install git+https://github.com/vzhong/wrangl
See examples for usage.
🤗 Transformers Wav2Vec2 + Parlance's CTCDecode Introduction This repo shows how 🤗 Transformers can be used in combination with Parlance's ctcdecode
NLP-Pytorch-Assignment An assignment from my grad-level data mining course (before I started personal projects) demonstrating some experience with NLP
Image captioning End-to-end image captioning with EfficientNet-b3 + LSTM with Attention Model is seq2seq model. In the encoder pretrained EfficientNet
Paddlespeech-Streaming-ASR-GUI Introduction A paddlespeech Streaming ASR GUI. Us
The implementation of paper CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval. CLIP4Clip is a video-text retrieval model based
BiLSTM-CRF for Name Entity Recognition PyTorch version A PyTorch implemention of Bi-LSTM-CRF model for Chinese Named Entity Recognition. 使用 PyTorch 实现
RewriteNAT This repo provides the code for reproducing our proposed RewriteNAT in EMNLP 2021 paper entitled "Learning to Rewrite for Non-Autoregressiv
Phone Level Mixture Density Network for TTS This repo contains pytorch implementation of paper Rich Prosody Diversity Modelling with Phone-level Mixtu
Tensor2Tensor Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and ac
Coreferee Author: Richard Paul Hudson, msg systems ag 1. Introduction 1.1 The basic idea 1.2 Getting started 1.2.1 English 1.2.2 German 1.2.3 Polish 1
trstop Turkish Stop Words Türkçe Dolgu Sözcükleri In this repository I put Turkish stop words that is contained in the first 10 thousand words with th
项目的背景 SinglepassTextCluster, an TextCluster tool based on Singlepass cluster algorithm that use tfidf vector and doc2vec,which can be used for individ
AAGCN-ACSA EMNLP 2021 Introduction This repository was used in our paper: Beta Distribution Guided Aspect-aware Graph for Aspect Category Sentiment An
Medical natural language parsing and utility library A natural language medical
Translate U is capable of translating the text present in an image from one language to the other. The app uses OCR and Google translate to identify and translate across 80+ languages.
This repository provides a script and recipe to run the highly optimized transformer-based encoder and decoder component, and it is tested and maintained by NVIDIA.
Description Minimal graphical application for accessing the Watson Text to Speech service. Requirements Python 3 plus all dependencies listed in requi
NLP Classifier Introduction This project trains a bert model on any NLP classifcation model. And uses the model in make predictions on new data using
A PyTorch implementation of unsupervised SimCSE SimCSE: Simple Contrastive Learning of Sentence Embeddings 1. 用法 无监督训练 python train_unsup.py ./data/ne
UToken - Tokens seguros. UToken (ou Unhandleable Token) é uma bilioteca criada para ser utilizada na geração de tokens seguros e íntegros, ou seja, nã