nlpcommon

nlpcommon, a Python text tool. Developed in Python 3.

Guide

Feature
Install
Usage
Dataset
Contact
Cite
Reference

Feature

nlpcommon is an open-source Python toolkit for text classification. Its goal is to implement text analysis algorithms that are ready for use in production environments.

nlpcommon offers clear algorithms, high performance, and customizable corpora.

Functions:

Classifier

Cluster

MiniBatchKmeans

While providing rich functionality, nlpcommon keeps its internal modules loosely coupled, loads models lazily, publishes its dictionaries openly, and aims to be easy to use.
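The lazy ("inert") loading mentioned above can be illustrated with a minimal sketch. This is not nlpcommon's actual implementation; `LazyResource` and `load_dictionary` are hypothetical names used only to show the idea of deferring an expensive load until first use.

```python
class LazyResource:
    """Defers an expensive load until the value is first requested."""

    def __init__(self, loader):
        self._loader = loader   # callable that performs the real load
        self._value = None
        self._loaded = False

    def get(self):
        # Run the loader only once, on first access, then reuse the result.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


def load_dictionary():
    # Hypothetical stand-in for reading a large dictionary file from disk.
    return {'especially', 'the', 'of'}


# Creating the wrapper is cheap; nothing is loaded yet.
dictionary = LazyResource(load_dictionary)

if __name__ == '__main__':
    # The first call triggers the load; later calls reuse the cached value.
    print(len(dictionary.get()))
```

With this pattern, importing a module stays fast, and the cost of loading is paid only by code paths that actually need the resource.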

Install

Requirements and Installation

pip3 install nlpcommon

git clone https://github.com/shibing624/nlpcommon.git
cd nlpcommon
python3 setup.py install

Usage

data

Stopwords

examples/base_demo.py:

import sys

sys.path.append('..')
from nlpcommon import stopwords

if __name__ == '__main__':
    print(len(stopwords), stopwords)

output:

2438 {'．', '大家', '孰知', '至于', './', '知道', '二话没说', '一何', '从宽', 'especially' ... }
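Since the output above shows `stopwords` to be a set, a typical use is filtering tokens before classification or clustering. The following is a sketch under that assumption; `remove_stopwords` is a hypothetical helper, not part of nlpcommon, and the fallback set exists only so the example runs without the package installed.

```python
try:
    from nlpcommon import stopwords  # the set printed by the demo above
except ImportError:
    # Illustration-only fallback: a tiny hypothetical subset of the real set.
    stopwords = {'大家', '至于', '知道', 'especially'}


def remove_stopwords(tokens, stopword_set=stopwords):
    """Return the tokens that are not in the stopword set, preserving order."""
    return [t for t in tokens if t not in stopword_set]


if __name__ == '__main__':
    # Tokens are assumed to be pre-segmented; word segmentation is out of scope here.
    tokens = ['大家', '喜欢', '自然语言处理', 'especially', 'Python']
    print(remove_stopwords(tokens))
```

Set membership tests are constant time on average, so filtering remains cheap even with a stopword list of a few thousand entries.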

Contact

Issues (suggestions):
Email: xuming: [email protected]
WeChat: add my WeChat ID xuming624 to join the Python-NLP discussion group, with the note: name-company-NLP

Cite

If you use nlpcommon in your research, please cite it in the following format:

@software{nlpcommon,
  author = {Xu Ming},
  title = {nlpcommon: A Tool for Text NLP},
  year = {2021},
  url = {https://github.com/shibing624/nlpcommon},
}

License

nlpcommon is licensed under The Apache License 2.0 and is free for commercial use. Please include a link to nlpcommon and its license in your product description.

Contribute

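The project code is still rough. If you have improved it, you are welcome to submit your changes back to this project. Before submitting, please note the following two points:

Add corresponding unit tests under tests
Run all unit tests with python setup.py test and make sure every test passes

After that, you can submit a PR.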

Reference

pytextclassifier

