DocuMiner
A production-ready pipeline for text mining and subject indexing
Want to Contribute?
More code and documentation coming soon.
Authors
Open Source Club
A production-ready pipeline for text mining and subject indexing
More code and documentation coming soon.
Open Source Club
PyMultiDictionary PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different
afasi Fuzz a language by mixing up only few words. Status Beta. Note: The default branch is default. Use Examples Version General Help Translate Help
Handwriter Make writing easier! How to Download and install a handwriting font, or create a font from your handwriting. Use a word processor like Micr
Microsoft's Cascadia Code font customized to my liking. Also includes some simple batch patch and bake scripts to batch patch glyphs and bake font features into fonts!
hashids for Python 2.7 & 3 A python port of the JavaScript hashids implementation. It generates YouTube-like hashes from one or many numbers. Use hash
kamusi JSON and CSV data for swahili dictionary with over 16600+ words. This repo consists of data from swahili dictionary with about 16683 words toge
ParseAnyText A small package to parse strings. What is the work of it? Well It's a module to creates parser that helps to parse a text easily with les
Skype export archive to text converter This software utility extracts chat logs
utoken utoken is a multilingual tokenizer that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresse
awesome-slugify Python flexible slugify function PyPi: https://pypi.python.org/pypi/awesome-slugify Github: https://github.com/dimka665/awesome-slugif
PyNews 📰 Simple newsletter made with python Install dependencies This project has some dependencies (see requirements.txt) that are not included in t
MnSRank Maiden & Spell community player ranking based on tournament data. Why? 2021 just ended and this seemed like a cool idea. Elo doesn't work well
Description shortuuid is a simple python library that generates concise, unambiguous, URL-safe UUIDs. Often, one needs to use non-sequential IDs in pl
Etranslate is a free and unlimited python library for transiting your texts
This is a text adventure which I made in python 3. This is one of my first big projects so any feedback would be greatly appreciated.
一款高性能非法词(敏感词)检测组件,附带繁体简体互换,支持全角半角互换,获取拼音首字母,获取拼音字母,拼音模糊搜索等功能。
Q & A I am often asked questions about how to solve this or that problem, and I decided to post these questions and solutions here, in case it is also
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.
Python Slugify A Python slugify application that handles unicode. Overview Best attempt to create slugs from unicode strings while keeping it DRY. Not
price-parser is a small library for extracting price and currency from raw text strings.