DocuMiner
A production-ready pipeline for text mining and subject indexing
Want to Contribute?
More code and documentation coming soon.
Authors
Open Source Club
telegram_bot_hashtags The bot creates hashtags for user's texts in Russian and English. It is a simple bot for creating hashtags. NOTE file config.py
PyNews 📰 Simple newsletter made with python Install dependencies This project has some dependencies (see requirements.txt) that are not included in t
The project investigates methods for extracting human-marked data from document forms such as surveys and tests. It can read questions and multiple-choice exam papers and grade them.
hashids for Python 2.7 & 3 A python port of the JavaScript hashids implementation. It generates YouTube-like hashes from one or many numbers. Use hash
Vastasanuli Vastasanuli plays the SANULI game. It does not always win. Usage You need Python (3.6+). Run make (or download words.txt from elsewhere) Install the requir
An anthology of a variety of tools for the Persian language in Python
E-Book Converter Bot A bot that converts e-books to various formats, powered by calibre! It currently supports 34 input formats and 19 output formats.
Clickbait Detector An extension to detect whether an article's content matches its title. This was developed over 24 hours in a hackathon called 'H
Memorize-New-Words In this very small project, I've written code to help memorize new English words. You can add the words and their m
An online markdown resume template project, based on pywebio
Split Word file by chapter We use the Microsoft Word API to code this tool API URL: https://docs.microsoft.com/zh-cn/dotnet/api/ if this tool is good f
All of the documentation and the majority of the work done was by Christopher Jones ([emai
Covid-19-Formatter (Only for Germany and Austria) This script saves the data reported by the RKI / BMSGPK and formats it into an Asci Ta
Open-source linguistic ethnography tool for framing public opinion in mediatized groups. Table of Contents Installing Quickstart Links Installing Pyth
TextStatistics This program takes a text file which contains English text. The program analyses the text and prints some information. For this program I
ChirpText is a collection of text processing tools for Python 3. It is not meant to be a powerful tank like the popular NLTK but a small package which
Find frequency of letters appearing in 5-letter words in the English language In
You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.
Unicode Slugify Unicode Slugify is a slugifier that generates unicode slugs. It was originally used in the Firefox Add-ons web site to generate slugs
Text2ASCII Description This Python script (converter.py) contains two functions: encode() is used to return a list of integers, one item per character