🌐 Translation microservice powered by AI

Last update: Nov 22, 2022

Overview

Dot Translate

🌐 A microservice for quick and local translation using A.I.

This service starts a local webserver used for neural machine translation.

🚀 Features

	Dot Translate
🔒	No tracking or telemetry data is collected from you
🆓	Always free
⚡️	Fast on low-compute machines
📝	Accurate and keeps your prompt meaningful
💻	Open-source and open for contributions

For inference, all models run on the CPU. Every model used in this service is 8-bit quantized, which reduces latency and storage costs.
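To illustrate why 8-bit quantization reduces storage, here is a minimal sketch of symmetric per-tensor int8 quantization using only the Python standard library. It is a generic textbook scheme for illustration; the exact scheme used by the models in this service may differ, and the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero when every weight is 0.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = array("b", (round(w / scale) for w in weights))
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Storage comparison: 32-bit floats versus 8-bit integer codes.
float32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = codes.itemsize * len(codes)
print(f"float32: {float32_bytes} bytes, int8: {int8_bytes} bytes")
```

Each weight shrinks from four bytes to one, at the cost of a rounding error of at most half the scale per weight.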

🔧 Contributing

We accept all positive contributions that affect this repository and the service as a whole, including trained .argosmodel files submitted via pull request.

Language	Source -> Target	Target -> Source
🇳🇱	nl -> en	en -> nl

❤️ Acknowledgements

Argos Translate, which is built on OpenNMT, is widely used in this repository for translation.

📜 Licenses

Dot Translate is licensed under the MIT license.

Comments

Great Project!

Hi!

This looks like an awesome project! LibreTranslate is great but I'm partial towards the minimalist style.

Feel free to pass relevant support requests my way, I'm normally pretty responsive on GitHub and the LibreTranslate Forum.

Best,

P.J.

opened by argosopentech 1

Todo: fallback when out-of-memory or kill

When someone toys around with the API too much, it can cause the server to run out of memory, killing the Flask app. As a temporary fix, I am going to decrease the batch size.

Future reference code: https://gist.github.com/kevinxhan/6c0bbc68f2ea6b2f4a620e5413c98fb8
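One possible shape for such a fallback is sketched below: translate in batches and halve the batch size whenever a batch raises `MemoryError`. This is an illustration only, not the project's code. `translate_batch` is a hypothetical callable standing in for whatever function performs the translation, and a process killed by the operating system's out-of-memory handler never raises `MemoryError`, so this covers only the case where Python itself reports the failure.

```python
def translate_with_fallback(sentences, translate_batch, batch_size=8):
    """Translate sentences in batches, halving the batch size on MemoryError.

    translate_batch is a hypothetical callable: it takes a list of strings
    and returns a list of translated strings of the same length.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    results = []
    start = 0
    while start < len(sentences):
        batch = sentences[start:start + batch_size]
        try:
            results.extend(translate_batch(batch))
        except MemoryError:
            if batch_size == 1:
                # A single sentence is already too large; give up.
                raise
            # Retry the same position with a smaller batch.
            batch_size = max(1, batch_size // 2)
            continue
        start += len(batch)
    return results
```

Starting with a small `batch_size` corresponds to the temporary fix described above; the retry loop is the fallback for requests that still do not fit.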

opened by johnpaulbin 1

Releases(v2.0)

v2.0(Dec 4, 2021)
Dot Translate 2.0 Release

Bug fixes and overall improvement:

No longer using databases (keeping it open)

Returning JSON instead of plain text (for unicode errors)

More polished

Start up Dot Translate in just 4 simple steps:

git clone https://github.com/dothq/translate.git
cd translate/
sudo docker build -t translate .
sudo docker run -d -p 3000:3000 translate

v1.0(Nov 27, 2021)

Dot Translate 1.0 Release

Start up Dot Translate with just 4 simple commands:

git clone https://github.com/dothq/translate.git
cd translate/
sudo docker build -t translate .
sudo docker run -d -p 3000:3000 translate
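Once the container is running, a quick way to confirm that something is listening on the published port is a plain TCP connection check. The sketch below assumes the service is reachable on the local machine at port 3000, as mapped by `-p 3000:3000`. It makes no assumption about the service's HTTP routes, and `is_port_open` is a hypothetical helper name.

```python
import socket


def is_port_open(host="127.0.0.1", port=3000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `is_port_open()` shortly after `docker run` may return False while the container is still starting, so retrying after a short delay is reasonable.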


November-2021(Nov 25, 2021)

This release contains .argosmodel files officially released in November 2021.

This release contains: en-cy cy-en en-nl nl-en
cy_en.argosmodel(60.16 MB)
en_cy.argosmodel(59.51 MB)
en_nl.argosmodel(60.22 MB)
nl_en.argosmodel(61.37 MB)

Owner

Dot HQ

🚀 Makers of the privacy-focused web browser, Dot.
