Develop open-source Python Arabic NLP libraries that the Arab world will easily use in all Natural Language Processing applications

Last update: Oct 22, 2022

Related tags

Text Data & NLP Yarub_library

Overview

Yarub_library

# The Problem

Arabic is one of the most widespread and widely used languages. The "language of the Ḍād" is distinguished by the richness of its stock of words and forms, and it is distinctive phonetically, since it includes all the sounds found in the other Semitic languages. It is also flexible: it accommodates all derived and synonymous words and offers a fitting expression for every occasion.

We recognize the importance of the Arabic language and its standing among the peoples of the Middle East and the world, and we seek to include Arabic among the languages that are easy to use in artificial intelligence and natural language processing applications.

In this Omdena project, our goal was to develop open-source Python Arabic NLP libraries that the Arab world can easily use in Natural Language Processing applications such as morphological analysis, named entity recognition, sentiment analysis, word embedding, dialect identification, part-of-speech tagging, and so on. This article contains code that could be useful whatever your level of experience. For beginners, it is a good starting point for data collection using web scraping, with links to the official documentation pages for every library mentioned.
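As a rough illustration of the data-collection step mentioned above, the sketch below pulls Arabic words out of an HTML page using only the Python standard library. It is not the Yarub_library API: the names `TextCollector`, `extract_arabic_tokens`, and `sample_html` are hypothetical, the HTML is an inline string rather than a fetched page, and the character range used to detect Arabic letters is a simplifying assumption.

```python
import re
from html.parser import HTMLParser

# Assumption: runs of characters in U+0621..U+064A (the core Arabic letter
# range) are treated as Arabic words; diacritics and digits are ignored.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


class TextCollector(HTMLParser):
    """Hypothetical helper: collects visible text, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_arabic_tokens(html):
    """Return the Arabic words found in the visible text of an HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return ARABIC_WORD.findall(" ".join(parser.chunks))


# Inline sample standing in for a downloaded page.
sample_html = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<p>اللغة العربية</p>"
    '<script>var x = "نص";</script>'
    "<p>Hello معالجة</p>"
    "</body></html>"
)
tokens = extract_arabic_tokens(sample_html)
print(tokens)
```

In a real pipeline the HTML would come from an HTTP request, and the tokens would usually be normalized before being used as training data.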

Owner

BADER ALABDAN
