auto_code_complete is an automatic word-completion program that you can customize to your needs

Overview

auto_code_complete v1.3

purpose and usage

auto_code_complete is an automatic word-completion program that you can customize to your needs. The model combines two deep-learning NLP (Natural Language Processing) recurrent architectures: GRU (Gated Recurrent Unit) and LSTM (Long Short-Term Memory).

Before v1.2, the model used only the GRU architecture.

how to use (terminal)

  • first, download the repository to your local environment.
  • install the necessary libraries in your environment.

pip install -r requirements.txt

  • change your working directory to auto-complete/ and execute the line below

python -m auto_complete_model

  • it will ask you to enter the data you want to train the model on
ENTER THE CODE YOU WANT TO TRAIN IN YOUR MODEL : tensorflow tf.keras tf.keras.layers LSTM
==== TRAINING START ====
2022-01-08 18:24:14.308919: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
Epoch 1/100
3/3 [==============================] - 1s 59ms/step - loss: 4.7865 - acc: 0.0532
Epoch 2/100
3/3 [==============================] - 0s 62ms/step - loss: 3.9297 - acc: 0.2872
Epoch 3/100
3/3 [==============================] - 0s 58ms/step - loss: 2.9941 - acc: 0.5532
...
Epoch 31/100
3/3 [==============================] - 0s 75ms/step - loss: 0.2747 - acc: 0.8617
Epoch 32/100
3/3 [==============================] - 0s 65ms/step - loss: 0.2700 - acc: 0.8298
==== TRAINING DONE ====
Now, Load the best weights on your model.
  • once your dataset has been entered successfully, it will ask for the incomplete words you want completed.
ENTER THE UNCOMPLETED CODE YOU WANT TO COMPLETE : t tf te l la li k ke tf.kera tf.keras.l
t  - best recommendation : tensorflow
		 - all recommendations :  ['tensorflow']
tf  - best recommendation : tf.keras
		 - all recommendations :  ['tfkeras', 'tf.keras']
te  - best recommendation : tensorflow
		 - all recommendations :  ['tensorflow']
l  - best recommendation : list
		 - all recommendations :  ['list', 'layers']
la  - best recommendation : lange
		 - all recommendations :  ['layers', 'lange']
li  - best recommendation : list
		 - all recommendations :  ['list']
k  - best recommendation : keras
		 - all recommendations :  ['keras']
ke  - best recommendation : keras
		 - all recommendations :  ['keras']
tf.kera  - best recommendation : tf.keras
		 - all recommendations :  []
tf.keras.l  - best recommendation : tf.keras.layers
		 - all recommendations :  ['tf.keras.layers']
  • it will return the best-matching completion for each word, along with the other recommendations
Do you want to check only the recommendations? (y/n) : y
['tensorflow'], 
['tfkeras', 'tf.keras'], 
['tensorflow'], 
['list', 'layers'], 
['layers', 'lange'], 
['list'], 
['keras'], 
['keras'], 
[], 
['tf.keras.layers']

version update & issues

v1.2 update

2022.01.08

  • change the deep-learning model from GRU to GRU+LSTM to improve performance

New LSTM layers with the same structure as the existing layers are added to the model and concatenated before the output layer. The combined model learns faster and predicts matching recommendations for incomplete words more accurately.
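The GRU+LSTM idea can be sketched in Keras as two parallel recurrent branches whose outputs are concatenated before the output layer. This is only an illustrative sketch, not the repository's actual model: the function name `build_gru_lstm_model`, the embedding layer, the layer sizes, and the loss are all assumptions chosen for the example.

```python
import tensorflow as tf


def build_gru_lstm_model(vocab_size, seq_len, embed_dim=32, units=64):
    """Hypothetical GRU+LSTM model: two branches merged before the output layer."""
    inputs = tf.keras.layers.Input(shape=(seq_len,))
    # Shared embedding of the input token ids (an assumption for this sketch).
    x = tf.keras.layers.Embedding(vocab_size, embed_dim)(inputs)

    # Two recurrent branches with the same structure.
    gru_out = tf.keras.layers.GRU(units)(x)
    lstm_out = tf.keras.layers.LSTM(units)(x)

    # Concatenate both branches before the output layer.
    merged = tf.keras.layers.Concatenate()([gru_out, lstm_out])
    outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(merged)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_gru_lstm_model(vocab_size=50, seq_len=10)
```

Calling `model.summary()` on the result lists both recurrent branches and the concatenation layer, which makes it easy to check that the two branches feed the same output layer.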

v1.3.1 update

2022.01.09

  • fix glitches in data preprocessing

We fixed a problem where a new dataset was not added to an existing dataset.

  • add a plot_history function to the model class

v1.3.2 update

2022.01.09

  • add model_save and model_load options so that users can save and load their model while training a customized model
# NOTE: assumes `auto_coding` has already been imported from this repository's model module

# Load text data
tf_filepath = "../data/text_data/tf_all_symbols.txt"
with open(tf_filepath, 'r') as f:
    tf_code_text = f.read()

# Split the data into 10 consecutive parts of equal length
# (any characters left over after 10 equal chunks are dropped)
total_length = len(tf_code_text)
chunk_size = int(total_length * 0.1)
tf_code_ls = []
for i in range(10):
    tf_code_ls.append(tf_code_text[chunk_size * i:chunk_size * (i + 1)])

# Train on each part in turn, passing the arguments
# model_save=True, model_name='tf_model', model_load=True
for tf_code in tf_code_ls:
    my_model = auto_coding(new_code=tf_code,
                           # verbose=0,
                           batch_size=100,
                           epochs=200,
                           patience=12,
                           model_summary=True,
                           model_save=True,
                           model_name='tf_model',  # 'tf_model/tf_model.h5'
                           model_load=True
                           )
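Splitting a long text into parts can also be done with a small helper that keeps every character, including any remainder. The sketch below is one possible approach; the name `split_text` is hypothetical and not part of this repository.

```python
def split_text(text, n_parts):
    """Split text into n_parts contiguous chunks of nearly equal length."""
    if n_parts <= 0:
        raise ValueError("n_parts must be a positive integer")
    chunk, remainder = divmod(len(text), n_parts)
    parts = []
    start = 0
    for i in range(n_parts):
        # The first `remainder` chunks get one extra character each,
        # so no character is dropped.
        end = start + chunk + (1 if i < remainder else 0)
        parts.append(text[start:end])
        start = end
    return parts


parts = split_text("abcdefghijk", 3)
print(parts)  # ['abcd', 'efgh', 'ijk']
```

Joining the returned parts always reproduces the original text, so each training round sees a distinct slice and nothing is lost.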