auto_code_complete is a auto word-completetion program which allows you to customize it on your need

Overview

auto_code_complete v1.3

purpose and usage

auto_code_complete is a auto word-completetion program which allows you to customize it on your needs. the model for this program is a combined model of a deep-learning NLP(Natural Language Process) model structure called 'GRU(gated recurrent unit)' and 'LSTM(Long Short Term Memory)'.

the model for this program is one of the deep-learning NLP(Natural Language Process) model structure called 'GRU(gated recurrent unit)'.

data preprocessing

data-preprocess

model structure

model-structure

how to use (terminal)

auto-code1 auto-code2

  • first, download the repository on your local environment.
  • install the neccessary libraries on your dependent environment.

pip install -r requirements.txt

  • change your working directory to auto-complete/ and execute the line below

python -m auto_complete_model

  • it will require for you to enter the data you want to train with the model
ENTER THE CODE YOU WANT TO TRAIN IN YOUR MODEL : tensorflow tf.keras tf.keras.layers LSTM
==== TRAINING START ====
2022-01-08 18:24:14.308919: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
Epoch 1/100
3/3 [==============================] - 1s 59ms/step - loss: 4.7865 - acc: 0.0532
Epoch 2/100
3/3 [==============================] - 0s 62ms/step - loss: 3.9297 - acc: 0.2872
Epoch 3/100
3/3 [==============================] - 0s 58ms/step - loss: 2.9941 - acc: 0.5532
...
Epoch 31/100
3/3 [==============================] - 0s 75ms/step - loss: 0.2747 - acc: 0.8617
Epoch 32/100
3/3 [==============================] - 0s 65ms/step - loss: 0.2700 - acc: 0.8298
==== TRAINING DONE ====
Now, Load the best weights on your model.
  • if you input your dataset successfully, it will ask for any uncompleted word to be entered.
ENTER THE UNCOMPLETED CODE YOU WANT TO COMPLETE : t tf te l la li k ke tf.kera tf.keras.l
t  - best recommendation : tensorflow
		 - all recommendations :  ['tensorflow']
tf  - best recommendation : tf.keras
		 - all recommendations :  ['tfkeras', 'tf.keras']
te  - best recommendation : tensorflow
		 - all recommendations :  ['tensorflow']
l  - best recommendation : list
		 - all recommendations :  ['list', 'layers']
la  - best recommendation : lange
		 - all recommendations :  ['layers', 'lange']
li  - best recommendation : list
		 - all recommendations :  ['list']
k  - best recommendation : keras
		 - all recommendations :  ['keras']
ke  - best recommendation : keras
		 - all recommendations :  ['keras']
tf.kera  - best recommendation : tf.keras
		 - all recommendations :  []
tf.keras.l  - best recommendation : tf.keras.layers
		 - all recommendations :  ['tf.keras.layers']
  • it will return the best matched word to complete and other recommendations
Do you want to check only the recommendations? (y/n) : y
['tensorflow'], 
['tfkeras', 'tf.keras'], 
['tensorflow'], 
['list', 'layers'], 
['layers', 'lange'], 
['list'], 
['keras'], 
['keras'], 
[], 
['tf.keras.layers']

version update & issues

v1.2 update

2022.01.08

  • change deep-learning model from GRU to GRU+LSTM to improve the performance

By adding the same structrue of new LSTM layers to concatenate before the output layer to an existing model, it shows faster learning and better accuracies in predicting matched recommendations for given incomplete words.

v1.3.1 update

2022.01.09

  • fix the glitches in data preprocessing

We solved the problem that it wouldn't add a new dataset on an existing dataset.

  • add plot_history function in a model class

v1.3.2 update

2022.01.10

  • add model_save,model_load mode in order that users can save and load their model while training a customized model
  • add data_split mode so that the big data can be trained seperately.
samp_model = auto_coding(new_code=samp_text,
                      # verbose=0,
                       batch_size=100,
                       epochs=200,
                       patience=10,
                       model_summary=True,
                       model_save=True,
                       model_name='samp_test', # samp_test/samp_test.h5
                       model_load=True,
                       data_split=True,
                       data_split_num=3 # the number into which users want to split the data
                      )

v1.3.3 update

2022.01.11

  • add new metrics Accuracy for Recommendations to evaluate the model's instant performance when predicting the recommendation list for words.
t  - best match : tf
	 - all recommendations :  ['tensorflow', 'tf']
tup  - best match : tuple
	 - all recommendations :  []
p  - best match : pd
	 - all recommendations :  ['plt', 'pd', 'pandas']
li  - best match : list
	 - all recommendations :  []
d  - best match : dataset
	 - all recommendations :  ['dic', 'dataset']
I  - best match : Import
	 - all recommendations :  []
so  - best match : sort
	 - all recommendations :  ['sort']
m  - best match : matplotlib.pyplot
	 - all recommendations :  []
Accuracy for Best:  0.875
Accuracy for Recommendations :  1.0
Owner
RUO
AI, Data Science, ML, DL
RUO
TFIDF-based QA system for AIO2 competition

AIO2 TF-IDF Baseline This is a very simple question answering system, which is developed as a lightweight baseline for AIO2 competition. In the traini

Masatoshi Suzuki 4 Feb 19, 2022
An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.

Welcome to AdaptNLP A high level framework and library for running, training, and deploying state-of-the-art Natural Language Processing (NLP) models

Novetta 407 Jan 03, 2023
A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approaches for achieving this in this repo.

multitask-learning-transformers A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You

Shahrukh Khan 48 Jan 02, 2023
LSTM based Sentiment Classification using Tensorflow - Amazon Reviews Rating

LSTM based Sentiment Classification using Tensorflow - Amazon Reviews Rating (Dataset) The dataset is from Amazon Review Data (2018)

Immanuvel Prathap S 1 Jan 16, 2022
Visual Automata is a Python 3 library built as a wrapper for Caleb Evans' Automata library to add more visualization features.

Visual Automata Copyright 2021 Lewi Lie Uberg Released under the MIT license Visual Automata is a Python 3 library built as a wrapper for Caleb Evans'

Lewi Uberg 55 Nov 17, 2022
MMDA - multimodal document analysis

MMDA - multimodal document analysis

AI2 75 Jan 04, 2023
Athena is an open-source implementation of end-to-end speech processing engine.

Athena is an open-source implementation of end-to-end speech processing engine. Our vision is to empower both industrial application and academic research on end-to-end models for speech processing.

Ke Technologies 34 Sep 08, 2022
Various Algorithms for Short Text Mining

Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te

Kwan-Yuet 466 Dec 06, 2022
Test finetuning of XLSR (multilingual wav2vec 2.0) for other speech classification tasks

wav2vec_finetune Test finetuning of XLSR (multilingual wav2vec 2.0) for other speech classification tasks Initial test: gender recognition on this dat

8 Aug 11, 2022
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx

Anchored CorEx: Hierarchical Topic Modeling with Minimal Domain Knowledge Correlation Explanation (CorEx) is a topic model that yields rich topics tha

Greg Ver Steeg 592 Dec 18, 2022
Official PyTorch implementation of "Dual Path Learning for Domain Adaptation of Semantic Segmentation".

Dual Path Learning for Domain Adaptation of Semantic Segmentation Official PyTorch implementation of "Dual Path Learning for Domain Adaptation of Sema

27 Dec 22, 2022
Wrapper to display a script output or a text file content on the desktop in sway or other wlroots-based compositors

nwg-wrapper This program is a part of the nwg-shell project. This program is a GTK3-based wrapper to display a script output, or a text file content o

Piotr Miller 94 Dec 27, 2022
:id: A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.

Dedupe Python Library dedupe is a python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on

Dedupe.io 3.6k Jan 02, 2023
This repository contains the codes for LipGAN. LipGAN was published as a part of the paper titled "Towards Automatic Face-to-Face Translation".

LipGAN Generate realistic talking faces for any human speech and face identity. [Paper] | [Project Page] | [Demonstration Video] Important Update: A n

Rudrabha Mukhopadhyay 438 Dec 31, 2022
Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together

SpeechMix Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together. Introduction For the same input: from datas

Eric Lam 31 Nov 07, 2022
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks

A Deep Learning NLP/NLU library by Intel® AI Lab Overview | Models | Installation | Examples | Documentation | Tutorials | Contributing NLP Architect

Intel Labs 2.9k Dec 31, 2022
Paddle2.x version AI-Writer

Paddle2.x 版本AI-Writer 用魔改 GPT 生成网文。Tuned GPT for novel generation.

yujun 74 Jan 04, 2023
translate using your voice

speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...

1 Oct 18, 2021
自然言語で書かれた時間情報表現を抽出/規格化するルールベースの解析器

ja-timex 自然言語で書かれた時間情報表現を抽出/規格化するルールベースの解析器 概要 ja-timex は、現代日本語で書かれた自然文に含まれる時間情報表現を抽出しTIMEX3と呼ばれるアノテーション仕様に変換することで、プログラムが利用できるような形に規格化するルールベースの解析器です。

Yuki Okuda 116 Nov 09, 2022
Code associated with the Don't Stop Pretraining ACL 2020 paper

dont-stop-pretraining Code associated with the Don't Stop Pretraining ACL 2020 paper Citation @inproceedings{dontstoppretraining2020, author = {Suchi

AI2 449 Jan 04, 2023