Code-autocomplete, a code completion plugin for Python

Last update: Jan 07, 2023


Overview

Code AutoComplete

code-autocomplete, a code completion plugin for Python.

code-autocomplete implements line-level and block-level auto-completion for Python code.

Guide

Feature
Install
Usage
Contact
Citation
Reference

Feature

Demo

http://42.193.145.218/product/short_text_sim/

Install

Install with pip:

pip3 install -U code-autocomplete

Or install from source:

git clone https://github.com/shibing624/code-autocomplete.git
cd code-autocomplete
python3 setup.py install
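
After either install route, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library. The helper names `is_importable` and `installed_version` are hypothetical. The module name `autocomplete` is taken from the import in the Usage section, and the distribution name `code-autocomplete` from the pip command above.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Names assumed from this README: module "autocomplete", distribution "code-autocomplete"
    print("autocomplete importable:", is_importable("autocomplete"))
    print("code-autocomplete version:", installed_version("code-autocomplete"))
```

If the first line prints False, the package was probably installed into a different Python environment than the one running the check.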

Usage

Code Completion

The open-source project code-autocomplete supports GPT2 models and can be called as follows:

from autocomplete.gpt2 import Infer
m = Infer(model_name="gpt2", model_dir="shibing624/code-autocomplete-gpt2-base", use_cuda=False)
i = m.predict('import torch.nn as')
print(i)

You can also call the model through the official huggingface/transformers library:

Please use 'GPT2' related functions to load this model!

import os
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = GPT2Tokenizer.from_pretrained("shibing624/code-autocomplete-gpt2-base")
model = GPT2LMHeadModel.from_pretrained("shibing624/code-autocomplete-gpt2-base")
model.to(device)
prompts = [
    """from torch import nn
    class LSTM(Module):
        def __init__(self, *,
                     n_tokens: int,
                     embedding_size: int,
                     hidden_size: int,
                     n_layers: int):""",
    """import numpy as np
    import torch
    import torch.nn as""",
    "import java.util.ArrayList",
    "def factorial(n):",
]
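for prompt in prompts:
    # Encode the prompt as a batch containing one sequence of token ids
    input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors='pt').to(device)
    # max_length counts tokens (prompt + generated), so allow 64 new tokens
    # on top of the prompt's token count rather than its character count
    outputs = model.generate(input_ids=input_ids,
                             max_length=64 + input_ids.shape[1],
                             temperature=1.0,
                             top_k=50,
                             top_p=0.95,
                             repetition_penalty=1.0,
                             do_sample=True,
                             num_return_sequences=1,
                             length_penalty=2.0,   # only takes effect with beam search
                             early_stopping=True)  # only takes effect with beam search
    # outputs[0] holds the prompt tokens followed by the generated tokens
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(decoded)
    print("=" * 20)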
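
The decoded string contains the prompt followed by the generated continuation. The following is a minimal sketch, not part of code-autocomplete, of how that continuation might be trimmed to a line-level or block-level suggestion. The helper names `strip_prompt`, `first_line`, and `first_block` are hypothetical, and the block rule (stop at the first non-blank line indented less than the first full generated line) is an assumed heuristic, not the project's method.

```python
def strip_prompt(prompt: str, decoded: str) -> str:
    """Return the text after the prompt, or decoded unchanged if it does not start with the prompt."""
    return decoded[len(prompt):] if decoded.startswith(prompt) else decoded


def first_line(completion: str) -> str:
    """Line-level suggestion: only the remainder of the current line."""
    return completion.split("\n", 1)[0]


def first_block(completion: str) -> str:
    """Block-level suggestion: keep lines until indentation drops below
    that of the first full generated line (assumed heuristic)."""
    lines = completion.split("\n")
    kept = [lines[0]]  # remainder of the line the prompt ended on
    base_indent = None
    for line in lines[1:]:
        if not line.strip():
            kept.append(line)  # blank lines never end a block
            continue
        indent = len(line) - len(line.lstrip())
        if base_indent is None:
            base_indent = indent  # indentation of the first full generated line
        elif indent < base_indent:
            break  # dedent below the block: stop here
        kept.append(line)
    return "\n".join(kept).rstrip()


if __name__ == "__main__":
    # Hand-written stand-in for a decoded model output (not real model output)
    prompt = "def factorial(n):"
    decoded = (
        "def factorial(n):\n"
        "    if n == 0:\n"
        "        return 1\n"
        "    return n * factorial(n - 1)\n"
        "print(factorial(5))"
    )
    print(first_block(strip_prompt(prompt, decoded)))
```

Because decoding may normalize some whitespace, the decoded text does not always begin with the exact prompt; in that case `strip_prompt` returns its input unchanged. Decoding only the tokens after the prompt, `outputs[0][input_ids.shape[1]:]`, avoids the string comparison altogether.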

Contact

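Issues (recommended): open an issue on the GitHub repository.
Email: xuming: [email protected]
WeChat: add my WeChat ID xuming624 with the note "name-company-NLP" to join the NLP discussion group.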

Citation

If you use code-autocomplete in your research, please cite it in the following format:

@misc{code-autocomplete,
  author = {Xu Ming},
  title = {code-autocomplete: Code AutoComplete with GPT model},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/shibing624/code-autocomplete},
}

License

The project is licensed under The Apache License 2.0 and is free for commercial use. Please include a link to code-autocomplete and the license in your product description.

Contribute

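The project code is still rough. If you improve it, you are welcome to submit your changes back to this project. Before submitting, please note the following two points:

Add corresponding unit tests under tests
Run all unit tests with python setup.py test and make sure they all pass

After that, you can submit a PR.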

Reference

https://github.com/galois-autocompleter/galois-autocompleter


Releases

0.0.4 (Mar 1, 2022)
Full Changelog: https://github.com/shibing624/code-autocomplete/compare/0.0.3...0.0.4
Asset: source_code.zip (105.14 MB)

0.0.3 (Feb 14, 2022)
python-source-code
Asset: download.zip (41.44 MB)

0.0.1 (Feb 11, 2022)
GPT2 code autocomplete.
