A GitHub action that suggests type annotations for Python using machine learning.

Overview

Typilus: Suggest Python Type Annotations

A GitHub action that suggests type annotations for Python using machine learning.

This action makes suggestions within each pull request as suggested edits. You can then directly apply these suggestions to your code or ignore them.

Sample Suggestion Sample Suggestion

What are Python type annotations? Introduced in Python 3.5, type hints (more traditionally called type annotations) allow users to annotate their code with the expected types. These annotations are optionally checked by external tools, such as mypy and pyright, to prevent type errors; they also facilitate code comprehension and navigation. The typing module provides the core types.

Why use machine learning? Given the dynamic nature of Python, type inference is challenging, especially over partial contexts. To tackle this challenge, we use a graph neural network model that predicts types by probabilistically reasoning over a program’s structure, names, and patterns. This allows us to make suggestions with only a partial context, at the cost of suggesting some false positives.

Install Action in your Repository

To use the GitHub action, create a workflow file. For example,

name: Typilus Type Annotation Suggestions

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
  pull_request:
    branches: [ master ]

jobs:
  suggest:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so that typilus can access it.
    - uses: actions/[email protected]
    - uses: typilus/[email protected]
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        MODEL_PATH: path/to/model.pkl.gz   # Optional: provide the path of a custom model instead of the pre-trained model.
        SUGGESTION_CONFIDENCE_THRESHOLD: 0.8   # Configure this to limit the confidence of suggestions on un-annotated locations. A float in [0, 1]. Default 0.8
        DISAGREEMENT_CONFIDENCE_THRESHOLD: 0.95  # Configure this to limit the confidence of suggestions on annotated locations.  A float in [0, 1]. Default 0.95

The action uses the GITHUB_TOKEN to retrieve the diff of the pull request and to post comments on the analyzed pull request.

Technical Details & Internals

This GitHub action is a reimplementation of the Graph2Class model of Allamanis et al. PLDI 2020 using the ptgnn library. Internally, it uses a Graph Neural Network to predict likely type annotations for Python code.

This action uses a pre-trained neural network that has been trained on a corpus of open-source repositories that use Python's type annotations. At this point we do not support online adaptation of the model to each project.

Training your own model

You may wish to train your own model and use it in this action. To do so, please follow the steps in ptgnn. Then provide a path to the model in your GitHub action configuration, through the MODEL_PATH environment variable.

Contributing

We welcome external contributions and ideas. Please look at the issues in the repository for ideas and improvements.

You might also like...
 30 Days Of Machine Learning Using Pytorch
30 Days Of Machine Learning Using Pytorch

Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch

customer churn prediction prevention in telecom industry using machine learning and survival analysis

Telco Customer Churn Prediction - Plotly Dash Application Description This dash application allows you to predict telco customer churn using machine l

using Machine Learning Algorithm to classification AppleStore application

AppleStore-classification-with-Machine-learning-Algo- using Machine Learning Algorithm to classification AppleStore application. the first step : 1: p

CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning applications.

SmartSim Example Zoo This repository contains CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning appl

Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.
Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Trading Tesla with Machine Learning and Sentiment Analysis An interactive program to train a Random Forest Classifier to predict Tesla daily prices us

A machine learning web application for binary classification using streamlit
A machine learning web application for binary classification using streamlit

Machine Learning web App This is a machine learning web application for binary classification using streamlit options this application contains 3 clas

Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning

Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning My

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

Comments
  • IndexError: list index out of range

    IndexError: list index out of range

    Diff GET Status Code:  200
    Traceback (most recent call last):
      File "/usr/src/entrypoint.py", line 81, in <module>
        changed_files = get_changed_files(diff_rq.text)
      File "/usr/src/changeutils.py", line 38, in get_changed_files
        assert file_diff_lines[3].startswith("---")
    IndexError: list index out of range
    

    logs_302.zip

    opened by ZdenekM 1
  • Several small fixes

    Several small fixes

    Here are couple of things I noticed trying Typilus inference using GH Action:

    • gracefully handle patches that include a file renames (\wo any content modifications) by skipping such files
    • extractor stats reporting only processed files
    opened by bzz 0
  • Create a ptgnn-based Typilus model

    Create a ptgnn-based Typilus model

    Create and use the full Typilus model instead of graph2class.

    • [ ] Implement it in ptgnn
    • [ ] Use action cache to store intermediate result
    • [ ] Auto-update type space "once in a while"
    enhancement 
    opened by mallamanis 0
Releases(v0.9)
Implementation of different ML Algorithms from scratch, written in Python 3.x

Implementation of different ML Algorithms from scratch, written in Python 3.x

Gautam J 393 Nov 29, 2022
SIMD-accelerated bitwise hamming distance Python module for hexidecimal strings

hexhamming What does it do? This module performs a fast bitwise hamming distance of two hexadecimal strings. This looks like: DEADBEEF = 1101111010101

Michael Recachinas 12 Oct 14, 2022
We have a dataset of user performances. The project is to develop a machine learning model that will predict the salaries of baseball players.

Salary-Prediction-with-Machine-Learning 1. Business Problem Can a machine learning project be implemented to estimate the salaries of baseball players

Ayşe Nur Türkaslan 9 Oct 14, 2022
This machine-learning algorithm takes in data from the last 60 days and tries to predict tomorrow's price of any crypto you ask it.

Crypto-Currency-Predictor This machine-learning algorithm takes in data from the last 60 days and tries to predict tomorrow's price of any crypto you

Hazim Arafa 6 Dec 04, 2022
Reggy - Regressions with arbitrarily complex regularization terms

reggy Regressions with arbitrarily complex regularization terms. Currently suppo

Kim 1 Jan 20, 2022
Bayesian optimization in JAX

Bayesian optimization in JAX

Predictive Intelligence Lab 26 May 11, 2022
Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort

Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort

2.3k Jan 04, 2023
This repository contains full machine learning pipeline of the Zillow Houses competition on Kaggle platform.

Zillow-Houses This repository contains full machine learning pipeline of the Zillow Houses competition on Kaggle platform. Pipeline is consists of 10

2 Jan 09, 2022
Kalman filter library

The kalman filter framework described here is an incredibly powerful tool for any optimization problem, but particularly for visual odometry, sensor fusion localization or SLAM.

comma.ai 276 Jan 01, 2023
Bodywork deploys machine learning projects developed in Python, to Kubernetes.

Bodywork deploys machine learning projects developed in Python, to Kubernetes. It helps you to: serve models as microservices execute batch jobs run r

Bodywork Machine Learning 409 Jan 01, 2023
This is a curated list of medical data for machine learning

Medical Data for Machine Learning This is a curated list of medical data for machine learning. This list is provided for informational purposes only,

Andrew L. Beam 5.4k Dec 26, 2022
MLR - Machine Learning Research

Machine Learning Research 1. Project Topic 1.1. Exsiting research Benmark: https://paperswithcode.com/sota ACL anthology for NLP papers: http://www.ac

Charles 69 Oct 20, 2022
Module is created to build a spam filter using Python and the multinomial Naive Bayes algorithm.

Naive-Bayes Spam Classificator Module is created to build a spam filter using Python and the multinomial Naive Bayes algorithm. Main goal is to code a

Viktoria Maksymiuk 1 Jun 27, 2022
Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)

FFT-accelerated Interpolation-based t-SNE (FIt-SNE) Introduction t-Stochastic Neighborhood Embedding (t-SNE) is a highly successful method for dimensi

Kluger Lab 547 Dec 21, 2022
This jupyter notebook project was completed by me and my friend using the dataset from Kaggle

ARM This jupyter notebook project was completed by me and my friend using the dataset from Kaggle. The world Happiness 2017, which ranks 155 countries

1 Jan 23, 2022
Napari sklearn decomposition

napari-sklearn-decomposition A simple plugin to use with napari This napari plug

1 Sep 01, 2022
moDel Agnostic Language for Exploration and eXplanation

moDel Agnostic Language for Exploration and eXplanation Overview Unverified black box model is the path to the failure. Opaqueness leads to distrust.

Model Oriented 1.2k Jan 04, 2023
flexible time-series processing & feature extraction

A corona statistics and information telegram bot.

PreDiCT.IDLab 206 Dec 28, 2022
Traingenerator 🧙 A web app to generate template code for machine learning ✨

Traingenerator 🧙 A web app to generate template code for machine learning ✨ 🎉 Traingenerator is now live! 🎉

Johannes Rieke 1.2k Jan 07, 2023
Meerkat provides fast and flexible data structures for working with complex machine learning datasets.

Meerkat makes it easier for ML practitioners to interact with high-dimensional, multi-modal data. It provides simple abstractions for data inspection, model evaluation and model training supported by

Robustness Gym 115 Dec 12, 2022