A proof-of-concept Jupyter extension that converts English queries into relevant Python code

Overview

Text2Code for Jupyter notebook

Blog post with more details:

Data analysis made easy: Text2Code for Jupyter notebook

Demo Video:

Text2Code for Jupyter notebook

Supported Operating Systems:

  • Ubuntu
  • macOS

Installation

NOTE: We have renamed the plugin from mopp to jupyter-text2code. Uninstall mopp before installing the new jupyter-text2code version.

pip uninstall mopp

CPU-only install:

On macOS, and on Ubuntu installations without an NVIDIA GPU, explicitly set the following environment variable before installing:

export JUPYTER_TEXT2CODE_MODE="cpu"

GPU install dependencies:

sudo apt-get install libopenblas-dev libomp-dev

Installation commands:

git clone https://github.com/deepklarity/jupyter-text2code.git
cd jupyter-text2code
pip install .
jupyter nbextension enable jupyter-text2code/main

Uninstallation:

pip uninstall jupyter-text2code

Usage Instructions:

  • Start the Jupyter notebook server by running the following command: jupyter notebook
  • If you don't see the Nbextensions tab in Jupyter notebook, run the following command: jupyter contrib nbextension install --user
  • Open the sample notebook notebooks/ctds.ipynb for testing
  • If the installation succeeded, the Universal Sentence Encoder model will be downloaded from tensorflow_hub on first use
  • Click the Terminal icon that appears in the menu to activate the extension
  • Type "help" to see a list of currently supported commands in the repo
  • Watch the demo video for some examples

Docker containers for jupyter-text2code

We have published CPU and GPU images to Docker Hub with all dependencies pre-installed.

Visit https://hub.docker.com/r/deepklarity/jupyter-text2code/ to download the images and usage instructions.
CPU image size: 1.51 GB
GPU image size: 2.56 GB

Model training:

Generate training data:

From a list of templates present at jupyter_text2code/jupyter_text2code_serverextension/data/ner_templates.csv, generate training data by running the following command:

cd scripts && python generate_training_data.py

This command generates data for intent matching and NER (Named Entity Recognition).
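
As a rough illustration of template-based generation, the sketch below fills placeholders in a few templates with randomly chosen values and records the character offsets of each filled entity, which is the general shape of data that NER trainers consume. The `$placeholder` syntax, the value pools, the entity labels, and the function names `fill_template` and `generate` are all assumptions made for this example; they are not taken from ner_templates.csv or generate_training_data.py.

```python
import random
import re

# Hypothetical templates: (intent_id, text with $placeholders).
TEMPLATES = [
    (0, "import $libname"),
    (1, "show 10 rows from $varname"),
    (2, "plot histogram of $colname in $varname"),
]

# Hypothetical value pools used to fill each placeholder.
VALUES = {
    "libname": ["pandas", "numpy", "seaborn"],
    "varname": ["df", "train_data", "sales"],
    "colname": ["age", "price", "score"],
}

PLACEHOLDER = re.compile(r"\$(\w+)")


def fill_template(template, rng):
    """Return (text, entities); entities are (start, end, LABEL) character spans."""
    text = ""
    entities = []
    cursor = 0
    for match in PLACEHOLDER.finditer(template):
        # Copy the literal text that precedes this placeholder.
        text += template[cursor:match.start()]
        value = rng.choice(VALUES[match.group(1)])
        start = len(text)
        text += value
        entities.append((start, len(text), match.group(1).upper()))
        cursor = match.end()
    # Copy whatever literal text follows the last placeholder.
    text += template[cursor:]
    return text, entities


def generate(n_per_template, seed=0):
    """Yield (intent_id, text, entities) samples for every template."""
    rng = random.Random(seed)
    for intent_id, template in TEMPLATES:
        for _ in range(n_per_template):
            text, entities = fill_template(template, rng)
            yield intent_id, text, entities


if __name__ == "__main__":
    for sample in generate(2):
        print(sample)
```

Each sample carries the intent id (usable for intent matching) and the entity spans (usable for NER), so one pass over the templates can feed both stages.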

Create intent index using faiss

Use the generated data to create an intent matcher using faiss.

cd scripts && python create_intent_index.py
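
To show the idea behind an intent index, here is a minimal sketch of nearest-neighbour intent matching. It deliberately swaps in a toy bag-of-words embedding and a brute-force cosine search in pure Python, in place of the Universal Sentence Encoder and the faiss index the project uses. The training sentences, intent ids, and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training sentences paired with intent ids.
TRAINING = [
    ("import pandas", 0),
    ("load file data.csv into df", 1),
    ("show 5 rows from df", 2),
    ("plot histogram of age in df", 3),
]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# "Index": precomputed embeddings of every training sentence.
INDEX = [(embed(text), intent_id) for text, intent_id in TRAINING]


def match_intent(query):
    """Return (intent_id, similarity) of the closest training sentence."""
    query_vec = embed(query)
    best_vec, best_intent = max(INDEX, key=lambda item: cosine(query_vec, item[0]))
    return best_intent, cosine(query_vec, best_vec)


if __name__ == "__main__":
    print(match_intent("show 10 rows from sales"))
```

A vector index such as faiss serves the same role as `INDEX` and `match_intent` here, but searches dense embeddings far faster than this linear scan.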

Train NER model

cd scripts && python train_spacy_ner.py

Steps to add more intents:

  • Add more templates in ner_templates with a new intent_id
  • Generate training data. Modify generate_training_data.py if different generation techniques are needed or if introducing a new entity.
  • Train intent index
  • Train NER model
  • Modify jupyter_text2code/jupyter_text2code_serverextension/__init__.py with the new intent's condition and add the actual code for the intent
  • Reinstall plugin by running: pip install .
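
The "new intent's condition" step above can be pictured roughly as follows. This is only a sketch of the general pattern (a predicted intent id plus extracted entities mapped to a code snippet); the intent ids, entity keys, and the function `generate_code` are hypothetical, and the actual logic in __init__.py may be organized differently.

```python
# Hypothetical intent ids; the real ids come from ner_templates.
INTENT_IMPORT = 0
INTENT_SHOW_ROWS = 1
INTENT_HISTOGRAM = 2  # the newly added intent in this example


def generate_code(intent_id, entities):
    """Map a predicted intent plus extracted entities to a Python snippet."""
    if intent_id == INTENT_IMPORT:
        return f"import {entities['libname']}"
    if intent_id == INTENT_SHOW_ROWS:
        # Fall back to 5 rows when no count entity was extracted.
        count = entities.get("count", 5)
        return f"{entities['varname']}.head({count})"
    if intent_id == INTENT_HISTOGRAM:
        # New intent: add its condition and the code it should emit.
        return f"{entities['varname']}['{entities['colname']}'].plot.hist()"
    raise ValueError(f"Unsupported intent id: {intent_id}")


if __name__ == "__main__":
    print(generate_code(INTENT_HISTOGRAM, {"varname": "df", "colname": "age"}))
```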

TODO:

  • Publish Docker image
  • Refactor code and make it more modular, remove duplicate code, etc.
  • Add support for Windows
  • Add support for more commands
  • Improve intent detection and NER
  • Explore sentence Paraphrasing to generate higher-quality training data
  • Gather real-world variable names, library names as opposed to randomly generating them
  • Try NER with a transformer-based model
  • With enough data, train a language model to directly do English->code like GPT-3 does, instead of having separate stages in the pipeline
  • Create a survey to collect linguistic data
  • Add Speech2Code support

Authored By:

Owner
DeepKlarity