An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Simplest QRGenerator with a cool feature (-sh=True :D)

Simple QR-Codes Generator :D Generates QR-codes, nothing more and nothing less . How to use Just run ./install.sh to set all the dependencies up, th

RENNAARENATA 1 Dec 11, 2021
Python Image Morpher (PIM) is a program that can take two images and blend them to whatever extent or precision that you like

Python Image Morpher (PIM) is a program that can take two images and blend them to whatever extent or precision that you like! It is designed to emulate some of Python's OpenCV image processing from

David Dowd 108 Dec 19, 2022
Nutrify - take a photo of food and learn about it

Nutrify - take a photo of food and learn about it Work in progress. To make this a thing, we're going to need lots of food images... Start uploading y

Daniel Bourke 93 Dec 30, 2022
Extracts random colours from an image

EXTRACT COLOURS This repo contains all the project files. Project Description A Program that extracts 10 random colours from an image and saves the rg

David .K. Danso 3 Dec 13, 2021
Rotates your images in the spirit of rot13

Image Rotator (imrot10) Its like rot13 but for images. Calling the algorithm imrot10 for im = image, rot = rotation, 10 = default magnitude. Unfortuna

Sarah 2 Dec 10, 2021
A functional and efficient python implementation of the 3D version of Maxwell's equations

py-maxwell-fdfd Solving Maxwell's equations via A python implementation of the 3D curl-curl E-field equations. This code contains additional work to e

Nathan Zhao 12 Dec 11, 2022
Python wrappers for external BART computational imaging tools and internal libraries

bartpy Python bindings for BART. Overview This repo contains code to generate an updated Python wrapper for the Berkeley Advance Reconstruction Toolbo

Max Litster 7 May 09, 2022
A tool and a library for SVG path data transformations.

SVG path data transformation toolkit A tool and a library for SVG path data transformations. Currently it supports a translation and a scaling. Usage

Igor Mikushkin 2 Mar 07, 2022
Combinatorial image generator for generative NFT art.

ImageGen Stitches multiple image layers together into one image. Run usage: stitch.py [-h] backgrounds_dir dinos_dir traits_dir texture_file

Dinosols NFT 19 Sep 16, 2022
A Toolbox for Image Feature Matching and Evaluations

This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.

Qunjie Zhou 342 Dec 29, 2022
Typesheet is a tiny Python script for creating transparent PNG spritesheets from TrueType (.ttf) fonts.

typesheet typesheet is a tiny Python script for creating transparent PNG spritesheets from TrueType (.ttf) fonts. I made it because I couldn't find an

Grayson Chao 12 Dec 23, 2022
Hello, this project is an example of how to generate a QR Code using python 😁

Hello, this project is an example of how to generate a QR Code using python 😁

Davi Antonaji 2 Oct 12, 2021
Create a random fluent image based on multiple colors.

FluentGenerator Create a random fluent image based on multiple colors. Navigation Example Install Update Usage In Python console FluentGenerator Fluen

1 Feb 02, 2022
Glyph-graph - A simple, yet versatile, package for graphing equations on a 2-dimensional text canvas

Glyth Graph Revision for 0.01 A simple, yet versatile, package for graphing equations on a 2-dimensional text canvas List of contents: Brief Introduct

Ivan 2 Oct 21, 2022
Bringing vtk.js into Dash and Python

Dash VTK Dash VTK lets you integrate the vtk.js visualization pipeline directly into your Dash app. It is powered by react-vtk-js. Docs Demo Explorer

Plotly 88 Nov 29, 2022
MyPaint is a simple drawing and painting program that works well with Wacom-style graphics tablets.

MyPaint A fast and dead-simple painting app for artists Features Infinite canvas Extremely configurable brushes Distraction-free fullscreen mode Exten

MyPaint 2.3k Jan 01, 2023
DP2 graph edit codes.

必要γͺγ‚½γƒ•γƒˆγƒ»γƒ‘γƒƒγ‚±γƒΌγ‚Έ Python3 Numpy JSON Matplotlib ε‹•δ½œη’Ίθͺη’°ε’ƒ MacBook Air M1 Python 3.8.2 (arm64) Numpy 1.22.0 Matplotlib 3.5.1 JSON 2.0.9 使い方 draw_time_histgram(

1 Feb 19, 2022
Nudity detection with Python

nude.py About Nudity detection with Python. Port of nude.js to Python. Installation from pip: $ pip install --upgrade nudepy from easy_install: $ eas

Hideo Hattori 881 Jan 06, 2023
An API that renders HTML/CSS content to PNG using Chromium

html_png An API that renders HTML/CSS content to PNG using Chromium Disclaimer I am not responsible if you happen to make your own instance of this AP

10 Aug 08, 2022
Fix datetime EXIF data in photos downloaded from Google Takeout

fix-google-takeout Warning Use at your own risk. Backup your photos. Overview Google takeout for photos

Mayank Mandava 20 Nov 05, 2022