An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

BeeRef — A Simple Reference Image Viewer

BeeRef — A Simple Reference Image Viewer BeeRef lets you quickly arrange your reference images and view them while you create. Its minimal interface i

Rebecca 245 Dec 25, 2022
A drop-in replacement for django's ImageField that provides a flexible, intuitive and easily-extensible interface for quickly creating new images from the one assigned to the field.

django-versatileimagefield A drop-in replacement for django's ImageField that provides a flexible, intuitive and easily-extensible interface for creat

Jonathan Ellenberger 490 Dec 13, 2022
A simple image-level annotation tool supporting multi-channel images for napari.

napari-labelimg4classification A simple image-level annotation tool supporting multi-channel images for napari. This napari plugin was generated with

4 May 16, 2022
Conversion of Image, video, text into ASCII format

asciju Python package that converts image to ascii Free software: MIT license

Aju Tamang 11 Aug 22, 2022
Fast batch image resizer and rotator for JPEG and PNG images.

imgp is a command line image resizer and rotator for JPEG and PNG images.

Terminator X 921 Dec 25, 2022
Qrgenerator - A qr generator app using python3

qrgenerator by Mal4D Hi welcome into qr code generator using python by Mal4d Lin

Mal4D 1 Jan 09, 2022
Goddard Image Analysis and Navigation Tool

Copyright 2021 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. No copyright is clai

NASA 12 Dec 23, 2022
Img-to-ascii-art - Converter of image to ascii art

img-to-ascii-art Converter of image to ascii art Latest Features. Intoducing Col

1 Dec 31, 2021
Detecting haze image with hazer.

hazer-py Detecting haze image with hazer. What is hazer Hazer is a lib for getting "haze degree". This repository is python version of hazer: https://

Joey777210 2 Dec 27, 2021
ProsePainter combines direct digital painting with real-time guided machine-learning based image optimization.

ProsePainter Create images by painting with words. ProsePainter combines direct digital painting with real-time guided machine-learning based image op

Morphogen 276 Dec 17, 2022
An open source image editor which can manipulate an image in many ways!

Image Editor - An open source image editor which can manipulate an image in many ways! If you need any more modes in repo or I

TroJanzHEX 44 Nov 17, 2022
Plots is a graph plotting app for GNOME.

Plots is a graph plotting app for GNOME. Plots makes it easy to visualise mathematical formulae. In addition to basic arithmetic operations, it supports trigonometric, hyperbolic, exponential and log

Alex Huntley 138 Dec 14, 2022
Python Program that lets you write in your handwriting!

Handwriting with Python Python Program that lets you write in your handwriting! Inspired by: thaisribeiro.in How to run? Install Unidecode and Pillow

Amanda Rodrigues Vieira 2 Oct 25, 2021
A simple Streamlit Component to compare images in Streamlit apps. It integrates Knightlab's JuxtaposeJS

streamlit-image-juxtapose A simple Streamlit Component to compare images in Streamlit apps using Knightlab's JuxtaposeJS. The images are saved to the

Robin 30 Dec 31, 2022
Sombra is simple Raytracer written in pure Python.

Sombra Sombra is simple Raytracer written in pure Python. It's main purpose is to help understand how raytracing works with a clean code. If you are l

Hernaldo Jesus Henriquez Nuñez 10 Jul 16, 2022
Avatar Generator Python

This is a simple avatar generator project which uses your webcam to take pictures and saves five different types of your images into your device including the original image.

Faisal Ahmed 3 Jan 23, 2022
An API that renders HTML/CSS content to PNG using Chromium

html_png An API that renders HTML/CSS content to PNG using Chromium Disclaimer I am not responsible if you happen to make your own instance of this AP

10 Aug 08, 2022
Tweet2Image - Convert tweets to Instagram-friendly images.

Convert tweets to Instagram-friendly images. How to use If you want to use this repository as a submodule, don't forget to put the fonts d

Janu Lingeswaran 1 Mar 11, 2022
PIX is an image processing library in JAX, for JAX.

PIX PIX is an image processing library in JAX, for JAX. Overview JAX is a library resulting from the union of Autograd and XLA for high-performance ma

DeepMind 294 Jan 08, 2023
A ray tracing render implemented using Taichi language.

A ray tracing render implemented using Taichi language.

Mingrui Zhang 45 Oct 23, 2022