An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Image Processing - Make noise images clean

影像處理-影像降躁化(去躁化) (Image Processing - Make Noise Images Clean) 得力於電腦效能的大幅提升以及GPU的平行運算架構,讓我們能夠更快速且有效地訓練AI,並將AI技術應用於不同領域。本篇將帶給大家的是 「將深度學習應用於影像處理中的影像降躁化 」,

2 Aug 04, 2022
Extracts dominating colors from an image and presents them as a palette.

ColorPalette A simple web app to extract dominant colors from an image. Demo Live View it live at : https://colorpalettedemo.herokuapp.com/ You can de

Mayank Nader 214 Dec 29, 2022
AutoGiphyMovie lets you search giphy for gifs, converts them to videos, attach a soundtrack and stitches it all together into a movie!

AutoGiphyMovie lets you search giphy for gifs, converts them to videos, attach a soundtrack and stitches it all together into a movie!

Satya Mohapatra 18 Nov 13, 2022
Fast Image Retrieval is an open source image retrieval framework

Fast Image Retrieval is an open source image retrieval framework release by Center of Image and Signal Processing Lab (CISiP Lab), Universiti Malaya. This framework implements most of the major binar

CISiP Lab 39 Nov 25, 2022
An esoteric visual language that takes image files as input based on a multi-tape turing machine, designed for compatibility with C.

vizh An esoteric visual language that takes image files as input based on a multi-tape turing machine, designed for compatibility with C. Overview Her

Sy Brand 228 Dec 17, 2022
Image2scan - a python program that can be applied on an image in order to get a scan of it back

image2scan Purpose image2scan is a python program that can be applied on an image in order to get a scan of it back. For this purpose, it searches for

Kushal Shingote 2 Feb 13, 2022
Script that organizes the Google Takeout archive into one big chronological folder

Script that organizes the Google Takeout archive into one big chronological folder

Mateusz Soszyński 1.6k Jan 09, 2023
Graphical tool to make photo collage posters

PhotoCollage Graphical tool to make photo collage posters PhotoCollage allows you to create photo collage posters. It assembles the input photographs

Adrien Vergé 350 Jan 02, 2023
Command line utility for converting images to seamless tiles

img2texture Command line utility for converting images to seamless tiles. The resulting tiles can be used as textures in games, compositing and 3D mod

Artёm IG 24 Dec 26, 2022
A Robust Avatar Generator with a huge number of templates

CoolAvatars Welcome to this repository of CoolAvatars. Using this project, you can generate cool avatars not only from the samples present in my image

RAVI PRAKASH 5 Oct 12, 2021
Qt based ebook reader

Qt based ebook reader Currently supports: pdf epub djvu fb2 mobi azw / azw3 / azw4 cbr / cbz md Contribute Paypal Bitcoin: 17jaxj26vFJNqQ2hEVerbBV5fpT

1.4k Dec 26, 2022
SGTL - Spectral Graph Theory Library

SGTL - Spectral Graph Theory Library SGTL is a python library of spectral graph theory methods. The library is still very new and so there are many fe

Peter Macgregor 6 Oct 01, 2022
Convert a DOS Punk image to text

DOS Punk Text Inspired by MAX CAPACITY's DOS Punks & the amazing DOS Punk community. DOS Punk Text is a Python 3 script that renders a DOS Punk image

4 Jan 13, 2022
Avatar Generator Python

This is a simple avatar generator project which uses your webcam to take pictures and saves five different types of your images into your device including the original image.

Faisal Ahmed 3 Jan 23, 2022
Png-to-stl - Converts PNG and text to SVG, and then extrudes that based on parameters

have ansible installed locally run ansible-playbook setup_application.yml this sets up directories, installs system packages, and sets up python envir

1 Jan 03, 2022
Water marker for images.

watermarker linux users: To fix this error,please add truetype font path File "watermark.py", line 58, in module font = ImageFont.truetype("Dro

13 Oct 27, 2022
Python-fu-cartoonify - GIMP plug-in to turn a photo into a cartoon.

python-fu-cartoonify GIMP plug-in to turn a photo into a cartoon. Preview Installation Copy python-fu-cartoonify.py into the plug-in folder listed und

Pascal Reitermann 6 Aug 05, 2022
㊙️ Create standard barcodes with Python. No external dependencies. 100% Organic Python.

python-barcode python-barcode provides a simple way to create barcodes in Python. There are no external dependencies when generating SVG files. Pillow

Hugo Barrera 419 Dec 26, 2022
A utility for quickly cropping large collections of images.

Crop Tool A utility for quickly cropping large collections of images. Inspired by Derrick Schultz's dataset-tools. Setup It's suggested that you use A

dusk (they/them) 6 Nov 14, 2021
A Blender add-on to create interesting meshes using symmetry

Procedural Symmetries This Blender add-on automates the process of iteratively applying a set of reflection planes to a base mesh. The result will con

1 Dec 29, 2021