An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Tool that takes your photo and generates a pixelated color by number photo.

Color by number Tool that takes your photo and generates a pixelated color by number photo. Requirements You need to have python installed on your com

1 Dec 18, 2021
Python modules to work with large multiresolution images.

Large Image Python modules to work with large, multiresolution images. Large Image is developed and maintained by the Data & Analytics group at Kitwar

Girder 136 Jan 02, 2023
Herramienta Para Snipear Nitros Y Participar En Sorteos Automaticamente

Crips Nitro Sniper Discord Nitro Sniper Y Auto Participar En Sorteos ⚠️ Es Bastante Rapido Y Efectivo Hecho En Python Como Usar ( Python ) : python -m

1 Oct 27, 2021
Computer art based on joining transparent images

Computer Art There is no must in art because art is free. Introduction The following tutorial exaplains how to generate computer art based on a series

Computer Art 12 Jul 30, 2022
Magic-Square - Creates a magic square by randomly generating a list until the list happens to be a magic square

Magic-Square Creates a magic square by randomly generating a list until the list happens to be a magic square. Done as simply as possible... Frequentl

Nick 2 Jan 01, 2022
Fast batch image resizer and rotator for JPEG and PNG images.

imgp is a command line image resizer and rotator for JPEG and PNG images.

Terminator X 921 Dec 25, 2022
Docbarcodes extracts 1D and 2D barcodes from scanned PDF documents or images. It can be used to automate extraction and processing of all kind of documents.

Intro Barcodes are being used in many documents or forms to enable machine reading capabilities and reduce manual processing effort. Simple 1D barcode

Arlind Nocaj 3 Jun 18, 2022
Glyphtracer is an app for converting images of letters to a font

Glyphtracer takes an image that contains pictures of several letters. It recognizes all them and lets the user tag each letter to a Unicode code point. It then converts the images to vector form and

Jussi Pakkanen 38 Dec 24, 2022
Convert HDR photos taken by iPhone 12 (or later) to regular HDR images

heif-hdrgainmap-decode Convert HDR photos taken by iPhone 12 (or later) to regular HDR images. Installation First, make sure you have the following pa

Star Brilliant 5 Nov 13, 2022
Design custom QR codes with this web app!

My-QR.Art This web app lets users design their own QR codes to any domain. It can be acessed on my-qr.art. You can find some more background info abou

Marien Raat 406 Dec 20, 2022
Raven is a tool written in Python3 allowing you to generate an unique image with some text.

🐦 Raven is a tool written in Python3 allowing you to generate an unique image with some text. It does it by searching the text on Google, do

Billy 39 Dec 20, 2022
Python Image Morpher (PIM) is a program that can take two images and blend them to whatever extent or precision that you like

Python Image Morpher (PIM) is a program that can take two images and blend them to whatever extent or precision that you like! It is designed to emulate some of Python's OpenCV image processing from

David Dowd 108 Dec 19, 2022
A minimal, standalone viewer for 3D animations stored as stop-motion sequences of individual .obj mesh files.

ObjSequenceViewer V0.5 A minimal, standalone viewer for 3D animations stored as stop-motion sequences of individual .obj mesh files. Installation: pip

csmailis 2 Aug 04, 2022
Wand is a ctypes-based simple ImageMagick binding for Python

Wand Wand is a ctypes-based simple ImageMagick binding for Python, supporting 2.7, 3.3+, and PyPy. All functionalities of MagickWand API are implement

Eric McConville 1.2k Jan 03, 2023
Simplest QRGenerator with a cool feature (-sh=True :D)

Simple QR-Codes Generator :D Generates QR-codes, nothing more and nothing less . How to use Just run ./install.sh to set all the dependencies up, th

RENNAARENATA 1 Dec 11, 2021
Tweet2Image - Convert tweets to Instagram-friendly images.

Convert tweets to Instagram-friendly images. How to use If you want to use this repository as a submodule, don't forget to put the fonts d

Janu Lingeswaran 1 Mar 11, 2022
QR code python application which can read(decode) and generate(encode) QR codes.

QR Code Application This is a basic QR Code application. Using this application you can generate QR code for you text/links. Using this application yo

Atharva Parkhe 1 Aug 09, 2022
A procedural Blender pipeline for photorealistic training image generation

BlenderProc2 A procedural Blender pipeline for photorealistic rendering. Documentation | Tutorials | Examples | ArXiv paper | Workshop paper Features

DLR-RM 1.8k Jan 02, 2023
EmbedToolV2 - 2.0 Version of DraKenCodeZ/ImageEmbedTool

EmbedToolV2 - 2.0 Version of DraKenCodeZ/ImageEmbedTool

DraKenCodeZ 1 Dec 07, 2021
NFT collection generator. Generates layered images

NFT collection generator Generates layered images, whole collections. Provides additional functionality. Repository includes three scripts generate.py

Gleb Gonchar 10 Nov 15, 2022