Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Overview

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer

arXiv

Description

Convert offline handwritten mathematical expression to LaTeX sequence using bidirectionally trained transformer.

How to run

First, install dependencies

# clone project   
git clone https://github.com/Green-Wood/BTTR

# install project   
cd BTTR
conda create -y -n bttr python=3.7
conda activate bttr
conda install --yes -c pytorch pytorch=1.7.0 torchvision cudatoolkit=<your-cuda-version>
pip install -e .   

Next, navigate to any file and run it. It may take 6~7 hours to coverage on 4 gpus using ddp.

# module folder
cd BTTR

# train bttr model using 4 gpus and ddp
python train.py --config config.yaml  

For single gpu user, you may change the config.yaml file to

gpus: 1
# gpus: 4
# accelerator: ddp

Imports

This project is setup as a package which means you can now easily import any file into any other file like so:

from bttr.datamodule import CROHMEDatamodule
from bttr import LitBTTR
from pytorch_lightning import Trainer

# model
model = LitBTTR()

# data
dm = CROHMEDatamodule(test_year=test_year)

# train
trainer = Trainer()
trainer.fit(model, datamodule=dm)

# test using the best model!
trainer.test(datamodule=dm)

Note

Metrics used in validation is not accurate.

For more accurate metrics:

  1. use test.py to generate result.zip
  2. download and install crohmelib, lgeval, and tex2symlg tool.
  3. convert tex file to symLg file using tex2symlg command
  4. evaluate two folder using evaluate command

Citation

@article{zhao2021handwritten,
  title={Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer},
  author={Zhao, Wenqi and Gao, Liangcai and Yan, Zuoyu and Peng, Shuai and Du, Lin and Zhang, Ziyin},
  journal={arXiv preprint arXiv:2105.02412},
  year={2021}
}
Comments
  • can you provide predict.py code?

    can you provide predict.py code?

    Hi ~ @Green-Wood.

    I feel grateful mind for your help. I wanna get predict.py code that prints latex from an input image. If this code is provided, it will be very useful to others as well.

    Best regards.

    opened by ai-motive 17
  • val_exprate=0 and save checkpoint

    val_exprate=0 and save checkpoint

    hello!thanks for your time! When I transfer some code in decoder or use it directly,the val_exprate are always be 0.000,I don't know why. Another problem is,I noticed that this code don't have the function to save checkpoint or something.Can you give me some help?Thanks again!

    opened by Ashleyyyi 6
  • Val_exprate = 0

    Val_exprate = 0

    When I retrained the model according to the instruction, the val_exprate was always 0.00, did anyone encounter this problem, thank you! (I has not modified any codes) @Green-Wood

    opened by qingqianshuying 4
  • test.py error occurs

    test.py error occurs

    When I run test.py code, the following error occurs. Can i get some helps?

    in test.py code test_year = "2016" ckp_path = "pretrained model"

    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    Load data from: /home/motive/PycharmProjects/BTTR/bttr/datamodule/../../data.zip
    Extract data from: 2016, with data size: 1147
    total  1147 batch data loaded
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
    Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.01s/it]ExpRate: 0.32258063554763794
    length of total file: 1147
    Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.52it/s]
    --------------------------------------------------------------------------------
    DATALOADER:0 TEST RESULTS
    {}
    --------------------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/motive/PycharmProjects/BTTR/test.py", line 17, in <module>
        trainer.test(model, datamodule=dm)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in test
        results = self._run(model)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 759, in _run
        self.post_dispatch()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 789, in post_dispatch
        self.accelerator.teardown()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu.py", line 51, in teardown
        self.lightning_module.cpu()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/utilities/device_dtype_mixin.py", line 141, in cpu
        return super().cpu()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in cpu
        return self._apply(lambda t: t.cpu())
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
        module._apply(fn)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in _apply
        setattr(this, key, [fn(cur_v) for cur_v in current_val])
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in <listcomp>
        setattr(this, key, [fn(cur_v) for cur_v in current_val])
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in <lambda>
        return self._apply(lambda t: t.cpu())
    AttributeError: 'tuple' object has no attribute 'cpu'
    
    opened by ai-motive 3
  • How long does BTTR take to train?

    How long does BTTR take to train?

    Hi, thank you for great repository!

    How long does it take to train for your experiment in the paper? I mean training on CROHME 2014/2016/2019 on four NVIDIA 1080Ti GPUs.

    Thanks,

    opened by RyosukeFukatani 2
  • can you provide transfer learning code?

    can you provide transfer learning code?

    Hi~ @Green-Wood

    I wanna apply trasnfer learning using pretrained model.

    but, LightningCLI() is wrapped and difficult to customize.

    Thanks & best regards.

    opened by ai-motive 1
  • How can it get pretrained model ?

    How can it get pretrained model ?

    Hi, I wanna test your BTTR model but, it need to training process which will take a lot of time. So, can you give me a pretrained model link?

    Best regards.

    opened by ai-motive 1
  • After adding new token in dictionary getting error .

    After adding new token in dictionary getting error .

    Hi , getting error after adding new token in dictionary.txt

    Error(s) in loading state_dict for LitBTTR: size mismatch for bttr.decoder.word_embed.0.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]). size mismatch for bttr.decoder.proj.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]). size mismatch for bttr.decoder.proj.bias: copying a param with shape torch.Size([113]) from checkpoint, the shape in current model is torch.Size([115]).

    Kindly help me out how can i fix this error.

    opened by shivankaraditi 0
  • About dataset

    About dataset

    Could you tell me how to generate the offline math expression image from inkml file? My experiment show that a large scale image could improve the result obviously,so I'd like to know if there is unified offline data for academic research.

    opened by lightflash7 0
  • predicting on gpu is slower

    predicting on gpu is slower

    Hi ,

    As this model is a bit slower compared to the existing state-of-the-art model on CPU. So I tried to make predictions on GPU and surprisingly it slower on Gpu compare to CPU as well.

    I am attaching a code snapshot here

    device = torch.device('cuda')if torch.cuda.is_available() else torch.device('cpu')

    model = LitBTTR.load_from_checkpoint('pretrained-2014.ckpt',map_location=device)

    img = Image.open(img_path) img = ToTensor()(img) img.to(device)

    t1 = time.time() hyp = model.beam_search(img) t2 = time.time()

    Kindly help me out here how i can reduce prediction time

    FYI - using GPU on aws g4dn.xlarge configuration machine

    opened by Suma3 1
  • how to use TensorBoard?

    how to use TensorBoard?

    hello i don't know how to add scalar to TensorBoard? I want to do this kind of topic, hoping to improve some ExpRate, but I don’t know much about lightning TensorBoard.

    opened by win5923 9
Releases(v2.0)
Owner
Wenqi Zhao
Student in Nanjing University
Wenqi Zhao
Code for "Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks", CVPR 2021

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks This repository contains the code that accompanies our CVPR 20

Despoina Paschalidou 161 Dec 20, 2022
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

SpeechBrain 5.1k Jan 02, 2023
A PyTorch implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize

Caiyong Wang 14 Sep 20, 2022
Dynamic hair modeling from monocular videos using deep neural networks

Dynamic Hair Modeling The source code of the networks for our paper "Dynamic hair modeling from monocular videos using deep neural networks" (SIGGRAPH

53 Oct 18, 2022
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [BCNet, CVPR 2021] This is the official pytorch implementation of BCNet built on

Lei Ke 434 Dec 01, 2022
Roger Labbe 13k Dec 29, 2022
Python implementation of "Single Image Haze Removal Using Dark Channel Prior"

##Dependencies pillow(~2.6.0) Numpy(~1.9.0) If the scripts throw AttributeError: __float__, make sure your pillow has jpeg support e.g. try: $ sudo ap

Joyee Cheung 73 Dec 20, 2022
Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices

Face-Mesh Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices. It employs machine learning

Farnam Javadi 9 Dec 21, 2022
PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Progressive Growing of GANs inference in PyTorch with CelebA training snapshot Description This is an inference sample written in PyTorch of the origi

320 Nov 21, 2022
Summary of related papers on visual attention

This repo is built for paper: Attention Mechanisms in Computer Vision: A Survey paper Vision-Attention-Papers Channel attention Spatial attention Temp

MenghaoGuo 2.1k Dec 30, 2022
Code for "Learning Graph Cellular Automata"

Learning Graph Cellular Automata This code implements the experiments from the NeurIPS 2021 paper: "Learning Graph Cellular Automata" Daniele Grattaro

Daniele Grattarola 37 Oct 26, 2022
OpenFed: A Comprehensive and Versatile Open-Source Federated Learning Framework

OpenFed: A Comprehensive and Versatile Open-Source Federated Learning Framework Introduction OpenFed is a foundational library for federated learning

25 Dec 12, 2022
Metric learning algorithms in Python

metric-learn: Metric Learning in Python metric-learn contains efficient Python implementations of several popular supervised and weakly-supervised met

1.3k Jan 02, 2023
The Self-Supervised Learner can be used to train a classifier with fewer labeled examples needed using self-supervised learning.

Published by SpaceML • About SpaceML • Quick Colab Example Self-Supervised Learner The Self-Supervised Learner can be used to train a classifier with

SpaceML 92 Nov 30, 2022
Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes (CVPR 2021 Oral)

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Surfaces Official code release for NGLOD. For technical details, please refer t

659 Dec 27, 2022
Final project code: Implementing BicycleGAN, for CIS680 FA21 at University of Pennsylvania

680 Final Project: BicycleGAN Haoran Tang Instructions 1. Training To train the network, please run train.py. Change hyper-parameters and folder paths

Haoran Tang 0 Apr 22, 2022
IOT: Instance-wise Layer Reordering for Transformer Structures

Introduction This repository contains the code for Instance-wise Ordered Transformer (IOT), which is introduced in the ICLR2021 paper IOT: Instance-wi

IOT 19 Nov 15, 2022
Code for `BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery`, Neurips 2021

This folder contains the code for 'Scalable Variational Approaches for Bayesian Causal Discovery'. Installation To install, use conda with conda env c

14 Sep 21, 2022
Generative vs Discriminative: Rethinking The Meta-Continual Learning (NeurIPS 2021)

Generative vs Discriminative: Rethinking The Meta-Continual Learning (NeurIPS 2021) In this repository we provide PyTorch implementations for GeMCL; a

4 Apr 15, 2022
This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by Divam Gupta, Wei Pu, Trenton Tabor, Jeff Schneider

SBEVNet: End-to-End Deep Stereo Layout Estimation This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by D

Divam Gupta 19 Dec 17, 2022