Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Last update: Jan 03, 2023

Overview

EasyOCR

Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai.

What's new

1 February 2021 - Version 1.2.3
- Add setLanguageList method to Reader class. This is a convenient API for changing languages (within the same model) after creating a class instance.
- Small change on text box merging. (thanks z-pc, see PR)
- Basic Demo on website
5 January 2021 - Version 1.2.2
- Add optimal_num_chars to detect method. If specified, bounding boxes with estimated number of characters near this value are returned first. (thanks @adamfrees)
- Add rotation_info to readtext method. Allow EasyOCR to rotate each text box and return the one with the best confidence score. Eligible values are 90, 180 and 270. For example, try [90, 180, 270] for all possible text orientations. (thanks @mijoo308)
- Update documentation.
17 November 2020 - Version 1.2
- New language supports for Telugu and Kannada. These are experimental lite recognition models. Their file sizes are only around 7% of other models and they are ~6x faster at inference with CPU.
12 October 2020 - Version 1.1.10
- Faster beamsearch decoder (thanks @amitbcp)
- Better code structure (thanks @susmith98)
- New language support for Haryanvi(bgc), Sanskrit(sa) (Devanagari Script) and Manipuri(mni) (Bengali Script)
31 August 2020 - Version 1.1.9
- Add detect and recognize methods for performing text detection and recognition separately
Read all release notes

What's coming next

Faster processing time
New language support

Examples

Supported Languages

We currently support 80+ languages. See list of supported languages.

Installation

Install the stable release using pip:

pip install easyocr

For the latest development release:

pip install git+https://github.com/jaidedai/easyocr.git

Note 1: On Windows, please install torch and torchvision first by following the official instructions at https://pytorch.org. On the PyTorch website, be sure to select the CUDA version you have. If you intend to run in CPU mode only, select CUDA = None.

Note 2: We also provide a Dockerfile here.

Try Third-Party Demos

Usage

import easyocr
reader = easyocr.Reader(['ch_sim','en']) # need to run only once to load model into memory
result = reader.readtext('chinese.jpg')

The output is a list; each item contains a bounding box, the text and a confidence level, in that order.

[([[189, 75], [469, 75], [469, 165], [189, 165]], '愚园路', 0.3754989504814148),
 ([[86, 80], [134, 80], [134, 128], [86, 128]], '西', 0.40452659130096436),
 ([[517, 81], [565, 81], [565, 123], [517, 123]], '东', 0.9989598989486694),
 ([[78, 126], [136, 126], [136, 156], [78, 156]], '315', 0.8125889301300049),
 ([[514, 126], [574, 126], [574, 156], [514, 156]], '309', 0.4971577227115631),
 ([[226, 170], [414, 170], [414, 220], [226, 220]], 'Yuyuan Rd.', 0.8261902332305908),
 ([[79, 173], [125, 173], [125, 213], [79, 213]], 'W', 0.9848111271858215),
 ([[529, 173], [569, 173], [569, 213], [529, 213]], 'E', 0.8405593633651733)]
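
Each item can be unpacked directly as a (bounding box, text, confidence) tuple. As a small illustration, the sketch below keeps only the texts whose confidence clears a threshold, using three rows copied from the output above. The helper name confident_texts and the 0.5 threshold are assumptions made for this example and are not part of EasyOCR.

```python
# Sketch: filter EasyOCR-style results by confidence.
# `confident_texts` and the 0.5 default are illustrative, not part of the EasyOCR API.

def confident_texts(results, min_conf=0.5):
    """Return the text of every (bbox, text, confidence) item at or above min_conf."""
    return [text for _bbox, text, conf in results if conf >= min_conf]


# Three rows copied from the sample output above.
sample = [
    ([[189, 75], [469, 75], [469, 165], [189, 165]], '愚园路', 0.3754989504814148),
    ([[517, 81], [565, 81], [565, 123], [517, 123]], '东', 0.9989598989486694),
    ([[226, 170], [414, 170], [414, 220], [226, 220]], 'Yuyuan Rd.', 0.8261902332305908),
]

print(confident_texts(sample))
```

Low-confidence items such as the first row are dropped here; whether that is desirable depends on the application, since a low score does not always mean the text is wrong.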

Note 1: ['ch_sim','en'] is the list of languages you want to read. You can pass several languages at once, but not all languages can be used together. English is compatible with every language. Languages that share common characters are usually compatible with each other.

Note 2: Instead of the filepath chinese.jpg, you can also pass an OpenCV image object (numpy array) or an image file as bytes. A URL to a raw image is also acceptable.

Note 3: The line reader = easyocr.Reader(['ch_sim','en']) loads the model into memory. It takes some time, but it needs to be run only once.

You can also set detail = 0 for simpler output.

reader.readtext('chinese.jpg', detail = 0)

Result:

['愚园路', '西', '东', '315', '309', 'Yuyuan Rd.', 'W', 'E']

The model weights for the chosen language are downloaded automatically, or you can download them manually from the following links and put them in the '~/.EasyOCR/model' folder.
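
To check what is already in that folder, for example after a manual download, a small sketch like the following may help. The helper name list_model_files is made up for this example; only the '~/.EasyOCR/model' location comes from the text above.

```python
from pathlib import Path


def list_model_files(model_dir=None):
    """Return sorted file names in the model folder, or [] if the folder is missing.

    `model_dir` defaults to ~/.EasyOCR/model; pass another path to inspect
    a custom location. This helper is illustrative, not part of EasyOCR.
    """
    folder = Path(model_dir) if model_dir else Path.home() / ".EasyOCR" / "model"
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


print(list_model_files())
```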

If you do not have a GPU, or your GPU has low memory, you can run in CPU mode by adding gpu = False:

reader = easyocr.Reader(['ch_sim','en'], gpu = False)

For more information, read the tutorial and API Documentation.

Run on command line

$ easyocr -l ch_sim en -f chinese.jpg --detail=1 --gpu=True

Implementation Roadmap

Language packs: Expand support to more languages. We are aiming to cover > 80-90% of the world's population. Also improve existing languages.
Better documentation and API
Language model for better decoding
Handwritten support: The key is using GAN to generate a realistic handwritten dataset.
Faster processing time: model pruning (lite version) / quantization / export to other platforms (ONNX?)
Open dataset and model training pipeline
Restructure code to support swappable detection and recognition algorithms. The API should be as easy as

reader = easyocr.Reader(['en'], detection='DB', recognition = 'CNN_Transformer')

The idea is to be able to plug any state-of-the-art model into EasyOCR. There are a lot of geniuses trying to make better detection/recognition models. We are not trying to be geniuses here, just to make their work quickly accessible to the public ... for free. (Well, I believe most geniuses want their work to create a positive impact as fast and as big as possible.) The pipeline should look something like the diagram below. Grey slots are placeholders for changeable light blue modules.

Acknowledgement and References

This project is based on research and code from several papers and open-source repositories.

All deep learning parts are based on PyTorch. ❤️

The detection part uses the CRAFT algorithm from this official repository and their paper (thanks @YoungminBaek from @clovaai). We also use their pretrained model.

The recognition model is CRNN (paper). It is composed of 3 main components: feature extraction (we currently use ResNet), sequence labeling (LSTM) and decoding (CTC). The training pipeline for the recognition part is a modified version of deep-text-recognition-benchmark (thanks @ku21fan from @clovaai). This repository is a gem that deserves more recognition.
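
To make the CTC decoding step concrete, here is a minimal greedy decoder in plain Python. It is a sketch of the general technique, not EasyOCR's actual decoder: the per-frame labels, the toy character set and the choice of index 0 as the blank symbol are all assumptions for this example.

```python
def ctc_greedy_decode(frame_labels, charset, blank=0):
    """Greedy CTC decoding: collapse repeated labels, then drop blanks.

    `frame_labels` holds the most likely label index for each time step.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != blank:
            decoded.append(charset[label])
        previous = label
    return "".join(decoded)


charset = ["", "c", "a", "t"]               # index 0 is the blank (assumed)
frames = [1, 1, 0, 2, 2, 2, 0, 0, 3, 3]     # hypothetical per-frame argmax labels
print(ctc_greedy_decode(frames, charset))   # cat
```

A blank between two identical labels is what allows doubled letters: the frames [1, 0, 1] decode to "cc", while [1, 1] decode to a single "c".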

Beam search code is based on this repository and his blog. (Thanks @githubharald)

Data synthesis is based on TextRecognitionDataGenerator. (Thanks @Belval)

And a good read about CTC from distill.pub here.

Want To Contribute?

Let's advance humanity together by making AI available to everyone!

3 ways to contribute:

Coder: Please send a PR for small bugs/improvements. For bigger ones, discuss with us by opening an issue first. There is a list of possible bug/improvement issues tagged with 'PR WELCOME'.

User: Tell us how EasyOCR benefits you/your organization to encourage further development. Also post failure cases in the Issue Section to help improve future models.

Tech leader/Guru: If you find this library useful, please spread the word! (See Yann LeCun's post about EasyOCR)

Guideline for new language request

To request support for a new language, I need you to send a PR with the 2 following files:

In folder easyocr/character, we need 'yourlanguagecode_char.txt' that contains a list of all characters. Please see the format example in other files in that folder.
In folder easyocr/dict, we need 'yourlanguagecode.txt' that contains a list of words in your language. On average we have ~30000 words per language, with more than 50000 words for popular ones. More is better in this file.
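
If you already have a word list, a first draft of the character file can be derived from it. The sketch below assumes one character per line in the output; that layout is an assumption, so compare it with the existing files in easyocr/character before sending a PR. The function names are made up for this example.

```python
def unique_characters(words):
    """Collect every distinct character used in `words`, sorted by code point."""
    return sorted({ch for word in words for ch in word.strip()})


def write_char_file(words, path):
    """Write one character per line (assumed layout) and return the characters."""
    chars = unique_characters(words)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(chars) + "\n")
    return chars


# Tiny stand-in for the contents of 'yourlanguagecode.txt'.
print(unique_characters(["abc", "cab", "bad"]))
```

A list built this way only covers characters that occur in the dictionary, so punctuation, digits and rare letters may still need to be added by hand.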

If your language has unique elements (such as 1. Arabic: characters change form when attached to each other, and text is written from right to left; 2. Thai: some characters need to be above the line and some below), please educate me to the best of your ability and/or give useful links. It is important to take care of the details to achieve a system that really works.

Lastly, please understand that my priority has to go to popular languages or sets of languages that share most of their characters (also tell me if your language shares a lot of characters with others). It takes me at least a week to work on a new model. You may have to wait a while for a new model to be released.

See List of languages in development

Business Inquiries

For Enterprise Support, Jaided AI offers full service for custom OCR/AI systems from building, maintenance and deployment. Click here to contact us.

Comments

List of languages in development
I will update/edit this issue to track the development process of new languages. The current list is

Group 1 (Arabic script)

Arabic (DONE, August, 5 2020)

Uyghur (DONE, August, 5 2020)

Persian (DONE, August, 5 2020)

Urdu (DONE, August, 5 2020)

Group 2 (Latin script)

Serbian-latin (DONE, July,12 2020)

Occitan (DONE, July,12 2020)

Group 3 (Devanagari)

Hindi (DONE, July,24 2020)

Marathi (DONE, July,24 2020)

Nepali (DONE, July,24 2020)

Rajasthani (NEED HELP)

Awadhi, Haryanvi, Sanskrit (if possible)

Group 4 (Cyrillic script)

Russian (DONE, July,29 2020)

Serbian-cyrillic (DONE, July,29 2020)

Bulgarian (DONE, July,29 2020)

Ukrainian (DONE, July,29 2020)

Mongolian (DONE, July,29 2020)

Belarusian (DONE, July,29 2020)

Tajik (DONE, April,20 2021)

Kyrgyz (NEED HELP)

Group 5

Telugu (DONE, November,17 2020)

Kannada (DONE, November,17 2020)

Group 6 (Languages that don't share characters with others)

Tamil (DONE, August, 10 2020)

Hebrew (ready to train)

Malayalam (ready to train)

Bengali + Assamese (DONE, August, 23 2020)

Punjabi (ready to train)

Abkhaz (ready to train)

Group 7 (Improvement and possible extra models)

Japanese version 2 (DONE, March, 21 2021) + vertical text

Chinese version 2 (DONE, March, 21 2021) + vertical text

Korean version 2 (DONE, March, 21 2021)

Latin version 2 (DONE, March, 21 2021)

Math + Greek?

Number+symbol only

help wanted Language Request
opened by rkcosmos 81
Error in easyocr.Reader with urlretrieve(model_url[model_file][0], MODEL_PATH)
Hello! First of all, thanks for this amazing library! Could someone please help me resolve an issue I encountered only today (yesterday and before, it was working smoothly)?

In my code I have, let's say:

import easyocr
reader = easyocr.Reader(['id', 'en'])

When I run it, I get the following error:

CUDA not available - defaulting to CPU. Note: This module is much faster with a GPU.
MD5 hash mismatch, possible file corruption
Re-downloading the recognition model, please wait
Traceback (most recent call last):
  File "tryout_easyocr.py", line 5, in <module>
    reader = easyocr.Reader(['id', 'en'])
  File "/usr/local/lib/python3.7/site-packages/easyocr/easyocr.py", line 194, in __init__
    urlretrieve(model_url[model_file][0], MODEL_PATH)
  File "/usr/local/Cellar/python/3.7.8/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 288, in urlretrieve
    % (read, size), result)

Regardless of which language I choose, I face this error in all environments:

in mac os runtime

in docker

in ubuntu

in colab https://colab.research.google.com/github/vistec-AI/colab/blob/master/easyocr.ipynb#scrollTo=lIYdn1woOS1n

Digging deeper, it tries to download the following file: https://www.jaided.ai/read_download/latin.pth which I wasn't able to download with wget, curl or a browser either, because of the same issue.

It seems https://www.jaided.ai/ resets the connection during download.
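
The "MD5 hash mismatch" message in the log suggests that the file on disk does not match the checksum the library expects. A minimal sketch for computing the checksum of a manually downloaded file follows; the function name and the example path are hypothetical, and the expected value has to come from EasyOCR's own configuration, which is not reproduced here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (hypothetical path to a manually downloaded model file):
# print(md5_of_file("/path/to/latin.pth"))
```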
opened by z-aliakseyeu 21
Error in easyocr.Reader module

I am facing an issue with the easyocr.Reader class. I have successfully imported easyocr, but the following line fails.

reader = easyocr.Reader(['ch_sim', 'en'])

The error is as follows.

AttributeError: module 'easyocr' has no attribute 'Reader'
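
One common cause of this kind of AttributeError is a local file or folder named easyocr (for example a script saved as easyocr.py) that shadows the installed package. That is only a guess here, since the report does not show the project layout. The sketch below prints where Python would import a module from; the helper name module_origin is made up for this example.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# If this prints a path inside your own project rather than site-packages,
# a local file is shadowing the installed package.
print(module_origin("easyocr"))
```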
help wanted

opened by Hassan1175 19

Model files won't download

Looks like there's an issue with the server for the model files. Returns a 403 forbidden when attempting to download.

    reader = easyocr.Reader(['en'], gpu = False)
  File "/Users/travis/.local/share/virtualenvs/ocr-RN8nrvRp/lib/python3.8/site-packages/easyocr/easyocr.py", line 185, in __init__
    urlretrieve(model_url['detector'][0] , DETECTOR_PATH)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

opened by exiva 16

Support for digit recognition

Hi,

Does the current version support digit recognition? If not, please add it in a future release. Recognizing digits from a meter is a common and very useful OCR use case. Sample cropped image of a meter screen:

Currently it gives the following result for the above sample image:
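
One option worth trying for digit-only images is the allowlist argument of readtext, which restricts the characters the recognizer may output. A minimal sketch, assuming reader is an easyocr.Reader instance; the helper name read_digits and the file name meter.jpg are hypothetical.

```python
DIGITS = "0123456789"


def read_digits(reader, image):
    """Run OCR restricted to digit characters and return plain strings.

    `reader` is expected to be an easyocr.Reader; `allowlist` limits the
    characters the recognizer may output, and detail=0 returns text only.
    """
    return reader.readtext(image, allowlist=DIGITS, detail=0)


# Usage (requires easyocr and a local image; 'meter.jpg' is a made-up name):
# import easyocr
# reader = easyocr.Reader(['en'])
# print(read_digits(reader, 'meter.jpg'))
```

Restricting the character set does not retrain the model, so unusual fonts such as seven-segment displays may still be misread.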

opened by dsbyprateekg 13
Character Bounding Boxes

Hi, I am working on a project where I need the bounding boxes around the characters rather than the whole word. Is there any way I can do that using EasyOCR? Thanks

opened by Laveen-exe 12
Made a server so users can try EasyOCR on a web page!

Hi! I'm a student working at Common Computer as an intern. Recently I've been deploying useful and interesting projects using Ainize, and I ainized the EasyOCR project so that every user can try EasyOCR easily! You can try it by clicking the Ainize button or the [web page] link in README.md. Thank you!

(Ainize is a project that can deploy a server for free and gives access to live services from GitHub with one click! If you are interested in Ainize, visit ainize.ai!)

Have a good day 😄!

opened by j1mmyson 12
[RFC] Improvements to the Japanese model
This project is impressive. I've tested it and it's far more precise than Tesseract. Thank you for your effort.

There are a few issues I've noticed in some of my test cases. Some I think are caused by missing symbols; others are more specific to the Japanese language and could be improved with the context of a dictionary (it looks like ja has only characters right now).

Is there any way I can help you? I can certainly add the missing characters to the characters list, and I'm also willing to build a dictionary if that could help disambiguate some words. But would I have to wait for you to retrain the model on your side?

Here are my test cases:

1 - いや…あるというか…って→

https://0x0.st/ivNP.png

issues:

- Missing …
- Missing →
- Mistakes って for つて (fixable with a dictionary file I think)

result:

([[208, 0], [628, 0], [628, 70], [208, 70]], 'あるというか:', 0.07428263872861862)
([[0, 1], [185, 1], [185, 69], [0, 69]], 'いや・', 0.2885110080242157)
([[3, 69], [183, 69], [183, 128], [3, 128]], 'つて', 0.4845466613769531)

2 - ♬〜（これが私の生きる道）

https://0x0.st/ivZC.png

issues:

- Missing ♬〜
- Mistakes （これ for にれ
- Mistakes が(ga) for か(ka)
- Detects English parens () instead of Japanese parens （）

([[1, 0], [125, 0], [125, 63], [1, 63]], ',~', 0.10811009258031845)
([[179, 0], [787, 0], [787, 66], [179, 66]], 'にれか私の生きる道)', 0.3134567439556122)

3 - （秀一）ああッ…もう⁉︎

https://0x0.st/ivZh.png

issues:

- Mistakes small ッ for big ツ (similar to 1 but katakana instead of hiragana)
- Mistakes … for ・・・

([[0, 0], [174, 0], [174, 64], [0, 64]], '(秀一)', 0.9035432934761047)
([[207, 0], [457, 0], [457, 64], [207, 64]], 'ああツ・・・', 0.35586389899253845)
([[481, 0], [668, 0], [668, 64], [481, 64]], 'もう!?', 0.4920879304409027)

4 - そっか

https://0x0.st/ivZ7.png

issues:

mistakes そっか for そつか (fixable with a dictionary file, I think) (similar to 1 そっか is a really common word)

([[0, 0], [186, 0], [186, 60], [0, 60]], 'そつか', 0.9190227389335632)

5 - （久美子）うん　ヘアピンのお礼

https://0x0.st/ivZR.png

issues:

- Mistakes の for 0 (not sure how to fix this one, but it's pretty important; it seems like の is properly recognized in my test case 2)
- Mistakes ヘアピン for へアピソ (fixable by dictionary probably)

([[0, 0], [238, 0], [238, 72], [0, 72]], '(久美子)', 0.9745591878890991)
([[268, 0], [396, 0], [396, 70], [268, 70]], 'うん', 0.5724520087242126)
([[22, 60], [454, 60], [454, 132], [22, 132]], 'へアピソ0お礼', 0.25971919298171997)
Failure Cases
opened by pigoz 12
Any tutorial to train / fine-tune the model for more fonts (new dataset)? Any Update?

Thanks for publishing this great EASYOCR model! I am wondering if I can find a tutorial to train EASYOCR or finetune it on a custom dataset ( where I need to add a complex background for texts and support new fonts).

What do you think? is there any link for that?

Any update?

opened by hahmad2008 11

Does not recognize digit '7'

I've been trying to read some Sudokus, and so far all digits have been recognized except 7. Is this a font issue maybe? In all the images I have tried, no 7 was recognized.

Edit: Maybe related #130

Example:

Link to the image

 curl -o sudoku.png https://upload.wikimedia.org/wikipedia/commons/thumb/f/ff/Sudoku-by-L2G-20050714.svg/1200px-Sudoku-by-L2G-20050714.svg.png
 easyocr -l en -f sudoku.png

([[42, 34], [100, 34], [100, 114], [42, 114]], '5', 0.9198901653289795)
([[174, 34], [234, 34], [234, 114], [174, 114]], '3', 0.5189616680145264)
([[704, 166], [762, 166], [762, 246], [704, 246]], '5', 0.8919380307197571)
([[44, 168], [104, 168], [104, 246], [44, 246]], '6', 0.8406975865364075)
([[568, 168], [630, 168], [630, 246], [568, 246]], '9', 0.7890614867210388)
([[445, 175], [499, 175], [499, 243], [445, 243]], '_', 0.8146840333938599)
([[176, 300], [234, 300], [234, 376], [176, 376]], '9', 0.8461765646934509)
([[968, 302], [1026, 302], [1026, 378], [968, 378]], '6', 0.9359095692634583)
([[44, 432], [102, 432], [102, 510], [44, 510]], '8', 0.9884797930717468)
([[1098, 432], [1156, 432], [1156, 510], [1098, 510]], '3', 0.5310271382331848)
([[572, 434], [630, 434], [630, 510], [572, 510]], '6', 0.9405879974365234)
([[46, 564], [108, 564], [108, 638], [46, 638]], '4', 0.6061235070228577)
([[702, 564], [762, 564], [762, 642], [702, 642]], '3', 0.617751955986023)
([[1100, 566], [1158, 566], [1158, 640], [1100, 640]], '_', 0.9189348220825195)
([[441, 569], [493, 569], [493, 637], [441, 637]], '8', 0.4214801788330078)
([[570, 694], [628, 694], [628, 772], [570, 772]], '2', 0.9453338384628296)
([[1098, 696], [1158, 696], [1158, 772], [1098, 772]], '6', 0.8925315737724304)
([[834, 826], [894, 826], [894, 906], [834, 906]], '2', 0.9729335308074951)
([[176, 830], [236, 830], [236, 906], [176, 906]], '6', 0.9366582036018372)
([[966, 830], [1024, 830], [1024, 904], [966, 904]], '8', 0.9889897704124451)
([[700, 956], [762, 956], [762, 1036], [700, 1036]], '9', 0.7338930368423462)
([[1098, 958], [1156, 958], [1156, 1036], [1098, 1036]], '5', 0.9352748394012451)
([[439, 961], [505, 961], [505, 1029], [439, 1029]], '4', 0.5053627490997314)
([[572, 962], [634, 962], [634, 1038], [572, 1038]], '_', 0.8581225872039795)
([[570, 1090], [630, 1090], [630, 1168], [570, 1168]], '8', 0.9891173839569092)
([[1096, 1090], [1154, 1090], [1154, 1166], [1096, 1166]], '9', 0.6340051293373108)

opened by lagerfeuer 11

pip install easyocr error

Hi, I have a question. When I run pip install easyocr, I get an error: could not find a version that satisfies the requirement torchvision>=0.5 (from easyocr)

opened by wangzaogen 11
AttributeError: '_MultiProcessingDataLoaderIter' object has no attribute 'next'

Hi. I am currently trying to train my own model. I obtained the dataset as required and modified the config file. However, I get this error when trying to train. I have already tried decreasing workers in the config file. I also tried changing line 101 of dataset.py from image, text = data_loader_iter.next() to image, text = next(data_loader_iter).

However, the error persists.
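
For reference, the built-in next() works on any Python 3 iterator, whereas a .next() method exists only if the iterator class defines one. The sketch below uses a plain list iterator as a stand-in for the DataLoader iterator, so it runs without PyTorch. If the error persists after the edit, another call site may still be using .next(); that is an assumption, since the full traceback is not shown.

```python
# Stand-in for a DataLoader iterator: any Python 3 iterator behaves the same way.
batches = iter([("image_0", "text_0"), ("image_1", "text_1")])

# The built-in next() is the portable way to pull the next item.
image, text = next(batches)
print(image, text)

# Plain iterators have no .next() method, so calling it raises AttributeError.
print(hasattr(batches, "next"))
try:
    batches.next()
except AttributeError as err:
    print("AttributeError:", err)
```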

Thanks

opened by proclaim5584 0

✨ Add: Merge to free

Hello! I am a developer who is working on various projects using EasyOCR.

I am sending a PR to suggest a function that would be helpful to many people after doing several experiments.

During detection, boxes whose coordinates are not axis-aligned end up in free_list, while boxes with exact axis-aligned coordinates go to horizontal_list.

However, the results are returned as horizontal_list + free_list, so if you need to process the data in reading order, you have to compare pixel positions yourself to see where each item was recognized.

ezgif com-gif-maker

ezgif com-gif-maker (1)

The gif uploaded above shows that when EasyOCR is used directly, the data comes in sequentially, but 2 and 4 come in last.

Untitled

This means that the free_list is not sorted at the end and is merged as it is.

Untitled1

This is difficult to see at a glance even when detail=0 is passed.

Untitled2

So I developed a function that makes the output easier to read by sorting horizontal_list and free_list together when free_merge is passed as the output_format parameter.

ezgif com-gif-maker (2)

The gif uploaded above shows the results being returned sequentially after adding the function to EasyOCR.

result = reader.readtext(image, output_format='free_merge')
for r in result:
	print(r)

With the call above, EasyOCR returns the results sequentially, as follows.

([[90, 72], [162, 72], [162, 172], [90, 172]], '1', 0.9979992393730299)
([[299, 53], [387, 53], [387, 185], [299, 185]], '2', 1.0)
([[522, 44], [598, 44], [598, 172], [522, 172]], '3', 0.9944517558838641)
([[745, 53], [831, 53], [831, 169], [745, 169]], '4', 0.9790806838048891)
([[968, 52], [1042, 52], [1042, 174], [968, 174]], '5', 1.0)
([[1188, 54], [1266, 54], [1266, 172], [1188, 172]], '6', 0.9999949932161059)
([[1415, 77], [1475, 77], [1475, 169], [1415, 169]], '7', 0.9960819788421169)
([[1626, 54], [1706, 54], [1706, 174], [1626, 174]], '8', 1.0)
([[1848, 64], [1920, 64], [1920, 174], [1848, 174]], '9', 0.9999967813517721)
([[2027, 43], [2185, 43], [2185, 184], [2027, 184]], '10', 0.9999989884757856)
([[77.2879532910169, 230.02821146681296], [151.37898588609144, 239.7842872259749], [132.7120467089831, 373.971788533187], [58.62101411390856, 364.2157127740251]], '2', 0.9868415024810453)
([[281, 199], [391, 199], [391, 365], [281, 365]], '2', 0.9980995156509067)
([[526, 236], [574, 236], [574, 350], [526, 350]], '1', 0.3553702823128333)
([[738, 226], [836, 226], [836, 372], [738, 372]], '4', 1.0)
([[872, 282], [904, 282], [904, 358], [872, 358]], '1', 0.46445868490119224)
([[920.8651368309256, 237.1345809643125], [1041.2936593026684, 188.84521212806337], [1093.1348631690744, 349.86541903568747], [972.7063406973315, 398.15478787193666]], '4', 1.0)
([[1162, 224], [1266, 224], [1266, 384], [1162, 384]], '3', 0.9999594692522464)
([[1365, 213], [1497, 213], [1497, 407], [1365, 407]], '5', 0.9986469557185416)
([[1588, 200], [1714, 200], [1714, 394], [1588, 394]], '6', 0.3138604097965505)
([[1853, 255], [1893, 255], [1893, 365], [1853, 365]], '1', 0.9972939940858829)
([[2075, 239], [2117, 239], [2117, 339], [2075, 339]], '1', 0.9854363293399651)
([[47, 439], [199, 439], [199, 575], [47, 575]], '11', 0.9999839842351304)
([[264, 434], [422, 434], [422, 578], [264, 578]], '12', 0.9999954481433951)
([[489, 437], [639, 437], [639, 577], [489, 577]], '13', 0.9999845742882926)
([[709, 437], [865, 437], [865, 577], [709, 577]], '14', 0.9997981225645103)
([[929, 441], [1083, 441], [1083, 579], [929, 579]], '15', 0.9425667175676142)
([[1151, 445], [1303, 445], [1303, 579], [1151, 579]], '16', 0.9999962910793456)
([[1368, 445], [1516, 445], [1516, 579], [1368, 579]], '17', 0.999997049721879)
([[1589, 445], [1741, 445], [1741, 579], [1589, 579]], '18', 0.9999982298328214)
([[1809, 447], [1961, 447], [1961, 585], [1809, 585]], '19', 0.9999972183091315)
([[2031, 445], [2183, 445], [2183, 581], [2031, 581]], '20', 0.9999991570631338)
([[50, 630], [172, 630], [172, 794], [50, 794]], '4', 1.0)
([[260, 622], [428, 622], [428, 798], [260, 798]], '4', 0.9993658331357935)
([[494, 618], [598, 618], [598, 782], [494, 782]], '3', 0.9969145055207527)
([[719, 621], [831, 621], [831, 781], [719, 781]], '5', 0.999999880790714)
([[949, 623], [1041, 623], [1041, 773], [949, 773]], '2', 0.9640018726844447)
([[1173, 655], [1239, 655], [1239, 753], [1173, 753]], '1', 0.9843721660900542)
([[1405, 633], [1471, 633], [1471, 767], [1405, 767]], '1', 0.99952905955627)
([[1606, 628], [1704, 628], [1704, 784], [1606, 784]], '2', 0.9996682680632638)
([[2039, 623], [2151, 623], [2151, 801], [2039, 801]], '2', 0.31963881498015567)
([[43, 845], [196, 845], [196, 979], [43, 979]], '21', 0.9999989041821146)
([[264, 841], [416, 841], [416, 981], [264, 981]], '22', 0.9999998314126102)
([[487, 843], [635, 843], [635, 981], [487, 981]], '23', 0.9999978083645809)
([[707, 841], [863, 841], [863, 981], [707, 981]], '24', 0.9999994942378553)
([[928, 840], [1082, 840], [1082, 984], [928, 984]], '25', 0.9999996628252286)
([[1152, 848], [1300, 848], [1300, 976], [1152, 976]], '26', 0.9864728654305385)
([[1369, 843], [1517, 843], [1517, 981], [1369, 981]], '27', 0.6750208001814506)
([[1589, 847], [1741, 847], [1741, 983], [1589, 983]], '28', 0.9999988299663297)
([[1811, 849], [1961, 849], [1961, 987], [1811, 987]], '29', 0.9999996628252286)
([[2032, 852], [2180, 852], [2180, 980], [2032, 980]], '30', 0.9999972183091315)
([[47, 1021], [183, 1021], [183, 1193], [47, 1193]], '5', 0.9999997615814351)
([[275, 1033], [385, 1033], [385, 1191], [275, 1191]], '2', 0.9999992847443906)
([[488, 1028], [610, 1028], [610, 1198], [488, 1198]], '5', 0.999989390401371)
([[724, 1022], [820, 1022], [820, 1174], [724, 1174]], '3', 0.16538231022019545)
([[927, 1013], [1043, 1013], [1043, 1191], [927, 1191]], '3', 0.9998320411641579)
([[1812, 1030], [1986, 1030], [1986, 1180], [1812, 1180]], '4', 0.9999662640555904)
([[2025, 1031], [2163, 1031], [2163, 1173], [2025, 1173]], '4', 1.0)
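
The same ordering idea can be sketched as a post-processing step on ordinary readtext output. This is not the implementation from the PR, only an illustration of grouping boxes into rows and sorting each row left to right. The function name reading_order, the row_tolerance parameter and the sample coordinates are all made up for this example.

```python
def reading_order(results, row_tolerance=0.5):
    """Sort (bbox, text, confidence) items top-to-bottom, then left-to-right.

    A box joins the current row when its top edge lies within
    `row_tolerance` times the height of the box that started the row.
    """
    def top(item):
        return min(y for _x, y in item[0])

    def left(item):
        return min(x for x, _y in item[0])

    def height(item):
        ys = [y for _x, y in item[0]]
        return max(ys) - min(ys)

    rows = []
    for item in sorted(results, key=top):
        if rows and top(item) - rows[-1]["top"] <= row_tolerance * rows[-1]["height"]:
            rows[-1]["items"].append(item)
        else:
            rows.append({"top": top(item), "height": height(item), "items": [item]})
    return [item for row in rows for item in sorted(row["items"], key=left)]


# Made-up boxes: "B" is a rotated box that arrives after the axis-aligned ones.
sample = [
    ([[10, 10], [50, 10], [50, 40], [10, 40]], "A", 0.99),
    ([[10, 60], [50, 60], [50, 90], [10, 90]], "C", 0.99),
    ([[60, 62], [100, 62], [100, 92], [60, 92]], "D", 0.97),
    ([[60, 14], [98, 8], [102, 38], [64, 44]], "B", 0.95),
]
print([text for _box, text, _conf in reading_order(sample)])
```

The row grouping is a simple heuristic; images with strongly slanted lines or mixed font sizes may need a different tolerance.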

Originally, I added this feature for my own use, but I'm sure it will be useful for someone else.

Thank you for reading the long article.

opened by Hitbee-dev 1

Cannot detect English alphabets when they are tilted

I tried to recognize number plates from images of cars. The OCR fails to detect when the number is tilted. What should be done to improve the detection accuracy?

opened by ganesh15220 0
How to restart training from the saved model iteration.pth?

I was using a custom dataset of seven-segment display digits to train EasyOCR. The training worked well up until the 110000th iteration, but my runtime got disconnected and I now want to restart from the saved checkpoint. Could someone please explain how, or assist me?

opened by Mani-mk-mk 3

Releases(v1.6.2)

v1.6.2(Sep 15, 2022)
15 September 2022 - Version 1.6.2

Add CPU support for DBnet detector

DBnet will only be compiled when users initialize EasyOCR with DBnet detector.

Source code(tar.gz)
Source code(zip)
v1.6.1(Sep 1, 2022)
1 September 2022 - Version 1.6.1

Fix DBNET path bug for Windows

Add new built-in model cyrillic_g2. This model is a new default for Cyrillic script.

cyrillic_g2.zip(13.48 MB)
v1.6.0(Aug 24, 2022)
24 August 2022 - Version 1.6.0

Restructure code to support alternative text detectors.

Add detector DBNET, see paper. It can be used by initializing the reader like this: reader = easyocr.Reader(['en'], detect_network = 'dbnet18').

pretrained_ic15_res18.zip(49.55 MB)
pretrained_ic15_res50.zip(102.53 MB)
v1.5.0(Jun 2, 2022)
2 June 2022 - Version 1.5.0

Add trainer for CRAFT detection model (thanks @gmuffiness, see PR)

v1.4.2(Apr 9, 2022)
9 April 2022 - Version 1.4.2

Update dependencies (opencv and pillow issues)

v1.4.1(Sep 11, 2021)
11 September 2021 - Version 1.4.1

Add trainer folder

Add readtextlang method (thanks @arkya-art, see PR)

Extend rotation_info argument to support all possible angles (thanks abde0103, see PR)

v1.4(Jun 29, 2021)
29 June 2021 - Version 1.4

Instruction on training/using custom recognition model

Example dataset

Batched image inference for GPU (thanks @SamSamhuns, see PR)

Vertical text support (thanks @interactivetech). This is for rotated text, not to be confused with vertical Chinese or Japanese text. (see PR)

Output in dictionary format (thanks @A2va, see PR)

custom_example.zip(13.38 MB)
en_sample.zip(5.34 MB)
v1.3.2(May 30, 2021)
30 May 2021 - Version 1.3.2

Faster greedy decoder (thanks @samayala22)

Fix bug when text box's aspect ratio is disproportional (thanks iQuartic for bug report)

v1.3.1(Apr 24, 2021)
24 April 2021 - Version 1.3.1

Add support for PIL image (thanks @prays)

Add Tajik language (tjk)

Update argument setting for command line

Add x_ths and y_ths to control merging behavior when paragraph=True

v1.3(Mar 21, 2021)
21 March 2021 - Version 1.3

Second-generation models: multiple times smaller size, multiple times faster inference, additional characters, comparable accuracy to the first generation models. EasyOCR will choose the latest model by default but you can also specify which model to use by passing recog_network argument when creating Reader instance. For example, reader = easyocr.Reader(['en','fr'], recog_network = 'latin_g1') will use the 1st generation Latin model.

List of all models: Model hub

english_g2.zip(13.39 MB)
japanese_g2.zip(15.32 MB)
korean_g2.zip(14.21 MB)
latin_g2.zip(13.62 MB)
zh_sim_g2.zip(19.34 MB)
v1.2.5(Feb 23, 2021)
22 February 2021 - Version 1.2.5

Add dynamic quantization for faster CPU inference (it is enabled by default for CPU mode)

More sensible confidence score

v1.2.4(Feb 7, 2021)
7 February 2021 - Version 1.2.4

Faster CPU inference speed by using dynamic input shape (recognition rate increases by around 100% for images with a lot of text)

1.2.3(Feb 1, 2021)
1 February 2021 - Version 1.2.3

Add setLanguageList method to the Reader class. This is a convenient API for changing languages (within the same model) after creating the class instance.

Small change on text box merging. (thanks z-pc, see PR)

Basic Demo on website

1.2.2(Jan 5, 2021)
5 January 2021 - Version 1.2.2

Add optimal_num_chars to the detect method. If specified, bounding boxes whose estimated number of characters is near this value are returned first. (thanks @adamfrees)

Add rotation_info to the readtext method. This allows EasyOCR to rotate each text box and return the result with the best confidence score. Eligible values are 90, 180 and 270. For example, try [90, 180, 270] to cover all possible text orientations. (thanks @mijoo308)

Update documentation.
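
The idea behind rotation_info, trying several orientations and keeping the most confident reading, can be sketched in plain Python. This is an illustrative assumption about the selection logic, not EasyOCR's code: `best_orientation` and `fake_recognizer` are hypothetical names, and the stand-in recognizer returns made-up scores instead of running a model.

```python
from typing import Callable, Iterable, Tuple

# A recognizer takes a rotation angle in degrees and returns (text, confidence).
# In a real pipeline it would rotate the cropped text box before recognition.
Recognizer = Callable[[int], Tuple[str, float]]


def best_orientation(recognize: Recognizer, rotation_info: Iterable[int]) -> Tuple[int, str, float]:
    """Try the unrotated box plus each requested angle; keep the top score."""
    candidates = []
    for angle in [0, *rotation_info]:
        if angle not in (0, 90, 180, 270):
            raise ValueError(f"unsupported rotation: {angle}")
        text, confidence = recognize(angle)
        candidates.append((angle, text, confidence))
    # Select the candidate with the highest confidence.
    return max(candidates, key=lambda item: item[2])


def fake_recognizer(angle: int) -> Tuple[str, float]:
    """Stand-in recognizer with made-up scores; upright after a 90-degree turn."""
    scores = {0: ("dV15", 0.21), 90: ("STOP", 0.97), 180: ("d015", 0.18), 270: ("5T0P", 0.40)}
    return scores[angle]


print(best_orientation(fake_recognizer, [90, 180, 270]))
```

Each extra angle costs one more recognition pass per text box, so listing only the orientations you expect keeps processing time down.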

v1.2(Nov 17, 2020)

New language support for Telugu and Kannada. These are experimental lite recognition models. Their file sizes are only around 7% of those of other models, and they are ~6x faster at inference on CPU.

This release is also a preparation for user-created models/architectures in the future.
kannada.zip(13.45 MB)
telugu.zip(13.45 MB)
1.1.10(Oct 14, 2020)
12 October 2020 - Version 1.1.10

Faster beamsearch decoder (thanks @amitbcp)

Better code structure (thanks @susmith98)

New language support for Haryanvi (bgc) and Sanskrit (sa) (Devanagari script), and Manipuri (mni) (Bengali script)

31 August 2020 - Version 1.1.9

Add detect and recognize methods for performing text detection and recognition separately

v1.1.8(Aug 23, 2020)
23 August 2020 - Version 1.1.8

Support for 20 new languages: Bengali, Assamese, Abaza, Adyghe, Kabardian, Avar, Dargwa, Ingush, Chechen, Lak, Lezgian, Tabassaran, Bihari, Maithili, Angika, Bhojpuri, Magahi, Nagpuri, Newari, Goan Konkani

Support RGBA input format

Add min_size argument for readtext, for filtering out small text boxes
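
A size filter of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `filter_small_boxes` is hypothetical, each box is assumed to be `[x_min, x_max, y_min, y_max]` in pixels, and a box is kept when its longer side exceeds `min_size`.

```python
from typing import List, Sequence


def filter_small_boxes(boxes: List[Sequence[float]], min_size: float) -> List[Sequence[float]]:
    """Keep boxes whose longer side exceeds min_size (in pixels).

    Each box is assumed to be [x_min, x_max, y_min, y_max].
    """
    return [
        box for box in boxes
        if max(box[1] - box[0], box[3] - box[2]) > min_size
    ]


# Made-up boxes: the first is 8 x 6 pixels and is dropped at min_size=10.
boxes = [[0, 8, 0, 6], [10, 120, 5, 30], [200, 215, 40, 52]]
print(filter_small_boxes(boxes, min_size=10))
```

Dropping tiny boxes before recognition avoids spending time on specks and noise that are unlikely to contain readable text.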

bengali.zip(190.59 MB)
v1.1.7(Aug 10, 2020)
New language support for Tamil

Temporary fix for a memory leak in CPU mode

tamil.zip(190.52 MB)
pre-v1.1.6(Aug 4, 2020)

Pretrained model files
arabic.zip(190.65 MB)
chinese.zip(200.10 MB)
chinese_sim.zip(202.52 MB)
craft_mlt_25k.zip(73.67 MB)
cyrillic.zip(190.61 MB)
devanagari.zip(190.62 MB)
japanese.zip(195.97 MB)
korean.zip(193.14 MB)
latin.zip(190.57 MB)
thai.zip(190.62 MB)
v1.1(Jun 29, 2020)

Stable version with added simplified Chinese support

Owner

Jaided AI

Distribute the benefits of AI to the world

GitHub Repository https://www.jaided.ai/easyocr




