Go package for OCR (Optical Character Recognition), using the Tesseract C++ library

Overview

gosseract OCR

Go OCR package, built on the Tesseract C++ library.

OCR Server

Do you just want an OCR server, or want to see a working example of this package? There is a ready-made server application, which is easy to deploy!

👉 https://github.com/otiai10/ocrserver

Example

package main

import (
	"fmt"
	"github.com/otiai10/gosseract/v2"
)

func main() {
	client := gosseract.NewClient()
	defer client.Close()
	client.SetImage("path/to/image.png")
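	// Handle the error instead of discarding it, so failures are not silent.
	text, err := client.Text()
	if err != nil {
		fmt.Println("OCR failed:", err)
		return
	}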
	fmt.Println(text)
	// Hello, World!
}

Install

  1. tesseract-ocr, including library and headers
  2. go get -t github.com/otiai10/gosseract/v2

Check the Dockerfile for installation details, or just try it with docker run -it --rm otiai10/gosseract.
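The build fails when the compiler cannot find the Tesseract or Leptonica headers. As a rough preflight check, the sketch below looks for the two headers in a few include directories. The helper name findHeader and the directory list are assumptions for illustration only; the compiler's real search path depends on your toolchain and on flags such as CGO_CPPFLAGS.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findHeader returns the first directory in dirs that contains the header
// at the relative path rel, for example "tesseract/baseapi.h".
// It is a hypothetical helper, not part of gosseract.
func findHeader(dirs []string, rel string) (string, bool) {
	for _, dir := range dirs {
		info, err := os.Stat(filepath.Join(dir, filepath.FromSlash(rel)))
		if err == nil && !info.IsDir() {
			return dir, true
		}
	}
	return "", false
}

func main() {
	// Assumed include directories; adjust them for your system.
	dirs := []string{"/usr/include", "/usr/local/include", "/opt/homebrew/include"}
	headers := []string{"tesseract/baseapi.h", "leptonica/allheaders.h"}
	for _, header := range headers {
		if dir, ok := findHeader(dirs, header); ok {
			fmt.Printf("found %s in %s\n", header, dir)
		} else {
			fmt.Printf("missing %s: install the library together with its headers\n", header)
		}
	}
}
```

If a header is reported missing here but is installed elsewhere, pointing CGO_CPPFLAGS at that include directory is one option to try.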

Test

If you have tesseract-ocr installed locally, you can just run

% go test .

Otherwise, if you don't want to install tesseract-ocr locally, run ./test/runtime, which uses Docker and Vagrant to test the source code on several runtimes.

% ./test/runtime --driver docker
% ./test/runtime --driver vagrant

Check ./test/runtimes for more information about runtime tests.

Issues

Comments
  • Installation Failure on Windows 7

    Summary

    Installation Failure on Windows 7

    λ go get -t github.com/otiai10/gosseract
    # github.com/otiai10/gosseract
    tessbridge.cpp:5:10: fatal error: tesseract/baseapi.h: No such file or directory
     #include <tesseract/baseapi.h>
              ^~~~~~~~~~~~~~~~~~~~~
    compilation terminated.

    Reproducibility

    Yes

    Reproducibility Frequency

    • 100%

    Reproducible Dockerfile

    FROM your-os:your-version
    # Describe how to reproduce your problem
    # on your environment
    

    Otherwise, describe how to reproduce

    1. Install Go
    2. Install GCC (64 Bit Compiler)
    3. Install GIT
    4. Install Tesseract from this site https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.02-20180621.exe
    5. Execute λ go get -t github.com/otiai10/gosseract

    # github.com/otiai10/gosseract
    tessbridge.cpp:5:10: fatal error: tesseract/baseapi.h: No such file or directory
     #include <tesseract/baseapi.h>
              ^~~~~~~~~~~~~~~~~~~~~
    compilation terminated.

    Environment

    Windows 7

    uname -a
    
    go env
    

    C:\Users\33133 λ go env
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\33133\AppData\Local\go-build
    set GOEXE=.exe
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOOS=windows
    set GOPATH=C:\Users\33133\go
    set GORACE=
    set GOROOT=C:\Go
    set GOTMPDIR=
    set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
    set GCCGO=gccgo
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\33133\AppData\Local\Temp\go-build238513982=/tmp/go-build -gno-record-gcc-switches

    C:\Users\33133 λ

    go version

    λ go version
    go version go1.10.3 windows/amd64

    tesseract --version

    C:\Program Files (x86)\Tesseract-OCR>tesseract --version
    tesseract 3.05.02
     leptonica-1.75.3
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0

    C:\Program Files (x86)\Tesseract-OCR>

    opened by vsarin 27
  • 'tesseract/baseapi.h' file not found

    % go test ./...
    # github.com/otiai10/gosseract/tesseract
    tesseract/tess.cpp:1:10: fatal error: 'tesseract/baseapi.h' file not found
    FAIL    github.com/otiai10/gosseract [build failed]
    
    question 
    opened by otiai10 25
  • Keep printing a blank with no error

    package main

    import (
    	"fmt"

    	"github.com/otiai10/gosseract"
    )

    func main() {
    	client := gosseract.NewClient()
    	defer client.Close()
    	client.SetImage("path/to/image.png")
    	text, _ := client.Text()
    	fmt.Println(text)
    	// Hello, World!
    }

    I am running this code in Docker, but it still prints a blank line with no error.

    need more information 
    opened by gradygabriel10 10
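One way to rule out a missing or empty input file is to check the path before handing it to the client. This is only an assumption about a possible cause; the report above does not say what went wrong. The helper name checkImageFile is hypothetical and uses only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// checkImageFile returns a descriptive error when path cannot plausibly be
// used as OCR input. It is a hypothetical preflight helper, not a gosseract API.
func checkImageFile(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("image not accessible: %w", err)
	}
	if info.IsDir() {
		return errors.New("image path is a directory: " + path)
	}
	if info.Size() == 0 {
		return errors.New("image file is empty: " + path)
	}
	return nil
}

func main() {
	// The placeholder path from the example; replace it with a real file.
	if err := checkImageFile("path/to/image.png"); err != nil {
		fmt.Println("preflight failed:", err)
		return
	}
	fmt.Println("image looks readable; pass it to the OCR client next")
}
```

Checking the error returned by client.Text() instead of assigning it to _ is the other half of the same idea: a discarded error and an empty string look identical on the console.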
  • tessbridge.cpp:5:10: fatal error: leptonica/allheaders.h: No such file or directory

    Summary

    Installation failed on Windows 10 x64 when running go get -t github.com/otiai10/gosseract:

    $ go get -t github.com/otiai10/gosseract
    # github.com/otiai10/gosseract
    tessbridge.cpp:5:10: fatal error: leptonica/allheaders.h: No such file or directory
     #include <leptonica/allheaders.h>
              ^~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.

    I am not familiar with header files. Is leptonica/allheaders.h a header file, and how do I install it on Windows?

    Reproducibility

    Reproducibility Frequency

    • 100%

    Reproducible Dockerfile

    FROM your-os:your-version
    # Describe how to reproduce your problem
    # on your environment
    

    Otherwise, describe how to reproduce

    1. Install Go
    2. Install GCC (64 Bit Compiler)
    3. Install GIT
    4. Install Tesseract from this site https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.02-20180621.exe
    5. go get -t github.com/otiai10/gosseract

    Environment

    
    
    go env
    

    $ go env
    set GO111MODULE=
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\VULCAN\AppData\Local\go-build
    set GOENV=C:\Users\VULCAN\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=C:\Go\go\bin
    set GOPRIVATE=
    set GOPROXY=https://proxy.golang.org,direct
    set GOROOT=C:\Go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
    set GCCGO=gccgo
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\VULCAN\AppData\Local\Temp\go-build610875910=/tmp/go-build -gno-record-gcc-switches

    go version

    $ go version
    go version go1.13.5 windows/amd64

    tesseract --version

    $ tesseract --version
    tesseract 3.05.02
     leptonica-1.75.3
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
    
    
    opened by QQ3544291 10
  • Leaks /dev/ttysXXX file handles even when Close() is manually called

    Summary

    This is a real issue for me as I am capturing images via security camera and every frame runs through Tesseract in real time.

    While running in a loop over files or OpenCV video frames, or as a web service, gosseract opens a new /dev/ttys002 (or /dev/pts/004) every time an image is parsed. This eventually exhausts the allowed number of open file handles.

    I have attached the example Go projects that have the issue and a C++ version that does not.

    lsof screenshot

    Reproducibility

    Always

    Environment

    macOS 10.14, Tesseract 4.1.0
    Ubuntu 19.01, Tesseract 4.0

    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/Users/tbruno/Library/Caches/go-build"
    GOENV="/Users/tbruno/Library/Application Support/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="darwin"
    GOPATH="/Users/tbruno/Projects/GolandProjects/go"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/Cellar/go/1.13.4/libexec"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/Cellar/go/1.13.4/libexec/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="clang"
    CXX="clang++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/2r/vy9wb4w90snd6wwv06ts6rth0000gn/T/go-build449183315=/tmp/go-build -gno-record-gcc-switches -fno-common"
    
    go version go1.13.4 darwin/amd64
    
    tesseract 4.1.0
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
     Found AVX512BW
     Found AVX512F
     Found AVX2
     Found AVX
     Found SSE
    

    Examples (I'm using the same jpg file, but this happens even if a new file is opened and also happens with SetImageFromBytes)

    func main() {
    	client := gosseract.NewClient()
    	for {
    		client.SetImage("/Users/tbruno/test.jpg")
    		text, _ := client.Text()
    		fmt.Println(text)
    	}
    	client.Close()
    }
    
    func main() {
    	for {
    		client := gosseract.NewClient()
    		client.SetImage("/Users/tbruno/test.jpg")
    		text, _ := client.Text()
    		fmt.Println(text)
    		client.Close()
    	}
    }
    

    TessTester-ClientInLoopGo.zip TessTester-SingleClientGo.zip TessApi-NoLeakCpp.zip

    bug 
    opened by tebruno99 10
  • Fix/add tessdata prefix

    Hi.

    I've added the ability to set a different TessdataPrefix directly from Go code, with a default value equal to the TESSDATA_PREFIX environment variable. Requesting a review, thanks.

    It seems my solution only works with the latest Tesseract and only on Linux (other platforms were not tested). We should somehow define a default model directory for each Tesseract version.

    opened by awskii 9
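The fallback described in this pull request can be sketched in a few lines: an explicitly set prefix wins, otherwise the value of TESSDATA_PREFIX is used. The function name resolveTessdataPrefix and the injected getenv parameter are assumptions made for illustration and testability; they are not taken from the pull request or from the gosseract API.

```go
package main

import (
	"fmt"
	"os"
)

// resolveTessdataPrefix returns the explicit prefix when one is set and
// otherwise falls back to the TESSDATA_PREFIX environment variable.
// getenv is injected so the lookup can be replaced in tests.
func resolveTessdataPrefix(explicit string, getenv func(string) string) string {
	if explicit != "" {
		return explicit
	}
	return getenv("TESSDATA_PREFIX")
}

func main() {
	// With no explicit value, the environment decides (possibly empty).
	fmt.Printf("default prefix:  %q\n", resolveTessdataPrefix("", os.Getenv))
	// An explicit value takes precedence over the environment.
	fmt.Printf("explicit prefix: %q\n", resolveTessdataPrefix("/opt/tessdata", os.Getenv))
}
```

An empty result means neither source provided a directory; what the engine does in that case depends on the installed Tesseract version, which is the open question raised above.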
  • Init only when required (perfs)

    I benchmarked my app and saw that 90% of the CPU time is spent in "init()".

    With this change, I keep the instance open and run multiple recognitions on it. If a configuration change requires re-initialization, I flag the instance so that init runs again.

    What do you think about it?

    Details

    This is a typical use of gosseract to extract text, in a sample program (profiling included):

    package main
    
    import (
        "bytes"
        "image/png"
    
        "gocv.io/x/gocv"
        "github.com/openrm/gosseract"
        "github.com/pkg/profile"
    )
    
    func GetTextFromImage(img *gocv.Mat, client *gosseract.Client) (string, error) {
        // Convert the OpenCV matrix to a Go image and check the error.
        finalImage, err := img.ToImage()
        if err != nil {
            return "", err
        }

        // Encode as PNG so the bytes can be handed to the OCR client.
        buf := new(bytes.Buffer)
        if err := png.Encode(buf, finalImage); err != nil {
            return "", err
        }

        client.SetImageFromBytes(buf.Bytes())
        client.SetPageSegMode(gosseract.PSM_SINGLE_BLOCK)

        return client.Text()
    }
    
    func main() {
        defer profile.Start().Stop()
    
        client := gosseract.NewClient()
        defer client.Close()
    
        client.Languages = []string{"jpn"}
    
        img := gocv.IMRead("1.png", gocv.IMReadColor)
    
        for i := 0; i < 20; i++ {
            GetTextFromImage(&img, client)
        }
    }
    

    With the code above, I get the following result with go profiling: result1

    As you can see, over the 12 seconds spent in the program, 11 are caused by repeated calls to init.

    With the proposed changes in this PR, the profiling is now like this: result2

    Notes

    • SetConfigFile and SetLanguage cause the program to init again
    • SetWhitelist, SetBlacklist, DisabledOutput and SetVariable make an internal call to setVariablesToInitializedAPI if init has already been called
    opened by PuKoren 9
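The idea in this pull request, running the expensive init only when a setting has changed, can be illustrated with a toy client. This is a minimal sketch under stated assumptions: lazyClient, ensureInit and initCount are hypothetical names, and the type only models when initialization would run. It is not the gosseract implementation.

```go
package main

import "fmt"

// lazyClient is a hypothetical stand-in for an OCR client.
// It models only when the expensive init step runs.
type lazyClient struct {
	language   string
	shouldInit bool // set when a change invalidates the initialized engine
	initCount  int  // how many times the expensive init actually ran
}

func newLazyClient() *lazyClient {
	return &lazyClient{language: "eng", shouldInit: true}
}

// SetLanguage changes a setting that requires re-initialization.
func (c *lazyClient) SetLanguage(lang string) {
	if lang != c.language {
		c.language = lang
		c.shouldInit = true
	}
}

// ensureInit runs the expensive setup only when the flag is set.
func (c *lazyClient) ensureInit() {
	if !c.shouldInit {
		return
	}
	c.initCount++ // a real engine would load language data here
	c.shouldInit = false
}

// Text pretends to run recognition, initializing first if needed.
func (c *lazyClient) Text() string {
	c.ensureInit()
	return "recognized with " + c.language
}

func main() {
	c := newLazyClient()
	for i := 0; i < 20; i++ {
		c.Text()
	}
	fmt.Println("inits after 20 calls:", c.initCount) // inits after 20 calls: 1

	c.SetLanguage("jpn")
	c.Text()
	fmt.Println("inits after language change:", c.initCount) // inits after language change: 2
}
```

The flag keeps the common path cheap: twenty recognitions cost one init, and only a setting that invalidates the engine triggers another.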
  • cannot find package

    cannot find package "github.com/otiai10/gosseract/v2"

    Hello, I have installed the package using go get github.com/otiai10/gosseract and imported it in my package: "github.com/otiai10/gosseract/v2" as per instructions.

    Summary

    I get this compile time error:

    vendor/app/shared/spamcheck/spamcheck.go:12:2: cannot find package "github.com/otiai10/gosseract/v2" in any of:
    	/home/me/go/src/myapp/vendor/github.com/otiai10/gosseract/v2 (vendor tree)
    	/usr/local/go/src/github.com/otiai10/gosseract/v2 (from $GOROOT)
    	/home/me/go/src/github.com/otiai10/gosseract/v2 (from $GOPATH)

    Environment

    Ubuntu 18.08

    uname -a
    

    Linux pc5 5.3.0-28-generic #30~18.04.1-Ubuntu SMP Fri Jan 17 06:14:09 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

    go env
    

    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/home/me/.cache/go-build"
    GOENV="/home/me/.config/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="linux"
    GOPATH="/home/me/go"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/go"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="gcc"
    CXX="g++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build702738196=/tmp/go-build -gno-record-gcc-switches"

    go version
    

    go1.13.6 linux/amd64

    tesseract --version
    

    tesseract 4.0.0-beta.1
     leptonica-1.75.3
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0

     Found AVX2
     Found AVX
     Found SSE

    Appreciate your help to fix this.

    opened by themrkumar 8
  • macOS compile freebsd binary file failed

    Summary

    I'm on macOS 10.14.6 and trying to compile a binary for FreeBSD 11.3. The build fails with the message: undefined: gosseract.NewClient.

    Reproducibility

    Reproducibility Frequency

    100%

    Environment

    Darwin Kernel Version 18.7.0
    
    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/Users/frankb/Library/Caches/go-build"
    GOENV="/Users/frankb/Library/Application Support/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="darwin"
    GOPATH="/Volumes/home/Development files/Go files/"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/go"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="clang"
    CXX="clang++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/0b/ntd9fcqx6xn6nt3gpv4yndm40000gn/T/go-build712170545=/tmp/go-build -gno-record-gcc-switches -fno-common"
    
    go version go1.13 darwin/amd64
    
    tesseract 4.1.0
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
     Found AVX2
     Found AVX
     Found SSE
    

    Source

    package main
    
    import (
    	"fmt"
    	"github.com/otiai10/gosseract"
    )
    
    func main() {
    	client := gosseract.NewClient()
    	defer client.Close()
    	client.SetLanguage("deu")
    	client.SetImage("test.png")
    	text, _ := client.Text()
    	fmt.Println(text)
    }
    

    Compile command

    env GOOS=freebsd GOARCH=amd64 go build ocrtest.go
    

    Error Message

    # command-line-arguments
    ./ocrtest.go:9:12: undefined: gosseract.NewClient
    
    opened by frankble 8
  • macOS compile linux binary file failed

    Summary

    I'm on macOS and trying to compile a binary for CentOS 7.1. The build fails with the message: undefined: gosseract.NewClient. Thanks.

    Reproducibility

    Reproducibility Frequency

    100%

    Environment

    uname -a
    

    Darwin AllenChen-MacBookPro.local 18.2.0 Darwin Kernel Version 18.2.0: Fri Oct 5 19:41:49 PDT 2018; root:xnu-4903.221.2~2/RELEASE_X86_64 x86_64

    go env
    

    GOARCH="amd64"
    GOBIN="/Users/allen/Documents/go/bin"
    GOCACHE="/Users/allen/Library/Caches/go-build"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GOOS="darwin"
    GOPATH="/Users/allen/Documents/go"
    GOPROXY=""
    GORACE=""
    GOROOT="/usr/local/go"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
    GCCGO="gccgo"
    CC="clang"
    CXX="clang++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/s6/mqbnh2yn52b5glrvhk53jyxh0000gn/T/go-build032873897=/tmp/go-build -gno-record-gcc-switches -fno-common"

    go version
    

    go version go1.11 darwin/amd64

    tesseract --version
    

    tesseract 4.0.0
     leptonica-1.76.0
      libjpeg 9c : libpng 1.6.35 : libtiff 4.0.9 : zlib 1.2.11
     Found AVX
     Found SSE

    opened by czh0318 8
  • tesseract/baseapi.h: No such file or directory

    I'd like to use tesseract with go on Windows 7.

    During the installation process, as stated in the docs I execute

    c:\go\src\proj>go get github.com/otiai10/gosseract
    # github.com/otiai10/gosseract/tesseract
    C:\go\src\github.com\otiai10\gosseract\tesseract\tess.cpp:1:31: fatal error: tesseract/baseapi.h: No such file or directory
     #include <tesseract/baseapi.h>
                                   ^
    compilation terminated.
    

    And by searching the file system for the header file baseapi.h, I cannot find it.

    How can I solve this? Thank you

    question 
    opened by tobiassoltermann 8
  • gosseract finds no text where tesseract does

    Summary

    I am running tesseract and gosseract on the same image, a single line of text. Tesseract finds the text, gosseract does not.

    Reproducibility

    Reproducibility Frequency

    • 100%
    1. Run tesseract d2.pbm - --psm 13 and it will show the output
    2. Run go run main.go and it will not show any output

    go.mod:

    module gosstest
    
    go 1.19
    
    require github.com/otiai10/gosseract/v2 v2.4.0
    

    main.go:

    package main
    
    import (
    	"fmt"
    	"os"
    
    	"github.com/otiai10/gosseract/v2"
    )
    
    func main() {
    	const (
    		want     = "BPJAZGAP"
    		filename = "d2.pbm"
    	)
    	// Read the file named by the constant and stop if it cannot be read.
    	buf, err := os.ReadFile(filename)
    	if err != nil {
    		fmt.Fprintf(os.Stderr, "error reading %q: %v\n", filename, err)
    		os.Exit(1)
    	}
    
    	fmt.Fprintln(os.Stderr, gosseract.Version())
    
    	ocr := gosseract.NewClient()
    	defer ocr.Close()
    	ocr.SetPageSegMode(gosseract.PSM_RAW_LINE) // --psm 13
    	ocr.SetImageFromBytes(buf)
    	got, err := ocr.Text()
    	if err != nil {
    		fmt.Fprintf(os.Stderr, "%v\n", err)
    	}
    
    	if want != got {
    		fmt.Fprintf(os.Stderr, "want %q but got %q", want, got)
    	}
    
    	fmt.Println(got)
    }
    

    d2.pbm:

    P1 42 8
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 1 1 1 0 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 0 0 1 1 0 0 0 1 1 0 0 1 1 1 0 0 0
    0 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 0
    0 1 1 1 0 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0
    0 1 0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 1 0 0 1 0 0 0 1 0 1 1 0 1 1 1 1 0 1 1 1 0 0 0
    0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0 0 0 0 0
    0 1 1 1 0 0 1 0 0 0 0 0 1 1 0 0 1 0 0 1 0 1 1 1 1 0 0 1 1 1 0 1 0 0 1 0 1 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    

    Environment

    Linux chieftec 6.0.15-300.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Dec 21 18:33:23 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
    
    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/home/jot/.cache/go-build"
    GOENV="/home/jot/.config/go/env"
    GOEXE=""
    GOEXPERIMENT=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GOINSECURE=""
    GOMODCACHE="/home/jot/go/pkg/mod"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="linux"
    GOPATH="/home/jot/go"
    [project.zip](https://github.com/otiai10/gosseract/files/10347142/project.zip)
    
    GOPRIVATE=""
    GOPROXY="direct"
    GOROOT="/usr/lib/golang"
    GOSUMDB="off"
    GOTMPDIR=""
    GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_amd64"
    GOVCS=""
    GOVERSION="go1.19.4"
    GCCGO="gccgo"
    GOAMD64="v1"
    AR="ar"
    CC="gcc"
    CXX="g++"
    CGO_ENABLED="1"
    GOMOD="/home/jot/work/gosseract/go.mod"
    GOWORK=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1563688176=/tmp/go-build -gno-record-gcc-switches"
    
    go version go1.19.4 linux/amd64
    
    tesseract 5.2.0
     leptonica-1.82.0
      libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.1.3) : libpng 1.6.37 : libtiff 4.4.0 : zlib 1.2.12 : libwebp 1.2.4
     Found AVX2
     Found AVX
     Found FMA
     Found SSE4.1
    
    opened by jhinrichsen 1
  • CI on Windows

    • https://github.com/otiai10/gosseract/issues/251
    • https://github.com/otiai10/gosseract/issues/200
    • https://github.com/otiai10/gosseract/issues/240
    • https://github.com/otiai10/gosseract/issues/199
    • https://github.com/otiai10/gosseract/issues/132
    • https://github.com/otiai10/gosseract/issues/234
    • https://github.com/otiai10/gosseract/issues/215
    • https://github.com/otiai10/gosseract/issues/233
    • https://github.com/otiai10/gosseract/issues/223
    • https://github.com/otiai10/gosseract/issues/226
    • and more
    opened by otiai10 0
  • Win11 compiler error

    Summary

    Go Compilation Error (in tessbridge.cpp:5): fatal error: leptonica/allheaders.h: No such file or directory

    Reproducibility

    Reproducibility Frequency

    • XX%

    Reproducible Dockerfile

    FROM your-os:your-version
    # Describe how to reproduce your problem
    # on your environment
    

    Otherwise, describe how to reproduce

    1. foo bar
    2. spam ham
    3. hoge fuga

    Environment

    uname -a
    

    Windows 11

    go env
    

    set GO111MODULE=on
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\Administrator\AppData\Local\go-build
    set GOENV=C:\Users\Administrator\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOEXPERIMENT=
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOINSECURE=
    set GOMODCACHE=E:\Go\pkg\mod
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=E:\Go
    set GOPRIVATE=
    set GOPROXY=https://goproxy.cn,direct
    set GOROOT=D:\Program Files\Go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=D:\Program Files\Go\pkg\tool\windows_amd64
    set GOVCS=
    set GOVERSION=go1.18.3
    set GCCGO=gccgo
    set GOAMD64=v1
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=E:\Go\src\ERMS\go.mod
    set GOWORK=
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\ADMINI~1\AppData\Local\Temp\go-build2580212505=/tmp/go-build -gno-record-gcc-switches

    go version
    

    go version go1.18.3 windows/amd64

    tesseract --version
    

    tesseract v5.1.0.20220510
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
     Found AVX512BW
     Found AVX512F
     Found AVX2
     Found AVX
     Found FMA
     Found SSE4.1
     Found libarchive 3.5.0 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5 libzstd/1.4.5
     Found libcurl/7.77.0-DEV Schannel zlib/1.2.11 zstd/1.4.5 libidn2/2.0.4 nghttp2/1.31.0

    opened by zzdboy 0
  • add a finalizer to close the client

    If the developer forgets to call the Close method after creating the client, it will cause a memory leak.

    To avoid this, I followed the approach used by os.File. With a finalizer in place, the Close method is called when the client becomes unreachable and the developer has not called Close.

    Test

    client.go

    // NewClient construct new Client. It's due to caller to Close this client.
    func NewClient() *Client {
    	client := &Client{
    		api:        C.Create(),
    		Variables:  map[SettableVariable]string{},
    		Trim:       true,
    		shouldInit: true,
    		Languages:  []string{"eng"},
    	}
    	// set a finalizer to close the client when it's unused and not closed by the user
    	runtime.SetFinalizer(client, (*Client).Close)
    	return client
    }
    
    // Close frees allocated API. This MUST be called for ANY client constructed by "NewClient" function.
    func (client *Client) Close() (err error) {
    	// defer func() {
    	// 	if e := recover(); e != nil {
    	// 		err = fmt.Errorf("%v", e)
    	// 	}
    	// }()
    	fmt.Println("Closed")
    	C.Clear(client.api)
    	C.Free(client.api)
    	if client.pixImage != nil {
    		C.DestroyPixImage(client.pixImage)
    		client.pixImage = nil
    	}
    	// no need for a finalizer anymore
    	runtime.SetFinalizer(client, nil)
    	return err
    }
    

    test code

    func main() {
    	runGgosseract()
    	runtime.GC() // run a garbage collection
    	time.Sleep(2 * time.Second)
    	// see "Close" before "exit"
    	fmt.Println("exit")
    }
    
    func runGgosseract() {
    	client := gosseract.NewClient()
    	client.SetImage("path/to/image.png")
    	text, _ := client.Text()
    	fmt.Println(text)
    }
    
    opened by yin1999 1
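The finalizer pattern from this pull request can be tried without Tesseract installed. The sketch below uses a hypothetical resource type in place of the client. It also adds a closed flag so that a second Close is a no-op; that guard is an addition for illustration and is not part of the code shown above. Finalizers run at the garbage collector's discretion, so they are a safety net, not a replacement for calling Close.

```go
package main

import (
	"fmt"
	"runtime"
)

// resource is a hypothetical stand-in for a client that owns native memory.
type resource struct {
	closed bool
}

func newResource() *resource {
	r := &resource{}
	// Safety net: release the resource if the caller never calls Close.
	runtime.SetFinalizer(r, func(r *resource) { _ = r.Close() })
	return r
}

// Close releases the resource once; later calls are no-ops.
func (r *resource) Close() error {
	if r.closed {
		return nil
	}
	r.closed = true              // a real client would free native memory here
	runtime.SetFinalizer(r, nil) // the finalizer is no longer needed
	return nil
}

func main() {
	r := newResource()
	fmt.Println(r.Close(), r.closed) // <nil> true
	fmt.Println(r.Close())           // <nil>
}
```

Clearing the finalizer inside Close mirrors the pull request and avoids a second release when the object is later collected.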
  • Fix the docker build to download project source files

    The gosseract source files were not being downloaded during the Docker build process so the go test step was failing. Setting the environment variable fixes the issue and allows correct building of the docker image.

    opened by mrisher23 1
  • failed to initialize TessBaseAPI with code -1:

    Summary

    • I installed gosseract on Windows 10, using MSYS2 with MinGW64 to install the tesseract and leptonica modules, and then installed gosseract with the command 'go get -t github.com/otiai10/gosseract/v2'.

    My test project builds successfully, but at run time it fails with the error: 'failed to initialize TessBaseAPI with code -1:'

    I then went to the install directory in the Go path and ran go test, which gives the same error: all_test.go:144 Expected to be <nil> But actual failed to initialize TessBaseAPI with code -1:

    I am stuck because no message accompanies that code. Please help, thank you!

    Reproducibility

    Reproducibility Frequency

    • 100%

    Reproducible Dockerfile

    FROM your-os:your-version
    # Describe how to reproduce your problem
    # on your environment
    

    Otherwise, describe how to reproduce

    Environment

    win10

    uname -a

    
    

    go env

    set GO111MODULE=auto
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\langxli\AppData\Local\go-build
    set GOENV=C:\Users\langxli\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOEXPERIMENT=
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOINSECURE=
    set GOMODCACHE=C:\Users\langxli\go\pkg\mod
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=C:\Users\langxli\go;D:\work\brick\brick_app_project
    set GOPRIVATE=
    set GOPROXY=https://proxy.golang.org,direct
    set GOROOT=C:\Program Files\Go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
    set GOVCS=
    set GOVERSION=go1.17
    set GCCGO=gccgo
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=C:\Users\langxli\go\pkg\mod\github.com\otiai10\gosseract\[email protected]\go.mod
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=D:\msys64\tmp\go-build2794601340=/tmp/go-build -gno-record-gcc-switches

    go version

    go version go1.17 windows/amd64

    tesseract --version

    tesseract 4.1.1
     leptonica-1.81.1
      libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.0.6) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0

    opened by langxlm 2
Releases(v2.3.1)