OCR-D-compliant page segmentation

ocrd_segment

This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation.

Installation

In your virtual environment, run:

pip install .

Usage

  • exporting page images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with region polygon coordinates and metadata, also MS-COCO:
  • exporting region images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with region polygon coordinates and metadata:
  • exporting line images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with line polygon coordinates and metadata:
  • importing layout segmentations from other formats (mask images, MS-COCO JSON annotation):
  • repairing layout segmentations (input file groups N >= 1, based on heuristics implemented using Shapely):
  • comparing different layout segmentations (input file groups N = 2, compute the distance between two segmentations, e.g. automatic vs. manual):
  • pattern-based segmentation (input file groups N=1, based on a PAGE template, e.g. from Aletheia, and some XSLT or Python to apply it to the input file group)
    • ocrd-segment-via-template 🚧 (unpublished)
  • data-driven segmentation (input file groups N=1, based on a statistical model, e.g. Neural Network)
    • ocrd-segment-via-model 🚧 (unpublished)

For a detailed description of input/output and parameters, see ocrd-tool.json.

Testing

None yet.

Comments
  • Processor segment-repair ends with Exception

    The processor 'segment-repair' ends with the exception "Exception: ocrd-segment-repair exited with non-zero return value 1" if it comes after processor 'cis-ocropy-segment' in the workflow.

    In a modified workflow, where processor 'cis-ocropy-segment' is replaced by processor 'tesserocr-segment-line', the processing runs.

    opened by j-panzer 7
  • Conversion Error

    When using ocrd-segment-repair, I encountered the following error:

    10:30:48.758 INFO processor.RepairSegmentation - Sanitizing region "region0071"
    /home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
      out=out, **kwargs)
    /home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
      ret = ret.dtype.type(ret / rcount)
    Traceback (most recent call last):
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/bin/ocrd-segment-repair", line 8, in <module>
        sys.exit(ocrd_segment_repair())
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 764, in __call__
        return self.main(*args, **kwargs)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 717, in main
        rv = self.invoke(ctx)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
        return callback(*args, **kwargs)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/cli.py", line 13, in ocrd_segment_repair
        return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd/decorators.py", line 60, in ocrd_cli_wrap_processor
        run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd/processor/base.py", line 57, in run_processor
        processor.process()
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/repair.py", line 88, in process
        self.sanitize_page(page, page_id)
      File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/repair.py", line 202, in sanitize_page
        scale = int(np.median(np.array(heights)))
    ValueError: cannot convert float NaN to integer
    

    The region in question belongs to a newspaper digitization.

    It is possible to work around this at line 202 in repair.py (see above) with a check like

                _median = np.median(np.array(heights))
                if not np.isnan(_median):
                    scale = int(_median)
                else:
                    scale = 1
    

    which finally yields at the same place

    10:48:47.496 INFO processor.RepairSegmentation - Sanitizing region "region0071"
    /home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
      out=out, **kwargs)
    /home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
      ret = ret.dtype.type(ret / rcount)
    10:48:47.699 WARNING processor.RepairSegmentation - Zero contour area in region "region0071"
    
    

    but this way the processing moves on.
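
    The guard above can also be written so that no median is ever computed over an empty list, which avoids the NumPy warnings as well. The following is a minimal sketch in plain Python, assuming `heights` is a list of line heights. The function name `safe_scale` and the fallback value of 1 are illustrative and not taken from ocrd_segment:

```python
import math
from statistics import median


def safe_scale(heights, default=1):
    """Return the integer median of `heights`, or `default` if none is usable."""
    # Drop NaN entries so a single bad measurement cannot poison the median.
    usable = [h for h in heights if not math.isnan(h)]
    if not usable:
        # Empty input: statistics.median would raise, np.median would yield NaN.
        return default
    return int(median(usable))


print(safe_scale([]))            # empty region, falls back to the default
print(safe_scale([12, 15, 14]))  # ordinary case
```

    Whether 1 is a sensible fallback scale depends on how the value is used further down in sanitize_page; skipping the region entirely may be the better choice.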

    opened by M3ssman 5
  • change default output filegroup for `ocrd-segment-replace-original`

    The default output filegroup for ocrd-segment-replace-original is set to OCR-D-IMG-CROP, which already exists in the majority of METS files. It would be great to change the default value, so that users are not forced to specify it themselves (for the other processors this is purely optional). @kba suggested that changing the default value might actually not be necessary if we could drop the rule for two output filegroups for this processor as well.

    opened by EEngl52 4
  • sanitize: stay on page image/array

    Fixes #21 – it was not correct to use the region image/array here, because that depends on the bounding box of the region, which can be too small.

    One case not covered by this is when TextLine coordinates extend beyond the page Border.

    opened by bertsky 4
  • Expand regions via repair/sanitize

    Before sanitization: image

    Regions are often too small and do not span the lines they (should) contain.

    After sanitization: image

    The situation is not much better, although

    $ ocrd-segment-repair -J
    ...
    "sanitize": {
       "type": "boolean",
       "default": false,
       "description": "Shrink and/or expand a region in such a way that it coordinates include those of all its lines"
      }
    ...
    

    Expansion does not work or is incomplete.

    bug 
    opened by wrznr 4
  • Add the basic project layout and minimal functionality

    This is supposed to be an OCR-D processor which someday will give plausibility feedback on a page's segmentation. It uses https://pypi.org/project/Shapely/ as proposed by @bertsky.

    opened by wrznr 4
  • ocrd-segment-extract-lines ignores lines with "\n" in

    I have used ocrd-segment-extract-lines with a PAGE file in which some <TextLine> elements have a "\n" in the <Unicode> element. Unfortunately, for these lines the extraction is not done.

    This example works ok:

        <pc:TextEquiv>
          <pc:Unicode>1889</pc:Unicode>
        </pc:TextEquiv>

    This example does not work:

        <pc:TextEquiv>
          <pc:Unicode>1889
          </pc:Unicode>
        </pc:TextEquiv>

    ==> please clarify ...
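
    Whether the processor should accept such lines is the open question here. For illustration only, the sketch below shows how the two snippets differ once parsed, and that stripping the surrounding whitespace makes them equal. It uses Python's standard xml.etree.ElementTree rather than the OCR-D API, and the namespace URI (PAGE version 2019-07-15) and the helper name `line_text` are assumptions:

```python
import xml.etree.ElementTree as ET

# Assumed namespace (PAGE 2019-07-15); adjust to the schema version of your files.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}

TEMPLATE = (
    '<pc:TextLine xmlns:pc="{ns}" id="l1">'
    "<pc:TextEquiv><pc:Unicode>{text}</pc:Unicode></pc:TextEquiv>"
    "</pc:TextLine>"
)


def line_text(xml_string):
    """Return the raw and the whitespace-stripped text of a TextLine snippet."""
    line = ET.fromstring(xml_string)
    raw = line.findtext("pc:TextEquiv/pc:Unicode", default="", namespaces=NS)
    return raw, raw.strip()


ok = TEMPLATE.format(ns=NS["pc"], text="1889")
broken = TEMPLATE.format(ns=NS["pc"], text="1889\n      ")

print(line_text(ok))      # raw and stripped text are identical
print(line_text(broken))  # raw text keeps the trailing newline and indentation
```
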
    opened by stefanCCS 3
  • Processor ocrd-segment-repair exits with exception

    Log output:

    12:06:01.982 INFO ocrd.task_sequence.run_tasks - Start processing task 'segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -p '{"plausibilize": true, "sanitize": false, "plausibilize_merge_min_overlap": 0.9}''
    Traceback (most recent call last):
      File "/venv-20200919/bin/ocrd", line 8, in <module>
        sys.exit(cli())
      File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/venv-20200919/lib/python3.7/site-packages/ocrd/cli/process.py", line 28, in process_cli
        run_tasks(mets, log_level, page_id, tasks, overwrite)
      File "/venv-20200919/lib/python3.7/site-packages/ocrd/task_sequence.py", line 149, in run_tasks
        raise Exception("%s exited with non-zero return value %s. STDOUT:\n%s\nSTDERR:\n%s" % (task.executable, returncode, out, err))
    Exception: ocrd-segment-repair exited with non-zero return value 1. STDOUT:
    
    STDERR:
    12:06:02.420 INFO processor.RepairSegmentation - INPUT FILE 0 / PHYS_0001
    12:06:02.423 INFO ocrd.page_validator - Validating input file 'FILE_0001_OCR-D-SEG-REG'
    12:06:02.439 INFO processor.RepairSegmentation - INPUT FILE 1 / PHYS_0002
    12:06:02.440 INFO ocrd.page_validator - Validating input file 'FILE_0002_OCR-D-SEG-REG'
    Traceback (most recent call last):
      File "/venv-20200919/local/sub-venv/headless-tf1/bin/ocrd-segment-repair", line 8, in <module>
        sys.exit(ocrd_segment_repair())
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/cli.py", line 16, in ocrd_segment_repair
        return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/decorators.py", line 102, in ocrd_cli_wrap_processor
        run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
        processor.process()
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 94, in process
        parents = list(set([region.parent_object_ for region in page.get_AllRegions(classes=['Text'])]))
      File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_models/ocrd_page_generateds.py", line 2905, in __hash__
        return hash(self.id)
    AttributeError: 'PageType' object has no attribute 'id'
    
    opened by stweil 3
  • Repair fix coords

    This attempts to fix problems caused by invalid polygons from ocrd-segment-repair (both in sanitize and plausibilize mode).

    This taught me another lesson about what can go wrong with Shapely / numpy / PAGE interaction. To sum up:

    1. Ensuring valid polygons on the input side (e.g. from OpenCV) is always necessary. The only generic way I can think of is to feed them through simplify with ever increasing tolerance until valid. EDIT2 The problem is that the result of the algorithm implemented in Shapely/GEOS depends on the starting point it picked. In pathological cases, no simplification whatsoever can be achieved. (The only thing that then helps is re-ordering...)
    2. Operations like union or intersection can create collections of shapes. EDIT There are actually 2 cases here:
      1. homogeneous (MultiPolygon) – a discontiguous collection of Polygon – in which case one needs the convex hull.
      2. heterogeneous (GeometryCollection) – a collection of Polygon with Point or LineString – in which case one needs to filter out those shapes which have no intrinsic area (and then check again for the other cases)
    3. Operations like union or intersection can create non-integer points, which when rounded for PAGE serialization can become invalid paths. Unfortunately, Shapely always calculates in floating point internally. So all we can do is rounding and then ensuring validity (as in 1).
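
    Point 3 can be illustrated without Shapely. The sketch below rounds floating-point coordinates to integers, drops the consecutive duplicates that rounding creates, and uses the shoelace formula to detect paths that have collapsed to zero area. It is only a cheap pre-check: the helper names are hypothetical, and it does not detect self-intersections, for which a full validity test such as Shapely's is_valid is still needed.

```python
def round_path(points):
    """Round float points to integers and drop consecutive duplicates."""
    rounded = []
    for x, y in points:
        point = (int(round(x)), int(round(y)))
        if not rounded or rounded[-1] != point:
            rounded.append(point)
    # A closed path must not repeat its first point at the end either.
    if len(rounded) > 1 and rounded[0] == rounded[-1]:
        rounded.pop()
    return rounded


def shoelace_area(points):
    """Absolute polygon area via the shoelace formula."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def is_degenerate(points):
    """True if the path cannot describe a region with area."""
    return len(points) < 3 or shoelace_area(points) == 0


# A sliver as an intersection might produce it: it has area in floating point,
sliver = [(10.2, 5.4), (30.7, 5.3), (30.6, 5.45), (10.3, 5.49)]
print(is_degenerate(sliver))
# but it collapses onto a single row of pixels after rounding.
print(is_degenerate(round_path(sliver)))
```
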

    Related:

    • https://github.com/cisocrgroup/ocrd_cis/issues/67
    • https://github.com/cisocrgroup/ocrd_cis/issues/62
    • https://github.com/OCR-D/ocrd_tesserocr/issues/149
    • https://github.com/OCR-D/ocrd_tesserocr/issues/151
    opened by bertsky 3
  • Update README, only announce features that are actually provided

    The README announces ocrd-segment-via-template and ocrd-segment-via-model – neither of which is actually provided by this package. It does provide some ocrd-segment-extract-* features; these do not do any segmentation though (or I could not find out how).

    opened by dariok 3
  • cannot upload to pypi anymore

    The change https://github.com/OCR-D/ocrd_segment/commit/c8756272caf900febe7166f8bed5d20713f002cf depends on https://github.com/ppwwyyxx/cocoapi/pull/7, an addition of mine to the current pycocotools version 2.0.3 on PyPI. Such git URL references are allowed in requirements.txt / setuptools, but the PyPI server refuses to accept such builds:

    Invalid value for requires_dist. Error: Can't have direct dependency: 'pycocotools @ git+https://github.com/bertsky/pycocotools#subdirectory=PythonAPI'
    

    @kba, do you know what to do under such circumstances?

    opened by bertsky 2
  • Build fails for MacOS (ocrd-fork-pycocotools)

    Running make all for ocrd_all or pip install . for ocrd_segment fails on MacOS with Homebrew:

          Compiling pycocotools/_mask.pyx because it changed.
          [1/1] Cythonizing pycocotools/_mask.pyx
          /private/var/folders/wf/g2hmm5bd72v2r_p0r1smct_00000gn/T/pip-install-0xp4jh31/ocrd-fork-pycocotools_7b0159a305264f708a622a0e4daa80bd/.eggs/Cython-3.0.0a11-py3.9.egg/Cython/Compiler/Main.py:345: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /private/var/folders/wf/g2hmm5bd72v2r_p0r1smct_00000gn/T/pip-install-0xp4jh31/ocrd-fork-pycocotools_7b0159a305264f708a622a0e4daa80bd/pycocotools/_mask.pyx
            tree = Parsing.p_module(s, pxd, full_module_name)
          building 'pycocotools._mask' extension
          creating build/common
          creating build/temp.macosx-12-arm64-cpython-39
          creating build/temp.macosx-12-arm64-cpython-39/common
          creating build/temp.macosx-12-arm64-cpython-39/pycocotools
          clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -I/OCR-D/venv-20221112/lib/python3.9/site-packages/numpy/core/include -I./common -I/OCR-D/venv-20221112/include -I/opt/homebrew/opt/python@3.9/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c ../common/maskApi.c -o build/temp.macosx-12-arm64-cpython-39/../common/maskApi.o -Wno-cpp -Wno-unused-function -std=c99
          clang: error: no such file or directory: '../common/maskApi.c'
          clang: error: no input files
          error: command '/usr/bin/clang' failed with exit code 1
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
      ERROR: Failed building wheel for ocrd-fork-pycocotools
    
    opened by stweil 7
  • Error in shapely/ocrd_segment

    The segment-repair processor in the following workflow:

    ocrd process \
    "olena-binarize -I OCR-D-IMG -O OCR-D-BIN -P impl sauvola" \
    "anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP" \
    "olena-binarize -I OCR-D-CROP -O OCR-D-BIN2 -P impl kim" \
    "cis-ocropy-denoise -I OCR-D-BIN2 -O OCR-D-BIN-DENOISE -P level-of-operation page" \
    "cis-ocropy-deskew -I OCR-D-BIN-DENOISE -O OCR-D-BIN-DENOISE-DESKEW -P level-of-operation page" \
    "tesserocr-segment-region -I OCR-D-BIN-DENOISE-DESKEW -O OCR-D-SEG-REG" \
    "segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -P plausibilize true" \
    "cis-ocropy-deskew -I OCR-D-SEG-REPAIR -O OCR-D-SEG-REG-DESKEW -P level-of-operation region" \
    "cis-ocropy-clip -I OCR-D-SEG-REG-DESKEW -O OCR-D-SEG-REG-DESKEW-CLIP -P level-of-operation region" \
    "tesserocr-segment-line -I OCR-D-SEG-REG-DESKEW-CLIP -O OCR-D-SEG-LINE" \
    "segment-repair -I OCR-D-SEG-LINE -O OCR-D-SEG-REPAIR-LINE -P sanitize true" \
    "cis-ocropy-dewarp -I OCR-D-SEG-REPAIR-LINE -O OCR-D-SEG-LINE-RESEG-DEWARP" \
    "calamari-recognize -I OCR-D-SEG-LINE-RESEG-DEWARP -O OCR-D-OCR -P checkpoint_dir qurator-gt4histocr-1.0"
    

    executed on the DEFAULT file group inside this workspace: https://content.staatsbibliothek-berlin.de/dc/PPN631277528.mets.xml

    produces the following error:

      12:45:52.522 INFO processor.RepairSegmentation - INPUT FILE 0 / PHYS_0001
      12:45:52.524 INFO ocrd.page_validator.validate - Validating input file 'FILE_0001_OCR-D-SEG-LINE'
      12:45:52.652 INFO processor.RepairSegmentation - INPUT FILE 1 / PHYS_0002
      12:45:52.654 INFO ocrd.page_validator.validate - Validating input file 'FILE_0002_OCR-D-SEG-LINE'
      12:45:52.776 INFO processor.RepairSegmentation - INPUT FILE 2 / PHYS_0003
      12:45:52.777 INFO ocrd.page_validator.validate - Validating input file 'FILE_0003_OCR-D-SEG-LINE'
      12:45:52.912 INFO processor.RepairSegmentation - INPUT FILE 3 / PHYS_0004
      12:45:52.914 INFO ocrd.page_validator.validate - Validating input file 'FILE_0004_OCR-D-SEG-LINE'
      12:45:53.017 INFO processor.RepairSegmentation - INPUT FILE 4 / PHYS_0005
      12:45:53.019 INFO ocrd.page_validator.validate - Validating input file 'FILE_0005_OCR-D-SEG-LINE'
      12:45:53.026 WARNING processor.RepairSegmentation - Fixed CoordinateValidityError for SeparatorRegion 'region0011'
      12:45:53.027 WARNING processor.RepairSegmentation - Fixed CoordinateValidityError for SeparatorRegion 'region0012'
      12:45:53.119 WARNING processor.RepairSegmentation - Zero contour area in region "region0000"
      12:45:53.730 WARNING processor.RepairSegmentation - Zero contour area in region "region0011"
      12:45:53.734 WARNING processor.RepairSegmentation - Zero contour area in region "region0012"
      12:45:54.609 INFO processor.RepairSegmentation - INPUT FILE 5 / PHYS_0006
      12:45:54.610 INFO ocrd.page_validator.validate - Validating input file 'FILE_0006_OCR-D-SEG-LINE'
      12:45:54.708 INFO processor.RepairSegmentation - INPUT FILE 6 / PHYS_0007
      12:45:54.710 INFO ocrd.page_validator.validate - Validating input file 'FILE_0007_OCR-D-SEG-LINE'
      12:45:54.812 WARNING processor.RepairSegmentation - Zero contour area in region "region0003"
      12:45:55.186 ERROR shapely.geos - TopologyException: side location conflict at 262 1071. This can occur if the input geometry is invalid.
      Traceback (most recent call last):
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/bin/ocrd-segment-repair", line 8, in <module>
          sys.exit(ocrd_segment_repair())
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
          return self.main(*args, **kwargs)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1055, in main
          rv = self.invoke(ctx)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
          return ctx.invoke(self.callback, **ctx.params)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 760, in invoke
          return __callback(*args, **kwargs)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/cli.py", line 21, in ocrd_segment_repair
          return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/decorators/__init__.py", line 108, in ocrd_cli_wrap_processor
          run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
          processor.process()
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 188, in process
          padding=self.parameter['sanitize_padding'])
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 559, in shrink_regions
          if len(contour) >= 3], scale=scale)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/project.py", line 179, in join_polygons
          jointp = unary_union(polygons)
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/shapely/ops.py", line 161, in unary_union
          return geom_factory(lgeos.methods['unary_union'](collection))
        File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/shapely/geometry/base.py", line 73, in geom_factory
          raise ValueError("No Shapely geometry can be created from null value")
      ValueError: No Shapely geometry can be created from null value
    

    This is the input image: FILE_0007_DEFAULT

    opened by MehmedGIT 0
  • ocrd-segment-extract-lines - Lines are not extracted, in case they are in an area of other lines

    Hi, I think I have found a bug in ocrd-segment-extract-lines. I cannot prove it 100%, but in my environment I see that a line is not extracted (no image is created) when it lies graphically (with respect to its coordinates) within another line of the same region. I extract only images in this case, using this command:

    ocrd-segment-extract-lines -I $infolder -O $extractLineImagesFolder  -P  output-types '[]' -P min-line-length 0 -P min-line-width 5 -P min-line-height 5
    

    Page-Extract: Here the line TR-15_line0002 was not extracted:

        <pc:TextRegion id="TR-15" orientation="0.">
          <pc:AlternativeImage filename="OCR-D-REG-VL-BL/OCR-D-REG-VL-BL_4749_007817786_00183_TR-15.IMG-DESKEW.png" comments=",binarized,deskewed,verticallinesremoved" />
          <pc:Coords points="237,383 237,438 443,438 443,383" />
          <pc:TextLine id="TR-15_line0001">
            <pc:Coords points="237,438 237,383 239,383 253,391 311,391 320,383 349,383 357,390 365,383 384,383 402,391 419,383 427,383 430,418 428,438 302,438 298,435 289,435 284,438" />
            <pc:Baseline points="227,415 430,418" />
          </pc:TextLine>
          <pc:TextLine id="TR-15_line0003">
            <pc:Coords points="261,438 269,433 274,433 295,438" />
            <pc:Baseline points="254,475 295,475" />
          </pc:TextLine>
          <pc:TextLine id="TR-15_line0002">
            <pc:Coords points="385,438 388,435 388,434 409,434 409,438" />
            <pc:Baseline points="343,478 412,475" />
          </pc:TextLine>
        </pc:TextRegion>
    

    Logfile content for this case:

    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.189 WARNING processor.ExtractLines - Line 'TR-14_line0001' contains no text content
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.201 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-14_TR-14_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-14_TR-14_line0001.bin.png
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.242 WARNING processor.ExtractLines - Line 'TR-15_line0001' contains no text content
    2022-08-11_14-21-13-extractlines.log:2022-08-11 14:21:31.255 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0001.bin.png
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.256 WARNING processor.ExtractLines - Line 'TR-15_line0003' contains no text content
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.267 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0003.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0003.bin.png
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.268 WARNING processor.ExtractLines - Line 'TR-15_line0002' contains no text content
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.311 WARNING processor.ExtractLines - Line 'TR-16_line0001' contains no text content
    2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.348 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-16_TR-16_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-16_TR-16_line0001.bin.png
    
    
    opened by stefanCCS 5
  •  ocrd-segment-repair: handle case where points is empty

    Version 0.1.20, ocrd/core 2.33.0

    I have a PAGE file, which does not have any real content - like this:

        <pc:Page imageFilename="OCR-D-IMG/0038_IMAGE000918_00001.tif" imageWidth="1420" imageHeight="2313" orientation="0.">
            <pc:AlternativeImage filename="OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png" comments=",binarized"/>
            <pc:TextRegion id="TR-1" orientation="0.">
                <pc:Coords points=""/>
            </pc:TextRegion>
        </pc:Page>
    

    If I call ocrd-segment-extract-lines, I get an exception like this:

    09:19:19.733 DEBUG ocrd.workspace.image_from_page - page 'P_0038_IMAGE000918_00001' has  orientation=0 skew=0.00
    09:19:19.733 DEBUG ocrd.workspace.image_from_page - Using AlternativeImage 1 {'', 'binarized'} for page 'P_0038_IMAGE000918_00001'
    09:19:19.734 DEBUG ocrd.workspace.download_file - download_file <OcrdFile fileGrp=OCR-D-BIN ID=OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN, mimetype=image/png, url=OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png, local_filename=OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png]/>  [_recursion_count=0]
    09:19:19.735 DEBUG PIL.PngImagePlugin - STREAM b'IHDR' 16 13
    09:19:19.735 DEBUG PIL.PngImagePlugin - STREAM b'IDAT' 41 65536
    Traceback (most recent call last):
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/bin/ocrd-segment-extract-lines", line 8, in <module>
        sys.exit(ocrd_segment_extract_lines())
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_segment/cli.py", line 65, in ocrd_segment_extract_lines
        return ocrd_cli_wrap_processor(ExtractLines, *args, **kwargs)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/decorators/__init__.py", line 88, in ocrd_cli_wrap_processor
        run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
        processor.process()
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_segment/extract_lines.py", line 171, in process
        transparency=self.parameter['transparency'])
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/workspace.py", line 829, in image_from_segment
        fill=fill, transparency=transparency)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/workspace.py", line 1012, in _crop
        segment_polygon = coordinates_of_segment(segment, parent_image, parent_coords)
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_utils/image.py", line 136, in coordinates_of_segment
        polygon = np.array(polygon_from_points(segment.get_Coords().points))
      File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_utils/image.py", line 148, in polygon_from_points
        polygon.append([float(x_y[0]), float(x_y[1])])
    ValueError: could not convert string to float: 
    
    

    My expectation would be that this PAGE file is simply ignored. --> please clarify ...
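
    The traceback ends in polygon_from_points, which fails on the empty points attribute. As a sketch of the requested behaviour (skip instead of crash), the following standalone parser returns an empty list for empty or too-short input. The function name `parse_points` and the minimum of three points are assumptions for illustration; this is not the ocrd_utils implementation.

```python
def parse_points(points, min_points=3):
    """Parse a PAGE points string like "1,2 3,4 5,6" into [[x, y], ...].

    Returns an empty list if the string is empty or has fewer than
    `min_points` points, so that callers can skip the segment.
    """
    polygon = []
    for pair in points.split():
        x, y = pair.split(",")
        polygon.append([float(x), float(y)])
    if len(polygon) < min_points:
        return []
    return polygon


for value in ["", "237,383 237,438 443,438 443,383"]:
    polygon = parse_points(value)
    if not polygon:
        print("skipping segment with unusable coordinates: %r" % value)
    else:
        print("polygon with %d points" % len(polygon))
```
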

    opened by stefanCCS 6
  • evaluate: explain/document metrics

    If I understand correctly, the idea behind these metrics is taken from the "Rethinking Semantic Segmentation Evaluation" paper, but could you explain how I could obtain AP, TPs, FPs and FNs for an instance segmentation task?

    Originally posted by @andreaceruti in https://github.com/cocodataset/cocoapi/issues/564#issuecomment-1064223428

    opened by bertsky 1
  • evaluate: false redundant matches if overlaps occur on any side already

    The multi-match overlap algorithm (necessary to calculate over- and undersegmentation) still has a glitch: it will create fake/redundant pairings if either side has a segmentation that already overlaps locally. For example, take a page with a GraphicRegion overlapping multiple TextRegions, and evaluate that against itself: the matching will not only produce the 1:1 pairs, but also other matches. That's probably not what we want.
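
    The effect can be reproduced with a toy matcher. The sketch below pairs every two segments whose intersection is non-empty, using axis-aligned boxes instead of polygons. This is an assumption made for illustration and not the matching code in ocrd_segment. Evaluating a page against itself then yields more pairs than regions as soon as two regions overlap.

```python
def intersection_area(a, b):
    """Area shared by two boxes given as (x0, y0, x1, y1)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def match_all(left, right):
    """Pair up every left/right segment with a non-empty intersection."""
    return [(l_id, r_id)
            for l_id, l_box in left.items()
            for r_id, r_box in right.items()
            if intersection_area(l_box, r_box) > 0]


page = {
    "graphic": (0, 0, 100, 100),   # GraphicRegion overlapping both text regions
    "text1": (10, 10, 40, 30),
    "text2": (10, 50, 40, 70),
}

pairs = match_all(page, page)
identity = [(a, b) for a, b in pairs if a == b]
extra = [(a, b) for a, b in pairs if a != b]
print(len(identity), "1:1 pairs,", len(extra), "additional pairs")
```
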

    opened by bertsky 0
Releases(v0.1.21)
  • v0.1.21(May 27, 2022)

  • v0.1.20(May 27, 2022)

  • v0.1.19(May 27, 2022)

    Changed:

    • repair (sanitize): run on all region types
    • repair (sanitize): add parameter sanitize_padding
    • repair (sanitize): use binary foreground instead of text line coordinates
    • repair (plausibilize): use true alpha shape instead of convex hull
    • project: add level-of-operation=table
    • repair: add option simplify
    • ensure compatibility with Shapely 1.8
  • v0.1.18(Mar 30, 2022)

  • v0.1.17(Mar 30, 2022)

  • v0.1.16(Feb 21, 2022)

  • v0.1.15(Feb 17, 2022)

    Changed:

    • repair: plausibilize: both analyse & apply iff enabled
    • extract-lines: add parameters for output types and conditions for line extraction
    • extract-lines: add xlsx output option for GT editing
  • v0.1.14(Feb 17, 2022)

    Changed:

    • repair: for non-trivial region overlaps, recurse to line level
    • repair: for non-trivial line overlaps, merge (if centric) or subtract
  • v0.1.13(Dec 10, 2021)

    Fixed:

    • evaluate: multi-matching (without pycocotools)

    Changed:

    • evaluate: improved report format (hierarchy and names)

    Added:

    • evaluate: over-/undersegmentation metrics, pixel-wise metrics
  • v0.1.12(Dec 2, 2021)

  • v0.1.11(Mar 23, 2021)

  • v0.1.10(Feb 26, 2021)

    Fixed:

    • extract-regions: apply feature_filter param

    Changed:

    • extract-pages: add feature_filter param
    • extract-pages: add order choice for plot_segmasks
  • v0.1.9(Feb 26, 2021)

  • v0.1.8(Feb 8, 2021)

    Fixed:

    • replace-page: getLogger context

    Changed:

    • extract-words: new
    • extract-glyphs: new
    • extract-pages: expose colordict parameter (w/ same default)
    • extract-pages: multi-level mask output via plot_segmasks
  • v0.1.7(Jan 7, 2021)

  • v0.1.6(Nov 25, 2020)

    Fixed:

    • repair: also fix negative coords, also on page level
    • replace-original: also remove page border/@orientation
    • replace-original: add new original as derived image, too
  • v0.1.5(Nov 4, 2020)

    Fixed:

    • evaluate: adapt to zip_input_files in core

    Changed:

    • replace-original: delegate to repair.ensure_consistent
    • replace-page: new CLI (inverse of replace-original)
  • v0.1.4(Nov 4, 2020)

  • v0.1.3(Sep 24, 2020)

  • v0.1.2(Sep 24, 2020)

  • v0.1.1(Sep 24, 2020)

    Changed:

    • repair: traverse all text regions recursively

    Fixed:

    • repair: be robust against invalid input polygons
    • repair: be careful to make valid output polygons
  • v0.1.0(Aug 21, 2020)

    Changed:

    • adapt to 1-output-file-group convention, use make_file_id and assert_file_grp_cardinality, #41

    Fixed:

    • typo in extract_lines, #40
Owner
OCR-D
DFG coordination project for the further development of Optical Character Recognition methods
Smart computer vision application

Smart-computer-vision-application Backend : opencv and python Library required:

2 Jan 31, 2022
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
Select range and every time the screen changes, OCR is activated.

ASOCR (Auto Screen OCR) Select a range; OCR is then activated every time you press the Space key and every time the screen changes. usage1: simple OC

1 Feb 13, 2022
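[Auto] Genshin Impact ⭐ fishing assistant tool | automatic reeling, cursor calibration | ✨ You only need to cast the rod, and we will do the rest for you ✨

Genshin Impact fishing assistant tool ✨ The author is working hard on refactoring the code... and will bring everyone a more polished script as soon as possible ✨ "You only need to cast the rod, and we will take care of everything for you" If you find this script useful, please give it a Star ⭐; your Star is the author's biggest motivation to keep updating. Click here to watch the demo video ✨ Everyone is welcome to share their configuration files in Issues ✨ ✨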

261 Jan 02, 2023
STEFANN: Scene Text Editor using Font Adaptive Neural Network

STEFANN: Scene Text Editor using Font Adaptive Neural Network @ The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.

Prasun Roy 208 Dec 11, 2022
An OCR evaluation tool

dinglehopper dinglehopper is an OCR evaluation tool and reads ALTO, PAGE and text files. It compares a ground truth (GT) document page with an OCR resu

QURATOR-SPK 40 Dec 20, 2022
Crop regions in napari manually

napari-crop Crop regions in napari manually Usage Create a new shapes layer to annotate the region you would like to crop: Use the rectangle tool to a

Robert Haase 4 Sep 29, 2022
TextBoxes: A Fast Text Detector with a Single Deep Neural Network https://github.com/MhLiao/TextBoxes A text detection algorithm improved from SSD; textBoxes_note records previously compiled notes.

TextBoxes: A Fast Text Detector with a Single Deep Neural Network Introduction This paper presents an end-to-end trainable fast scene text detector, n

zhangjing1 24 Apr 28, 2022
A tool to enhance your old/damaged pictures built using python & opencv.

Breathe Life into your Old Pictures Table of Contents About The Project Getting Started Prerequisites Usage Contact Acknowledgments About The Project

Shah Anwaar Khalid 5 Dec 16, 2021
[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks This is an official PyTorch code repository of the paper "Cloud Transformers:

Visual Understanding Lab @ Samsung AI Center Moscow 27 Dec 15, 2022
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval (arXiv) Repository to contain the code, models, data for end-to-end

225 Dec 25, 2022
Multi-choice answer sheet correction system using computer vision with opencv & python.

Multi choice answer correction 🔴 5 answer sheet samples with a specific solution for detecting answers and sheet correction. 🔴 By running the soluti

Reza Firouzi 7 Mar 07, 2022
A fastai/PyTorch package for unpaired image-to-image translation.

Unpaired image-to-image translation A fastai/PyTorch package for unpaired image-to-image translation currently with CycleGAN implementation. This is a

Tanishq Abraham 120 Dec 02, 2022
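Keras reimplementation of the scene text detection network CPTN: "Detecting Text in Natural Image with Connectionist Text Proposal Network"; you are welcome to try it, follow it, and report issues...

keras-ctpn [TOC] Description Prediction Training Examples 4.1 ICDAR2015 4.1.1 With side refinement 4.1.2 Without side refinement 4.1.3 With data augmentation (horizontal flip) 4.2 ICDAR2017 4.3 Other datasets toDoList Summary Description This project is a Keras implementation of CPTN: Detecti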

mick.yi 107 Jan 09, 2023
A curated list of promising OCR resources

Call for contributor(paper summary,dataset generation,algorithm implementation and any other useful resources) awesome-ocr A curated list of promising

wanghaisheng 1.6k Jan 04, 2023
Make OpenCV camera loops less of a chore by skipping the boilerplate and getting right to the interesting stuff

camloop Forget the boilerplate from OpenCV camera loops and get to coding the interesting stuff Table of Contents Usage Install Quickstart More advanc

Gabriel Lefundes 9 Nov 12, 2021
The papers published in top-tier AI conferences in recent years.

AI-conference-papers The papers published in top-tier AI conferences in recent years. Paper table AAAI ICLR CVPR ICML ICCV ECCV NIPS 2019 ✔️ ✔️ ✔️ ✔️

Jinbae Park 6 Dec 09, 2022
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.

doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex

Joe Sutherland 1.3k Jan 04, 2023
Primary QPDF source code and documentation

QPDF QPDF is a command-line tool and C++ library that performs content-preserving transformations on PDF files. It supports linearization, encryption,

QPDF 2.2k Jan 04, 2023
Morphological edge detection or object boundary detection using erosion and dilation in OpenCV Python

Morphologycal-edge-detection-using-erosion-and-dialation The task is to detect object boundaries using erosion or dilation. Here, use the kernel or st

Tamzid hasan 3 Nov 25, 2022