hocr

Here are 35 public repositories matching this topic...

UglyToad / PdfPig

Read and extract text and other content from PDFs in C# (port of PDFBox)

pdf csharp pdfbox netstandard pdf-files pdf-document pdf-generation hocr document-analysis pdf-extractor alto-xml page-xml layout-analysis pdf-document-processor

Updated Jul 1, 2024
C#

manisandro / gImageReader

A Gtk/Qt front-end to tesseract-ocr.

c-plus-plus gtk qt ocr scanner tesseract-ocr pdf-document hocr hocr-documents

Updated Jul 4, 2024
C++

mittagessen / kraken

OCR engine for all the languages

ocr neural-networks hocr optical-character-recognition htr handwritten-text-recognition alto-xml page-xml layout-analysis

Updated Jul 3, 2024
Python

BobLd / DocumentLayoutAnalysis

Document Layout Analysis resources repos for development with PdfPig.

pdf csharp hocr tei hocr-documents alto-xml table-extraction page-xml alto layout-analysis document-layout-analysis xycut docstrum pdfpig xy-cut recursive-xy-cut page-segmentation

Updated Oct 1, 2023
C#

UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

ocr validation transformation hocr finereader page-xml alto ocr-d

Updated Jul 4, 2024
JavaScript

filak / hOCR-to-ALTO

Convert between Tesseract hOCR and ALTO XML using XSL stylesheets

hocr xsl alto xslt2 xsl-stylesheets

Updated Jul 4, 2024
XSLT

dbmdz / mirador-textoverlay

Text Overlay plugin for Mirador 3

ocr hocr optical-character-recognition iiif mirador-plugins alto-xml mirador alto mirador-3

Updated Jun 7, 2024
JavaScript

UB-Mannheim / ocr-gt-tools

Ergonomic line-by-line transcription of scanned text.

ocr web-interface hocr transcription ground-truth

Updated Dec 16, 2020
JavaScript

GeReV / hocr-editor-ts

A visual hOCR file editor

ocr tesseract-ocr hocr hocr-documents

Updated Apr 3, 2024
TypeScript

cneud / ocr-conversion

Conversions between various OCR formats

ocr hocr tei-xml alto-xml page-xml abbyy-xml

Updated May 13, 2023

trufanov-nok / tesseract2djvused

A simple Tesseract 3.02+ hOCR to djvused format converter written in Qt

ocr tesseract djvu hocr

Updated Aug 8, 2021
C++

macabeus / pyslibtesseract

✏️ Integration of Tesseract for Python using a shared library

ocr tesseract hocr

Updated Mar 25, 2016
Python

dmi3kno / hocr

Text-to-tibble

r ocr tesseract rstats tesseract-ocr hocr hocr-documents tibble

Updated Apr 25, 2020
R

fakabbir / OCR

Probabilistic Key Value pair extraction using word weights from Invoices - Non Searchable PDF

ocr tesseract python3 hocr

Updated Jun 12, 2021
Python

nuxeo-sandbox / nuxeo-platform-hocr

Perform OCR on images within Nuxeo with Tesseract and hOCR

image ocr text document hocr

Updated Feb 4, 2020
Java

hadro / new-york-city-directories

Some basic data and text extraction from the New York City Directories

ocr brooklyn digital-humanities hocr pdfs manhattan nypl new-york-city-directories

Updated Jun 19, 2017

jlieth / hocr-parser

Python parser for hOCR files using lxml

python ocr hocr parsing-library hocr-documents

Updated Aug 23, 2020
Python

ZeinabTaghavi / Handwriting_Manuscript_Line_and_Segment_Setection_Then_Storage

python opencv histogram projection segmentation manuscript hocr handwriting line-detection segment-detection

Updated Dec 8, 2019
Python

GeReV / HocrEditor

A visual editor for .hocr files.

ocr tesseract-ocr hocr hocr-documents

Updated Nov 18, 2022
C#

mikeduglas / tesseract

tesseract OCR for Clarion

ocr tesseract hocr clarion

Updated Jan 19, 2024
Clarion
