Here are 35 public repositories matching this topic...
A Gtk/Qt front-end to tesseract-ocr.
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Updated Jul 4, 2024 · JavaScript
OCR engine for all the languages
Updated Jul 3, 2024 · Python
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
Updated Jul 2, 2024 · Python
Read and extract text and other content from PDFs in C# (port of PDFBox)
Text Overlay plugin for Mirador 3
Updated Jun 7, 2024 · JavaScript
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
A visual hOCR file editor
Updated Apr 3, 2024 · TypeScript
TIFF Image - Converted into OCR XML using Tesseract
Updated Mar 9, 2024 · Python
Tesseract OCR for Clarion
Updated Jan 19, 2024 · Clarion
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
Updated Oct 3, 2023 · Python
Document Layout Analysis resources repos for development with PdfPig.
Conversions between various OCR formats
OCR engine for all the languages
Updated Jan 6, 2023 · Python
Updated Jan 4, 2023 · JavaScript
A visual editor for .hocr files.
A gem that parses positional text from hOCR output and provides convenience methods to find text.
Updated Oct 20, 2022 · Ruby
A simple Tesseract 3.02+ hOCR to djvused format converter written in Qt
Probabilistic Key Value pair extraction using word weights from Invoices - Non Searchable PDF
Updated Jun 12, 2021 · Python