Welcome to this repository, which provides a complete, standalone, automated tool to compare the performance of different OCR services.
This tool was developed to compare the performance of Amazon's, Google's and Microsoft's text detection on a variety of images, from hand-drawn characters and words to 'live scene' photographs.
The accompanying blog post can be found here
This project was developed using:
python 3.7.4
python modules version
as described in requirements.txt
CharacTER.py
released on 27/06/2019
Software versions are subject to change with new releases; to ensure the project runs smoothly without alteration, the versions above should be used. This software was last run on 14/10/2019
This tool is fully automated: it reads images from disk, passes them one by one into each supported OCR service and generates meaningful metrics from the resulting transcriptions.
The tool is operated through a command-line interface (CLI).
The tool currently supports the following OCR services:
- Textract is used for detecting document text
- Rekognition is used for detecting live scene text
- Vision is used for detecting document and live scene text
- Computer Vision is used for detecting document and live scene text
Following the instructions below will enable you to use the tool for comparing your own images.
The following need to be set up before using this tool.
- Follow the steps in this guide to create an account and setup a user
- Follow steps 2-4 in this guide to generate your account's key
- Follow the steps in this guide to create a billing activated account
- Follow the steps in this guide to enable Google Vision for a Google Cloud Project
- Follow the steps in this guide to create an account and link a cognitive service resource to it
- Create a secret file, e.g.
vi /path/to/directory/.ms/credentials.txt
- Follow the step 'Get the keys from your resource' in this guide and store the key in the secret file (replace the placeholder key value with your account's key)
{
"key": "XXXXXXXXXXXXXXX00XXX"
}
Optional: It is recommended that you store your service/access keys in a secret '.' file.
mv /path/to/saved/credentials.txt /path/to/file/.secret_file.txt
You will need the paths to these keys in later steps
To install this tool to your local machine for comparison purposes, follow the instructions below.
- Clone this repo to your local machine
git clone <HTTPS URL>/ocr_comparison_tool.git
- Move into the ocr_comparison_tool directory
cd /path/to/cloned/directory/ocr_comparison_tool/
- Optional: Create a python3 virtual environment
python3 -m venv .
then activate it
. bin/activate
- Install the required python libraries
pip3 install -r requirements.txt
To configure the OCR services for this tool, follow the steps below.
In ./ocr_settings/amazon_settings.py
change the placeholder paths to your specific secret files:
environ['AWS_SHARED_CREDENTIALS_FILE']='/path/to/your/secret/credential/.file.txt'
environ['AWS_CONFIG_FILE']='/path/to/your/secret/config/.file.txt'
In ./ocr_settings/google_settings.py
change the placeholder path to your specific secret file:
environ['GOOGLE_APPLICATION_CREDENTIALS']='/path/to/your/secret/credential/.file.json'
In ./ocr_settings/microsoft_settings.py
change the placeholder path to your specific secret file:
MICROSOFT_ACCESS_CREDENTIALS='/path/to/your/secret/credential/.file.json'
In ./ocr_settings/gateway_settings.py
change the placeholder path to your specific CharacTER.py file:
environ['CHARACTER_SCRIPT_PATH']='/path/to/script/CharacTER.py'
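Before running the tool, it can be useful to confirm that the configured paths actually exist. The following is a minimal sketch of such a pre-flight check, not part of the repository; the variable names mirror those shown in the settings snippets above (the Microsoft key is configured as a module constant rather than an environment variable, so it is not checked here).

```python
from os import environ
from pathlib import Path

# Environment variables set by the ocr_settings modules shown above.
REQUIRED_VARS = [
    'AWS_SHARED_CREDENTIALS_FILE',
    'AWS_CONFIG_FILE',
    'GOOGLE_APPLICATION_CREDENTIALS',
    'CHARACTER_SCRIPT_PATH',
]

def missing_credential_files(env=environ):
    """Return the required variables that are unset or point to missing files."""
    return [v for v in REQUIRED_VARS
            if v not in env or not Path(env[v]).is_file()]
```

Running `missing_credential_files()` and acting on a non-empty result gives a clearer error than a failed API call mid-run.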
- Images must be in .jpg or .png format
- Images must be at least 50 x 50 px
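The minimum-size requirement can be checked before submitting an image. The sketch below is a hypothetical helper (not part of the repository) that reads the dimensions from a PNG header using only the standard library; a JPEG check would need more involved parsing or an imaging library such as Pillow.

```python
import struct

MIN_SIZE = 50  # minimum width/height in pixels required by the tool

def png_dimensions(path):
    """Read width and height from a PNG file's IHDR chunk (stdlib only)."""
    with open(path, 'rb') as f:
        header = f.read(24)
    if header[:8] != b'\x89PNG\r\n\x1a\n':
        raise ValueError('not a PNG file')
    # Width and height are big-endian uint32s at bytes 16-24.
    width, height = struct.unpack('>II', header[16:24])
    return width, height

def meets_minimum(path):
    """True if the PNG is at least MIN_SIZE x MIN_SIZE pixels."""
    w, h = png_dimensions(path)
    return w >= MIN_SIZE and h >= MIN_SIZE
```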
This tool supports a variety of ways to process images and their transcripts:
- run using a directory
- run using a single image (transcript auto-generated)
- run using a single image (transcript provided)
- define images' properties filename
- define type of OCR transcript
python3 /path/to/ocr_comparison_tool/cmd.py --dir /path/to/entry_dir
Note: For this option, entry_dir must adhere to the following structure:
entry_dir
├── props.csv # properties for images
├── ogl/ # original transcripts*
├── res/ # apis' transcripts*
├── met/ # CharacTER metric scores*
└── imgs/ # images to be transcribed
├── img1.jpg
├── img2.png
├── .
├── .
├── .
└── imgn.jpg
The directories marked * (ogl, res and met) are optional, as they are generated by the tool.
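The required part of the layout can be prepared with a few lines of Python. This is a hypothetical convenience helper, not part of the repository; it creates only props.csv and imgs/, since the starred directories are generated by the tool.

```python
from pathlib import Path

def make_entry_dir(root):
    """Create the minimal entry_dir structure expected by the --dir option."""
    root = Path(root)
    (root / 'imgs').mkdir(parents=True, exist_ok=True)  # images to be transcribed
    (root / 'props.csv').touch()                        # properties for images
    return root
```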
python3 /path/to/ocr_comparison_tool/cmd.py --img /path/to/image.jpg
Note: This command auto-generates the original transcript and therefore assumes that props.csv is located in the current working directory.
python3 /path/to/ocr_comparison_tool/cmd.py --ogl /path/to/transcript.txt --img /path/to/image.jpg
python3 /path/to/ocr_comparison_tool/cmd.py --prp properties.csv
Note: The properties file must be located in the current working directory.
python3 /path/to/ocr_comparison_tool/cmd.py --med [image/document/both]
Note: Changing the media type invokes only the models for that type.
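For reference, the documented flags could be parsed with argparse along the following lines. This is a hedged sketch based only on the options listed above; the actual implementation in cmd.py may differ.

```python
import argparse

def build_parser():
    """Build a parser mirroring the documented cmd.py options (illustrative only)."""
    parser = argparse.ArgumentParser(description='Compare OCR services on images.')
    parser.add_argument('--dir', help='entry directory containing props.csv and imgs/')
    parser.add_argument('--img', help='path to a single image')
    parser.add_argument('--ogl', help='original transcript to use with --img')
    parser.add_argument('--prp', default='props.csv',
                        help='properties filename in the current working directory')
    parser.add_argument('--med', choices=['image', 'document', 'both'], default='both',
                        help='media type; invokes only the models for that type')
    return parser
```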
- Applied Innovation - Kainos