Super OCR

This is a Flask project which extracts OCR (Optical Character Recognition) data and Exif (Exchangeable image file format) data from uploaded images.

Users upload the images and the data is returned as a JSON response.

For the OCR data, the project uses Google's Tesseract-OCR Engine to recognise and read the text embedded in images.
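
As a rough sketch, the text-extraction step with pytesseract typically looks like the following (illustrative only; the read_image_text helper is an assumption, not necessarily the project's actual code):

# Illustrative sketch: reading image text with pytesseract.
from PIL import Image        # Pillow is used to open the image file
import pytesseract           # Python wrapper around Google's Tesseract-OCR Engine

def read_image_text(image_path):
    """Return any text Tesseract finds in the image, or a fallback message."""
    text = pytesseract.image_to_string(Image.open(image_path)).strip()
    return text if text else "No text detected in image"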

The tech stack comprises the following:

  • Python 3 (>= 3.5)
  • Flask
  • Google's Tesseract-OCR Engine
  • pytesseract, a Python wrapper for Google's Tesseract-OCR Engine
  • Skeleton CSS (http://getskeleton.com/) for the basic styling
  • Docker
  • pytest as the unit testing framework
  • CircleCI for running the tests
  • Gunicorn as the Python WSGI app server

Workflow

The following is a breakdown of how the system works when a user uploads a file.

  • We perform some validations to ensure the upload is what we expect (see the sketch after this list). These validations include:
    • Ensure a file is present in the uploaded data.
    • Ensure the file is a valid image file.
    • Ensure the file is within the allowed size limit (5 MB).
  • If any of these validations fail, processing terminates and we return an appropriate error response to the user.
  • If validation succeeds, we save the image to the file system and pass its absolute path to the OCR engine and the Exif data extractor for processing.
    • The OCR engine recognises and reads the text embedded in the image (if any) and returns it as a dictionary.
    • The Exif data extractor extracts all Exif metadata (if any) and returns it as a dictionary.
  • The extracted OCR and Exif metadata are combined and returned as a JSON object to the user.
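
As a rough sketch, the validation step could look something like this (names such as validate_upload and MAX_UPLOAD_BYTES are illustrative assumptions, not the project's actual identifiers):

# Illustrative validation sketch; the project's real checks may differ.
from PIL import Image

MAX_UPLOAD_BYTES = 5 * 1024 * 1024   # the 5 MB limit described above

def validate_upload(file_storage):
    """Return (ok, error_message) for an uploaded werkzeug FileStorage."""
    # 1. A file must be present in the uploaded data.
    if file_storage is None or file_storage.filename == "":
        return False, "No file was uploaded"
    # 2. The file must be a valid image (Pillow can verify the header).
    try:
        Image.open(file_storage.stream).verify()
    except Exception:
        return False, "Uploaded file is not a valid image"
    # 3. The file must be within the 5 MB size limit.
    file_storage.stream.seek(0, 2)            # seek to the end to measure size
    size = file_storage.stream.tell()
    file_storage.stream.seek(0)               # rewind so the file can be saved
    if size > MAX_UPLOAD_BYTES:
        return False, "Image exceeds the 5 MB limit"
    return True, None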

Here is the workflow as a diagram:

[Workflow diagram]

Sample response

POST / with image file

{
    "image_text": "No text detected in image",
    "metadata": {
        "ApertureValue": [
            25260,
            10000
        ],
        "BrightnessValue": [
            0,
            0
        ],
        "ColorSpace": 1,
        "ComponentsConfiguration":"",
        "DateTime": "2014:11:07 13:49:58",
        "DateTimeDigitized": "2014:11:07 13:49:58",
        "DateTimeOriginal": "2014:11:07 13:49:58",
        "DigitalZoomRatio": [
            100,
            100
        ],
        "ExifImageHeight": 1664,
        "ExifImageWidth": 2496,
        "ExifOffset": 243,
        "ExifVersion": "0220",
        "ExposureBiasValue": [
            0,
            0
        ],
        "ExposureMode": 0,
        "ExposureProgram": 0,
        "ExposureTime": [
            292,
            10000
        ],
        "FNumber": [
            24,
            10
        ],
        "Flash": 0,
        "FlashPixVersion": "0100",
        "FocalLength": [
            439,
            100
        ],
        "GPSInfo": {
            "0":"��",
            "7": [
                [
                    10,
                    1
                ],
                [
                    49,
                    1
                ],
                [
                    57,
                    1
                ]
            ],
            "29": "2014:11:07"
        },
        "GainControl": 0,
        "ISOSpeedRatings": 100,
        "ImageDescription": "Jpeg",
        "ImageLength": 1664,
        "ImageWidth": 2496,
        "LightSource": 0,
        "Make": "Intel",
        "MaxApertureValue": [
            24,
            10
        ],
        "MeteringMode": 0,
        "Model": "BT210",
        "Orientation": 6,
        "ResolutionUnit": 2,
        "SceneCaptureType": 0,
        "Sharpness": 0,
        "ShutterSpeedValue": [
            -2147483648,
            10000
        ],
        "Software": "Android",
        "SubjectDistance": [
            0,
            1
        ],
        "UserComment":"ASCII  ",
        "WhiteBalance": 0,
        "XResolution": [
            72,
            1
        ],
        "YCbCrPositioning": 1,
        "YResolution": [
            72,
            1
        ]
    }
}

image_text is the OCR-detected text, while metadata is the Exif metadata extracted from the uploaded image.

If OCR detects no text, image_text will be No text detected in image, and if there is no Exif metadata, metadata will be an empty dictionary.
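
The human-readable tag names in metadata are the kind of output Pillow's Exif helpers produce. A minimal sketch (the extract_exif helper is hypothetical, not necessarily the project's code):

# Minimal Exif extraction sketch using Pillow; illustrative only.
from PIL import Image
from PIL.ExifTags import TAGS

def extract_exif(image_path):
    """Return a dict mapping Exif tag names to values, or an empty dict."""
    exif = Image.open(image_path)._getexif()   # Pillow's legacy JPEG Exif helper
    if not exif:
        return {}
    # Translate numeric tag IDs into readable names like "DateTime" or "Make".
    return {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}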

Using the Docker Image

The project is available as a ready-to-use Docker image on Docker Hub.

Pull it locally:

docker pull donmanyo/superocr

You can run it locally on port 7777 (or any other port of your choice) like so:

docker run -p 7777:5000 donmanyo/superocr

Visit the homepage at http://localhost:7777/ in a browser; it displays a simple upload form you can use to test the functionality.

Or, better still, you can hit the same URL in Postman and upload the file. Make sure the image file is sent with the key file.
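
If you prefer a script to Postman, a minimal client sketch with the requests library could look like this (sample.jpg is a placeholder path; the container is assumed to be running on port 7777 as above):

# Minimal upload client sketch; adjust the path and port as needed.
import requests

with open("sample.jpg", "rb") as f:
    response = requests.post("http://localhost:7777/", files={"file": f})  # the upload key must be "file"

print(response.json())   # OCR text plus Exif metadata, as in the sample response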

Local setup

You can also clone the repo from GitHub and run the project locally without Docker.

Clone the repo:

git clone https://github.com/msomierick/superocr.git

Create an uploads folder in the instance dir to store the uploaded images.

Create a virtual environment (use Python >= 3.5) and install the dependencies:

python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt

Start the Flask project thus:

python3 manage.py

Visit http://localhost:5000 with either Postman or a browser to see it working.

Running unit tests

The project uses pytest as the testing framework and Coverage for generating the test coverage reports.

Run the tests from the root of the cloned repo, with the virtual environment activated:

pytest

You can view the HTML coverage report by opening htmlcov/index.html in a browser of your choice.

The project is also integrated with CircleCI for running the tests. The CircleCI tests can be viewed here.
