
Web framework to convert 7/12 Extract pdf using OCR to editable excel file. TEAM NAME: X Æ A-3


JayJhaveri1906/Saath-Baara-Utara-OCR-The-7-12-OCR


Saath Baara Utara OCR, The 7/12 OCR

By X Æ A-3

Windows sed

Hosted Link:

http://caaca1b72e90.ngrok.io/OCR712/ [ please use the choose file button, hosting was causing issues with drag and drop :) ] (link is down)

This project is aimed at helping the Global Parli Foundation in their mission to improve rural India through a replicable model of Rural Rejuvenation.

In order to achieve this we aim to

  • build a hosted OCR web service
  • which converts the 7/12 extract (“Saath Baara Utara”)
  • to an editable excel file.

Winning project of the Code For Change Hackathon

Detailed Documentation in ppt

PPT

Basic Prototype

Final Design

Our Features!

  • Upload one or more Saath Baara Utara PDFs at once using drag and drop or browse.
  • Google OCR converts each of these files to a text document in Devanagari script.
  • Using pandas and Python, we extract useful information from the converted text. In short, we extract variables from the text documents and turn them into columns of an Excel file, which makes multiple 7/12 extracts easy to read and compare.
  • The result is the final Excel file.
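The extraction step in the feature list can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `label : value` line layout, the `FIELD_LABELS` mapping, and the function names `extract_fields` and `build_rows` are all assumptions made for the example, and real OCR output from a 7/12 extract will likely need more tolerant parsing.

```python
# Hypothetical mapping from Devanagari labels in the OCR text to column names.
# The real labels and layout depend on what the OCR step actually returns.
FIELD_LABELS = {
    "गाव": "village",
    "तालुका": "taluka",
    "जिल्हा": "district",
}


def extract_fields(text):
    """Collect known 'label : value' pairs from one OCR text document."""
    row = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        column = FIELD_LABELS.get(label.strip())
        if sep and column:
            row[column] = value.strip()
    return row


def build_rows(documents):
    """Turn {filename: ocr_text} into one row per file with the same columns."""
    columns = list(FIELD_LABELS.values())
    rows = []
    for name, text in documents.items():
        fields = extract_fields(text)
        # Missing fields become empty cells so every row has every column.
        rows.append({"file": name, **{c: fields.get(c, "") for c in columns}})
    return rows
```

With pandas installed, rows of this shape can be written out with `pandas.DataFrame(rows).to_excel("output.xlsx", index=False)`, which puts one 7/12 extract per row so they can be compared side by side. The output file name here is only a placeholder.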

Advantages!

  • In our opinion, the biggest advantage we provide is multiple-file support with a single Excel output, as this will help the NGO compare thousands of 7/12 extracts at a glance using Excel's built-in functionality.
  • Our next advantage is a fast and reliable OCR service, with an error rate of only 1.2%.
  • No one in the market currently provides all these functionalities bundled together.

Basic Steps to Run the Code:

Clone the repo
pip install -r requirements.txt

Refer to requirements_NOTE.txt

python manage.py makemigrations
python manage.py migrate
python manage.py runserver

Then open http://127.0.0.1:8000/OCR712/
